How do I delete subsets of lines from a list based on keywords?

2022-03-11 00:00:00 python text-files

Problem description

I have the following file:

This
is
a
testfile 
wj5j keyword 1
WFEWF
O%LWJZ keyword 2
which
should
lpokpij keyword 3
123123das
kpmnvf keyword 4
just
contain
the 
following
lines.

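From it I need to delete the subsets of lines between keyword 1 and keyword 2 and between keyword 3 and keyword 4 (the keyword lines included), so that the result looks like this: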

This
is
a
testfile 
which
should
just
contain
the 
following
lines.

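I tried the following, which only prints the lines that contain a keyword, not the lines in between. My idea was that once I could print all of those lines, I could delete them from the file.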

with open ("newfile_TEST1.txt", mode = "r") as file:
    keywords = ['keyword 1', 'keyword 2','keyword 3','keyword 4']
    lines = file.readlines()
    for lineno, line in enumerate(file,1):
        matches = [k for k in keywords if k  in line]
        if matches:
            print(line)

What can I do to improve my code?


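Solution

I would use a flag that is True from the first match until the next match, and False otherwise. Note that the loop has to iterate over lines rather than file, because file.readlines() has already consumed the file: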

with open("./txt.txt", mode="r") as file:
    keywords = ['keyword 1', 'keyword 2', 'keyword 3', 'keyword 4']
    lines = file.readlines()
    # True while we are between an opening keyword and its closing keyword
    glitch_flair = False
    for line in lines:
        matches = [k for k in keywords if k in line]
        if matches:
            # A keyword line toggles the flag and is itself never printed
            glitch_flair = not glitch_flair
        elif not glitch_flair:
            # Ordinary line outside any keyword block: keep it
            print(line, end='')
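
The toggle above treats every keyword the same way, so it relies on the keywords appearing in strict open/close order. As a possible variation (not part of the original answer), here is a minimal sketch that pairs each start keyword with its own end keyword and returns the kept lines instead of printing them. The names drop_keyword_blocks, pairs and sample are hypothetical and chosen only for illustration.

```python
def drop_keyword_blocks(lines, pairs):
    """Return `lines` without the blocks delimited by each (start, end) pair.

    `pairs` is a list of (start_keyword, end_keyword) tuples (hypothetical
    helper input). The delimiter lines themselves are dropped as well.
    """
    kept = []
    end_keyword = None  # not None while we are inside a block
    for line in lines:
        if end_keyword is not None:
            # Inside a block: skip everything up to and including the end line
            if end_keyword in line:
                end_keyword = None
            continue
        # Outside a block: check whether this line opens one
        for start, end in pairs:
            if start in line:
                end_keyword = end
                break
        else:
            # No start keyword matched, so this is an ordinary line
            kept.append(line)
    return kept


# The file content from the question, as readlines() would return it
sample = [
    "This\n", "is\n", "a\n", "testfile \n",
    "wj5j keyword 1\n", "WFEWF\n", "O%LWJZ keyword 2\n",
    "which\n", "should\n",
    "lpokpij keyword 3\n", "123123das\n", "kpmnvf keyword 4\n",
    "just\n", "contain\n", "the \n", "following\n", "lines.\n",
]
pairs = [("keyword 1", "keyword 2"), ("keyword 3", "keyword 4")]
result = drop_keyword_blocks(sample, pairs)
print("".join(result), end="")
```

To actually remove the lines from the file, one option is to read it with readlines(), pass the list through the function, and write the result back with writelines() on a file opened in "w" mode. Writing to a separate output file first is the safer choice while testing.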
