How do I print the lines between two search patterns?
Question
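Your script has a few problems that cause it to print only the line that starts with pat1. Here is a corrected version:

import re

pat1 = r"FEATURE AAA"
pat2 = r"[^\\]\n$"  # last character before the newline is not a backslash

with open('all.txt') as f:
    inside_feature = False
    for line in f:
        # The block starts at the line that begins with pat1.
        if re.match(pat1, line):
            inside_feature = True
        if inside_feature:
            print(line, end='')
            # The first line without a trailing backslash closes the block.
            if re.search(pat2, line):
                inside_feature = False

This corrected script should print the lines from pat1 through pat2, where the pat1 line starts with "FEATURE AAA" and the pat2 line is the first one that does not end with "\".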
Original question (English):
My text is as follows:
FEATURE AAA cdslmd 22.02 28-jul-2023 1 \
3F76BA3DA7102935AD12 VENDOR_STRING=J:PERM DUP_GROUP=NONE \
vendor_info=13-jun-2023 \
ISSUER=CDNS2161b9360e2037d6f7bd35a11244661a ISSUED=13-jun-2023 \
EC0C" V7.1_LK=3F06FAFDD30D943EC2DD
FEATURE BBB cdslmd 22.02 28-jul-2023 2 7F061A2D977BFEF08F2F \
VENDOR_STRING=J:PERM DUP_GROUP=NONE vendor_info=13-jun-2023 \
ISSUER=CDNS2161b9360e2037d6f7bd35a11244661a ISSUED=13-jun-2023 \
A843" V7.1_LK=BFC68A9D5606A8296604
Expected result:
FEATURE AAA cdslmd 22.02 28-jul-2023 1 \
3F76BA3DA7102935AD12 VENDOR_STRING=J:PERM DUP_GROUP=NONE \
vendor_info=13-jun-2023 \
ISSUER=CDNS2161b9360e2037d6f7bd35a11244661a ISSUED=13-jun-2023 \
EC0C" V7.1_LK=3F06FAFDD30D943EC2DD
My script:
import os
import re

pat1 = r"[A-Z]+.AAA"
pat2 = r"[^\\]$"

with open('all.txt') as f:
    match = False
    for l in f:
        if re.match(pat1, l):
            match = True
            print (l)
            continue
        if re.search(pat2, l):
            match = False
            continue
        if match:
            print (l)
f.close()
Outcome of above script:
FEATURE AAA cdslmd 22.02 28-jul-2023 1 1F469A0DE7A48BDAE16B \
It prints only the one line that matches pat1. I want to print every line from pat1 through pat2, that is, through the first line after pat1 that does not end with "\".
Answer 1
Score: 1
r"[^\\]$" matches every line, because [^\\] can match the newline character (\n) at the end of the line. You don't need a regex for that check, though; you can simply write:
if not l.strip().endswith('\\'):
    match = False
    continue
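To see why the regex check misfires, here is a small sketch comparing the two tests on a pair of lines taken from the question's sample text. The variable names (continued, final, regex_hits, plain_hits) are made up for this illustration.

```python
import re

pat2 = r"[^\\]$"

# Two sample lines as a file iterator would yield them: each keeps its "\n".
continued = "vendor_info=13-jun-2023 \\\n"
final = 'EC0C" V7.1_LK=3F06FAFDD30D943EC2DD\n'

# The regex matches both lines, because [^\\] consumes the "\n" itself.
regex_hits = [bool(re.search(pat2, s)) for s in (continued, final)]

# The plain string check tells them apart once whitespace is stripped.
plain_hits = [not s.strip().endswith("\\") for s in (continued, final)]

print(regex_hits)  # [True, True]
print(plain_hits)  # [False, True]
```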
Additionally, to get your desired output, i.e. to also print the line that does not end with \, you will need to switch the order of your if statements:
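if re.match(pat1, l):
    match = True
    print(l, end='')  # l already ends with a newline
    continue
if match:
    print(l, end='')
# Checked after printing, so the closing line is printed too.
if not l.strip().endswith('\\'):
    match = False
    continue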
Alternatively, pat2 could also be r"[^\\]\n$", but I feel the first approach is easier to read.
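Putting the pieces together, the following is a minimal, self-contained sketch of the approach, not a definitive implementation. The function name extract_block and the inline SAMPLE text (copied from the question) are assumptions made so the example runs without all.txt. Unlike the snippet above, it also checks the pat1 line itself for a trailing backslash, so a one-line FEATURE entry would end the block immediately.

```python
import re

# Sample text from the question; a raw string keeps each backslash literally.
SAMPLE = r"""FEATURE AAA cdslmd 22.02 28-jul-2023 1 \
3F76BA3DA7102935AD12 VENDOR_STRING=J:PERM DUP_GROUP=NONE \
vendor_info=13-jun-2023 \
ISSUER=CDNS2161b9360e2037d6f7bd35a11244661a ISSUED=13-jun-2023 \
EC0C" V7.1_LK=3F06FAFDD30D943EC2DD
FEATURE BBB cdslmd 22.02 28-jul-2023 2 7F061A2D977BFEF08F2F \
VENDOR_STRING=J:PERM DUP_GROUP=NONE vendor_info=13-jun-2023 \
ISSUER=CDNS2161b9360e2037d6f7bd35a11244661a ISSUED=13-jun-2023 \
A843" V7.1_LK=BFC68A9D5606A8296604
"""


def extract_block(lines, start_pattern=r"[A-Z]+.AAA"):
    """Collect lines from the first start_pattern match through the
    first following line that does not end with a backslash."""
    selected = []
    inside = False
    for line in lines:
        if not inside and re.match(start_pattern, line):
            inside = True
        if inside:
            selected.append(line)
            # A line without a trailing backslash closes the block.
            if not line.strip().endswith("\\"):
                break
    return selected


result = extract_block(SAMPLE.splitlines(keepends=True))
print("".join(result), end="")
```

To run it against the real file, pass the open file object instead, for example extract_block(f) inside a with open('all.txt') as f: block.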