How to print a Python 3 regex match that spans two lines
Question
I have a small Python 3 script that reads a file where all my bookmarks are stored. My regex works in Notepad++.
My regex is:
(==========)([\r\n]+.*)
My text file:
==========
Book1 (Author 1)
- bookmark
text
==========
Book2 (Author 2)
- bookmark1
text
==========
Book1 (Author 1)
- bookmark2
text
==========
Book2 (Author 2)
- bookmark2
text
==========
My Python script is as follows:
import re
pattern = re.compile("(==========)([\r\n])(.*)")
count=0
for line in open(r'bookmarks.txt', encoding="utf-8"):
    for match in re.finditer(pattern, line):
        count=count+1
        print(line)
print("The amount of notes are: ",count)
The problem is that the printed lines show only the "==========" part and leave out the line that follows it. What I want is:
==========
Book1 (Author 1)
I have tried different approaches, but none of them shows what I'm looking for. Any hints?
Thanks
Answer 1
Score: 1
You are searching line by line, so each search sees only one line at a time and the pattern can never reach the line that follows ==========. You could try something like the following instead:
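import re
# Read the whole file at once so a match can span the line break.
with open(r'bookmarks.txt', encoding="utf-8") as file:
    bookmarks = file.read()
# \r?\n accepts both Windows and Unix line endings; in text mode Python
# already converts \r\n to \n, so a literal \r\n would never match.
pattern = re.compile(r"(==========)(\r?\n)([^\n]+)")
count = 0
for match in pattern.finditer(bookmarks):
    count += 1
    print(match[0])
print("The amount of notes are: ", count)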
Read the whole bookmarks.txt file into a string and then start searching. It's not exactly clear which parts of the bookmarks you want to retrieve, so I've limited it to the first line.
Result here:
==========
Book1 (Author 1)
==========
Book2 (Author 2)
==========
Book1 (Author 1)
==========
Book2 (Author 2)
The amount of notes are: 4
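If you want every line of a note rather than just the first one, one option is to split the text on the separator lines instead of matching what comes after them. The following is a minimal sketch and not part of the original answer: the SAMPLE string stands in for bookmarks.txt (its content is copied from the question), and split_notes is a name made up for illustration.

```python
import re

# Stand-in for the content of bookmarks.txt (copied from the question).
SAMPLE = (
    "==========\n"
    "Book1 (Author 1)\n"
    "- bookmark\n"
    "text\n"
    "==========\n"
    "Book2 (Author 2)\n"
    "- bookmark1\n"
    "text\n"
    "==========\n"
    "Book1 (Author 1)\n"
    "- bookmark2\n"
    "text\n"
    "==========\n"
    "Book2 (Author 2)\n"
    "- bookmark2\n"
    "text\n"
    "==========\n"
)

# re.MULTILINE lets ^ match at the start of every line, not just the text.
SEPARATOR = re.compile(r"^==========\r?\n?", re.MULTILINE)


def split_notes(text):
    """Return each note as a list of its lines (hypothetical helper)."""
    chunks = SEPARATOR.split(text)
    # Drop the empty chunks before the first and after the last separator.
    return [chunk.splitlines() for chunk in chunks if chunk.strip()]


notes = split_notes(SAMPLE)
for title, *rest in notes:
    print(title, "->", rest)
print("The amount of notes are:", len(notes))
```

With a real file you would pass the result of file.read() to split_notes instead of SAMPLE.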
Answer 2
Score: 1
Your finditer regex is applied to single lines of the input file only. Therefore, it can never match the book lines after "==========". Why does it find anything at all? Because (.*) also allows an empty book line, so the separator line on its own already satisfies the pattern.
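To see this in isolation, here is a small sketch that applies the question's pattern to two single lines, the way the original loop does. The variable names loose and strict are made up for this illustration; the lines are hard-coded so no file is needed.

```python
import re

# Pattern from the question: group 3 may be empty because of (.*).
loose = re.compile(r"(==========)([\r\n])(.*)")
# Same pattern, but group 3 must contain at least one character.
strict = re.compile(r"(==========)([\r\n])(.+)")

# Two lines as a "for line in open(...)" loop would deliver them.
lines = ["==========\n", "Book1 (Author 1)\n"]

for line in lines:
    match = loose.search(line)
    # On the separator line, group 3 is the empty string.
    print(repr(line), "->", match.groups() if match else None)

# With (.+) nothing matches on a single line, which exposes the real problem.
print(strict.search("==========\n"))
```

The separator line matches with an empty third group, the book line does not match at all, and the (.+) variant finds nothing when it only ever sees one line.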
It's not clear to me what output you expect, but the following piece of code at least prints the separator line together with the book line:
import re
pattern = re.compile(r"(==========)([\r\n]+)(.+)")
count=0
with open('bookmarks.txt', 'r', encoding='utf-8') as file:
    bookmarks = file.read()
for match in re.finditer(pattern, bookmarks):
    count=count+1
    print(match.group(0))
print("The amount of notes are: ",count)
Note that I replaced (.*) by (.+).
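A side note on the newline part of the pattern: when a file is opened in text mode with the default newline setting, Python converts \r\n to \n while reading, so a pattern that insists on a literal \r\n may find nothing even for a file saved with Windows line endings. A tolerant form such as [\r\n]+ or \r?\n avoids this. The sketch below is only an illustration under that assumption; it writes a throwaway temporary file rather than touching bookmarks.txt, and the names strict and tolerant are made up.

```python
import os
import re
import tempfile

# Write a small file with Windows line endings; binary mode keeps the bytes as-is.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"==========\r\nBook1 (Author 1)\r\n")
    path = tmp.name

try:
    # Default text mode (newline=None) translates \r\n to \n on read.
    with open(path, encoding="utf-8") as file:
        text = file.read()
finally:
    os.remove(path)

strict = re.compile(r"==========\r\n.+")       # requires a literal CR LF
tolerant = re.compile(r"==========[\r\n]+.+")  # accepts CR, LF, or both

print(repr(text))
print(strict.search(text))
print(tolerant.search(text).group(0))
```

The strict pattern returns None here because the text no longer contains a carriage return, while the tolerant one still finds the separator together with the book line.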