February 6, 2023, 16:34:22

How to print a Python 3 regex result that spans 2 lines

Question

I have a small Python 3 script that reads a file where all the bookmarks are stored. My regex works in Notepad++.

My regex is:

(==========)([\r\n]+.*)

My text file:

==========
Book1 (Author 1)
- bookmark
text
==========
Book2 (Author 2)
- bookmark1
text
==========
Book1 (Author 1)
- bookmark2
text
==========
Book2 (Author 2)
- bookmark2
text
==========

My Python script is as follows:

import re
pattern = re.compile("(==========)([\r\n])(.*)")
count=0
for line in open(r'bookmarks.txt', encoding="utf-8"):
    for match in re.finditer(pattern, line):
        count=count+1
        print(line)
print("The amount of notes are: ",count)

The problem is that the printed lines show only the "==========" part, not the full two-line match I expect:

==========
Book1 (Author 1)

I have tried different approaches, but none of them shows what I'm looking for. Any hints?

Thanks

Answer 1

Score: 1

You are searching line by line, so for lines without the ========== there's no way to match your pattern. You could try something like the following instead:

import re

with open(r'bookmarks.txt', encoding="utf-8") as file:
    bookmarks = file.read()
# Text mode translates "\r\n" to "\n" on read, so accept either line ending.
pattern = re.compile(r"(==========)(\r?\n)([^\n]+)")
count = 0
for match in pattern.finditer(bookmarks):
    count += 1
    print(match[0])
print("The amount of notes are: ", count)

Read the whole bookmarks.txt file into a string and then start searching. It's not exactly clear what parts of the bookmarks you want to retrieve, so I've limited it to the first line.

Result here:

==========
Book1 (Author 1)
==========
Book2 (Author 2)
==========
Book1 (Author 1)
==========
Book2 (Author 2)
The amount of notes are:  4
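
To see the difference between the two approaches without touching a real file, here is a minimal sketch. The SAMPLE string is a hypothetical, shortened stand-in for bookmarks.txt, and io.StringIO is used only to imitate iterating over a file line by line; the pattern accepts "\n" as well as "\r\n".

```python
import io
import re

# Hypothetical, shortened stand-in for bookmarks.txt (same layout as in the question).
SAMPLE = (
    "==========\n"
    "Book1 (Author 1)\n"
    "- bookmark\n"
    "text\n"
    "==========\n"
    "Book2 (Author 2)\n"
    "- bookmark1\n"
    "text\n"
    "==========\n"
)

pattern = re.compile(r"(==========)(\r?\n)([^\n]+)")

# Line by line: every string ends at its own newline, so the title line
# is never in the same string as the separator and nothing can match.
line_matches = [
    m.group(0)
    for line in io.StringIO(SAMPLE)
    for m in pattern.finditer(line)
]

# Whole text: separator and title are in one string, so a match can span both lines.
text_matches = [m.group(0) for m in pattern.finditer(SAMPLE)]

print(len(line_matches))  # 0
print(len(text_matches))  # 2
```

The trailing separator at the end of the sample is not counted, because no title line follows it.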

Answer 2

Score: 1

Your finditer regex is applied to single lines of the input file only. Therefore, it cannot match the book lines after "==========". Why does it find anything at all? Because (.*) also matches an empty string, so the separator line and its trailing newline are enough for a match.

It's not clear to me what output you expect, but the following piece of code at least prints the separator line together with the book line:

import re

pattern = re.compile(r"(==========)([\r\n]+)(.+)")
count = 0
# Read the whole file into one string so a match can span two lines.
with open('bookmarks.txt', 'r', encoding='utf-8') as file:
    bookmarks = file.read()
for match in pattern.finditer(bookmarks):
    count += 1
    print(match.group(0))
print("The amount of notes are: ", count)

Note that I replaced (.*) with (.+).
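
The effect of that replacement can be checked on a single separator line, which is what a line-by-line loop passes to the regex. This is only an illustrative sketch; the variable names are made up for the example.

```python
import re

# What a line-by-line loop hands to the regex for a separator line.
separator_line = "==========\n"
# Separator and book line together, as found when the whole file is read at once.
two_lines = "==========\nBook1 (Author 1)\n"

star = re.compile(r"(==========)([\r\n]+)(.*)")
plus = re.compile(r"(==========)([\r\n]+)(.+)")

star_match = star.search(separator_line)
plus_match = plus.search(separator_line)
plus_two = plus.search(two_lines)

print(repr(star_match.group(3)))  # '' : (.*) is satisfied by an empty book line
print(plus_match)                 # None : (.+) needs at least one character
print(plus_two.group(3))          # Book1 (Author 1)
```

With (.*), the separator line alone counts as a match, which is why the original script printed only "==========" lines.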

