Question

This is the string I'm trying to match on (real one is much longer).

VC1000 Venture Capital 4 cr.
This is a class about venture capital
and more description, that could mention a future course like
VC2000 but might not
VC2000 venture capital II 4 cr.
Another description about blah
VC 3000 venture capital III 4-6 cr.
back again

I'm trying to get groups that would look like

[VC1000]
[Venture Capital]
[4]
[This is a class about venture capital and more description, that could mention a future course like VC2000 but might not]

I almost got it but I'm not sure how to get the description between class listings. Right now I have:

(^\*?[A-Z]{2}\s?[0-9]{4}) (.*?)([0-9]|[0-9]-[0-9]+)\s?cr\.

But I'm not sure how to proceed. Adding .* matches too much, and following .* with the first group from above prevents the first group from being captured on every other match.

What's the trick I'm missing?

Answer 1

Score: 2

Try (regex101):

import re

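# Groups: (1) course code, (2) title, (3) credits incl. "cr.", (4) description.
# The lookahead ends the lazy description just before the next header line
# (or at end of input) without consuming it, so the next match can start there.
pat = r'^([A-Z]{2}\s*\d{4})\s+([^\n]+?)(\d+-?\d*\s+cr\.)$(.*?)(?=^[A-Z]{2}\s*\d{4}\s+[^\n]+?\d+-?\d*\s+cr\.$|\Z)'
# re.M: ^ and $ match at line boundaries; re.S: . also matches newlines.
pat = re.compile(pat, flags=re.S|re.M)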

text = '''\
VC1000 Venture Capital 4 cr.
This is a class about venture capital
and more description, that could mention a future course like
VC2000 but might not
VC2000 venture capital II 4 cr.
Another description about blah
VC 3000 venture capital III 4-6 cr.
back again'''

for a, b, c, d in pat.findall(text):
    print(a)
    print(b)
    print(c)
    print(d)
    print('-' * 80)

Prints:

VC1000
Venture Capital 
4 cr.

This is a class about venture capital
and more description, that could mention a future course like
VC2000 but might not

--------------------------------------------------------------------------------
VC2000
venture capital II 
4 cr.

Another description about blah

--------------------------------------------------------------------------------
VC 3000
venture capital III 
4-6 cr.

back again
--------------------------------------------------------------------------------
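
The output above keeps "cr." in the credits group, a trailing space on the title, and newlines in the description. To get exactly the four groups shown in the question, the same lookahead idea could be adapted roughly as below. This is only a sketch, not the answerer's code: the names HEADER, COURSE, parse_courses and SAMPLE are hypothetical, the optional leading \*? is carried over from the asker's pattern, and it assumes the header fields sit on one line separated by spaces or tabs.

```python
import re

# One header line: code, title, credits ("4" or "4-6"), then "cr.".
# [ \t] is used instead of \s so a header can never span two lines.
HEADER = r'\*?[A-Z]{2}[ \t]?\d{4}[ \t]+[^\n]+?[ \t]+\d+(?:-\d+)?[ \t]?cr\.'

COURSE = re.compile(
    r'^\*?([A-Z]{2}[ \t]?\d{4})[ \t]+'    # group 1: course code
    r'([^\n]+?)[ \t]+'                     # group 2: title
    r'(\d+(?:-\d+)?)[ \t]?cr\.$'           # group 3: credits, number only
    r'(.*?)'                               # group 4: description (lazy)
    r'(?=^' + HEADER + r'$|\Z)',           # stop before next header or at end
    flags=re.M | re.S,
)


def parse_courses(text):
    """Return (code, title, credits, description) tuples, one per course."""
    courses = []
    for code, title, credits, desc in COURSE.findall(text):
        # Collapse newlines and repeated spaces in the description.
        courses.append((code, title, credits, ' '.join(desc.split())))
    return courses


SAMPLE = '''\
VC1000 Venture Capital 4 cr.
This is a class about venture capital
and more description, that could mention a future course like
VC2000 but might not
VC2000 venture capital II 4 cr.
Another description about blah
VC 3000 venture capital III 4-6 cr.
back again'''

courses = parse_courses(SAMPLE)
for course in courses:
    print(course)
```

Moving the surrounding whitespace outside the capture groups is what trims the title, and ' '.join(desc.split()) flattens the multi-line description into a single line.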
