Python regex: how to match everything except headings?
Question
I have this piece of text:
> 2. Shifting Your Mindset: Transforming your mindset from a negative or limited mindset to a positive one requires conscious effort and
> practice. Here are some key strategies to help you make that shift
>
> a. Self-Awareness: Start by becoming aware of your thoughts and inner
> dialogue. Notice any negative self-talk or limiting beliefs that may
> be holding you back financially.
>
> b. Reframing: Challenge and reframe negative thoughts or situations
> into positive ones. Instead of dwelling on financial setbacks, focus
> on the lessons learned and the potential opportunities they may
> present.
I want to match everything except the headings, which start with a number or a letter followed by a dot (.), so the output should be:
> Transforming your mindset from a negative or limited mindset to a positive one requires conscious effort and
> practice. Here are some key strategies to help you make that shift:
>
> Start by becoming aware of your thoughts and inner
> dialogue. Notice any negative self-talk or limiting beliefs that may
> be holding you back financially.
>
> Challenge and reframe negative thoughts or situations
> into positive ones. Instead of dwelling on financial setbacks, focus
> on the lessons learned and the potential opportunities they may
> present.
So I tried this pattern: (?!\s*[a-z]\.\s.*:)(?!\d+\.\s*.*:).*
but it does not match the text I want.
Answer 1
Score: 1
If you use re.sub, you can focus on the parts that you don't want in the output.
I'll use these characteristics of what constitutes a "header":
- It starts at the start of a line (possibly after some white space)
- Its first character is alphanumeric
- It ends with a colon, no further than about 30 characters from the first (non-space) character, potentially followed by some white space.
import re
s = """Shifting Your Mindset: Transforming your mindset from a negative or limited mindset to a positive one requires conscious effort and practice. Here are some key strategies to help you make that shift:
a. Self-Awareness: Start by becoming aware of your thoughts and inner dialogue. Notice any negative self-talk or limiting beliefs that may be holding you back financially.
b. Reframing: Challenge and reframe negative thoughts or situations into positive ones. Instead of dwelling on financial setbacks, focus on the lessons learned and the potential opportunities they may present."""
s = re.sub(r"^ *\w[^\r\n:]{0,30}:\s*", "", s, flags=re.M)
print(s)
This outputs:
Transforming your mindset from a negative or limited mindset to a positive one requires conscious effort and practice. Here are some key strategies to help you make that shift:
Start by becoming aware of your thoughts and inner dialogue. Notice any negative self-talk or limiting beliefs that may be holding you back financially.
Challenge and reframe negative thoughts or situations into positive ones. Instead of dwelling on financial setbacks, focus on the lessons learned and the potential opportunities they may present.
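For readers who want to see how each of the three characteristics maps onto the pattern, here is a minimal sketch that spells out the same regex with re.VERBOSE. The name HEADER and the shortened sample text (with a numbered heading and wrapped lines, as in the question) are assumptions made for illustration, not part of the original answer.

```python
import re

# Same pattern as in the answer, written with re.VERBOSE so each part can be commented.
# HEADER is a hypothetical name chosen for this sketch.
HEADER = re.compile(
    r"""
    ^[ ]*            # start of a line, optionally indented with spaces
    \w               # first character is alphanumeric (or an underscore)
    [^\r\n:]{0,30}   # up to 30 more characters, none of them a colon or line break
    :                # the colon that closes the header
    \s*              # any white space after the colon
    """,
    flags=re.M | re.X,
)

# Shortened, hypothetical sample with a numbered heading and wrapped lines.
sample = (
    "2. Shifting Your Mindset: Transforming your mindset requires\n"
    "practice. Here are some key strategies:\n"
    "\n"
    "a. Self-Awareness: Start by becoming aware of your thoughts and inner\n"
    "dialogue.\n"
)

result = HEADER.sub("", sample)
print(result)
```

In this sample the second line also ends with a colon, but that colon is more than 30 characters from the start of the line, so the line is left alone. The limit is a heuristic: a wrapped body line with a colon near its start would be treated as a header, so the 30 may need tuning for other texts.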
Answer 2
Score: 0
Thank you all for your feedback. However, I read the re documentation and finally found an answer that works for me, which is this pattern:
(?<=:)\s*.*
This regex uses a lookbehind assertion: a match can only begin right after a colon, so the heading before the colon is excluded. It then matches 0 or more white spaces and the rest of that line, which is the content that follows each heading.
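A quick way to check what this pattern captures is to run it with re.findall on a small sample. The sketch below uses a shortened, hypothetical version of the question's text (the variable names are assumptions for illustration). It suggests that, without extra flags, the pattern returns only the remainder of the line that contains the colon, so wrapped continuation lines are not part of any match.

```python
import re

# Shortened, hypothetical sample of the question's text, with wrapped lines.
text = (
    "2. Shifting Your Mindset: Transforming your mindset requires\n"
    "practice.\n"
    "\n"
    "a. Self-Awareness: Start by becoming aware of your thoughts and inner\n"
    "dialogue.\n"
)

# The lookbehind requires a colon immediately before the match;
# "." does not match a newline by default, so ".*" stops at the end of the line.
matches = re.findall(r"(?<=:)\s*.*", text)

for m in matches:
    print(repr(m))
```

Here the two matches are the text after each heading's colon, including the leading space, while "practice." and "dialogue." on the following lines are left out. Note also that a body line that itself ends with a colon would start a new match, since \s* can consume the line break that follows it.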