Regex replace: return None/empty string if no match
Question
https://regex101.com/r/M3arvW/1
So I thought I knew a bit of regex, but it seems I have found a case where my knowledge is at its end.
Anyway, I tried the approach from https://stackoverflow.com/questions/53119343/regex-replace-function-in-cases-of-no-match-1-returns-full-line-instead-of-nu
But the main difference is that I want to not only replace the input with the match but also insert some characters in between the matched parts. Simply put, I want to standardize the input to a certain pattern.
I want to match and capture specific parts of the input, but not everything. The regex:
^[\D]*(?P<from_day>(0?[1-9])|([12][0-9])|3[01])[\.\-\s,■]+(?P<from_month>(0?[1-9])|(1[0-2]))[\.\-\s,■]*(?P<until_day>(0?[1-9])|[12][0-9]|3[01])[\.\-\s,■]+(?P<until_month>(0?[1-9])|1[012])[\D]*$
The replacement string:
\g<from_day>.\g<from_month>-\g<until_day>.\g<until_month>
Input:
28.11 16.12
"13.01 23,09"
01.08.-31.12
"01.01,-51.12"
"01,01.-31,12."
01083112
1.02 - 4.3
Current output:
28.11-16.12.-.
13.01-23.09.-.
01.08-31.12.-.
.-..-.
01.01-31.12.-.
.-..-.
1.02-4.3.-.
Expected/desired (lines that do not match should become empty):
28.11-16.12
13.01-23.09
01.08-31.12
01.01-31.12
1.02-4.3
Answer 1
Score: 2
You should change your regex to this:
^\D*(?P<from_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<from_month>1[0-2]|0?[1-9])[.\s,■-]*(?P<until_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<until_month>1[012]|0?[1-9]).*|.+
This takes care of all the issues except the no-match case: the trailing |.+ alternative matches any other line, but it leaves the named groups unset. For that case, pass a lambda function to re.sub so that the replacement is an empty string.
Python Code:
>>> import re
>>> arr = ['"01,01.-31,12."', '01083112', '1.02 - 4.3', '"01.01,-51.12"']
>>> rx = re.compile(r'^\D*(?P<from_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<from_month>1[0-2]|0?[1-9])[.\s,■-]*(?P<until_day>[12]\d|3[01]|0?[1-9])[.\s,■-]+(?P<until_month>1[012]|0?[1-9]).*|.+')
>>> for i in arr: print(rx.sub(lambda m: m.group('from_day') + '.' + m.group('from_month') + '-' + m.group('until_day') + '.' + m.group('until_month') if m.group('from_day') else '', i))
...
01.01-31.12

1.02-4.3

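If the one-liner is hard to read, the same idea can be sketched as a small function. This is only an illustration, assuming each input is a single line; the names SEP, DATE_RANGE and normalize_range are made up here and are not part of the question. It reuses the pattern from this answer in re.VERBOSE form and uses match() instead of the |.+ alternative, so a line that does not match simply yields an empty string.

```python
import re

# Separator characters allowed between the numeric parts
# (copied from the character class in the pattern above).
SEP = r"[.\s,■-]"

# Same groups as the pattern above, spread out with re.VERBOSE.
# The trailing ".*|.+" is not needed because match() is used below.
DATE_RANGE = re.compile(
    rf"""
    ^\D*
    (?P<from_day>[12]\d|3[01]|0?[1-9])   {SEP}+
    (?P<from_month>1[0-2]|0?[1-9])       {SEP}*
    (?P<until_day>[12]\d|3[01]|0?[1-9])  {SEP}+
    (?P<until_month>1[012]|0?[1-9])
    """,
    re.VERBOSE,
)


def normalize_range(text: str) -> str:
    """Return 'dd.mm-dd.mm' for a recognised range, or '' if nothing matches."""
    m = DATE_RANGE.match(text)
    if m is None:
        return ""
    return "{from_day}.{from_month}-{until_day}.{until_month}".format(**m.groupdict())


if __name__ == "__main__":
    samples = [
        "28.11 16.12",
        '"13.01 23,09"',
        "01.08.-31.12",
        '"01.01,-51.12"',
        '"01,01.-31,12."',
        "01083112",
        "1.02 - 4.3",
    ]
    for line in samples:
        # repr() makes the empty results visible.
        print(repr(normalize_range(line)))
```

Checking the match object explicitly avoids relying on how re.sub fills in unset groups, at the cost of handling one line at a time instead of substituting inside a larger text.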