Regular expression that matches only the letters in a given word
Question
I am trying to create a regular expression that a test string must match under the following conditions:
- The test string can be any length and contain any letters.
- ALL characters in the regex must be matched.
- Each letter in the regex must occur in the test string EXACTLY as many times as it is repeated in the regex.
- If a letter is repeated in the regex, its occurrences in the test string can be consecutive or separated by other letters.
- The letters in the test string don't have to be in the same order as in the regex.
So if:
TEST STRING = polymmorphiiic
the following regexes would match:
ppy
ooyc
ciii
whereas the following would fail:
pmy
iy
yrz
Your help is appreciated, as I can't get all of the above satisfied.
I have tried breaking down the requirements and creating a regex for each one, but I can't come up with a single regex that covers them all.
Answer 1
Score: 2
You can use re.findall() with a character class built from the pattern to pull the matching characters out of the string. Comparing the sorted result with the sorted pattern then accounts for both letter order and letter counts.
import re

testString = "polymmorphiiic"
for p in ("ppy", "ooyc", "ciii", "pmy", "iy", "yrz"):
    # [{p}] is a character class that matches any single letter of p
    print(p, "matches:", sorted(p) == sorted(re.findall(f"[{p}]", testString)))
ppy matches: True
ooyc matches: True
ciii matches: True
pmy matches: False
iy matches: False
yrz matches: False
Note that you don't need a regular expression to implement this process. A set is sufficient to filter on the matching characters:
for p in ("ppy", "ooyc", "ciii", "pmy", "iy", "yrz"):
    # intersection(ch) is non-empty (truthy) only when ch is a letter of p
    matches = set(p).intersection
    print(p, "matches:", sorted(p) == sorted(filter(matches, testString)))
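The sorted comparison above amounts to checking that each letter of the pattern occurs the same number of times in the test string. A minimal sketch of that idea with collections.Counter follows; the function name letters_match is hypothetical and not part of the original answer.

```python
from collections import Counter

def letters_match(pattern: str, text: str) -> bool:
    """Return True if every letter of `pattern` occurs in `text`
    exactly as many times as it occurs in `pattern`."""
    wanted = Counter(pattern)   # required count per letter
    found = Counter(text)       # actual count per letter (0 if absent)
    return all(found[ch] == n for ch, n in wanted.items())

test_string = "polymmorphiiic"
for p in ("ppy", "ooyc", "ciii", "pmy", "iy", "yrz"):
    print(p, "matches:", letters_match(p, test_string))
```

This counts the test string once per call instead of sorting filtered characters, which may be easier to read when the counts themselves are of interest.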
Even a sequential search would work:
for p in ("ppy", "ooyc", "ciii", "pmy", "iy", "yrz"):
    # p.__contains__(ch) is True when ch occurs anywhere in p
    print(p, "matches:", sorted(p) == sorted(filter(p.__contains__, testString)))
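If a single regular expression is really required, one possible sketch is to generate it from the letters: one lookahead per distinct letter, each asserting that the letter occurs exactly n times in the whole string. The helper name build_exact_count_regex is hypothetical, and this is an alternative to the approaches above, not the original answer's method.

```python
import re
from collections import Counter

def build_exact_count_regex(letters: str) -> re.Pattern:
    """Build a pattern that succeeds only if each letter of `letters`
    occurs in the tested string exactly as often as in `letters`."""
    lookaheads = []
    for ch, n in Counter(letters).items():
        c = re.escape(ch)  # keeps special characters literal inside the class
        # n groups of "non-c characters followed by c", then only non-c to the end
        lookaheads.append(rf"(?=(?:[^{c}]*{c}){{{n}}}[^{c}]*\Z)")
    return re.compile("^" + "".join(lookaheads))

test_string = "polymmorphiiic"
for p in ("ppy", "ooyc", "ciii", "pmy", "iy", "yrz"):
    pattern = build_exact_count_regex(p)
    print(p, "matches:", pattern.match(test_string) is not None)
```

For "ppy" the generated pattern is `^(?=(?:[^p]*p){2}[^p]*\Z)(?=(?:[^y]*y){1}[^y]*\Z)`. Because the lookaheads consume nothing, the order of letters in the test string does not matter.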