Regular expression that matches only the letters in a given word

Question

I am trying to create a regular expression that matches a test string under the following conditions:

  1. The test string can be any length and contain any letters.
  2. ALL characters in the regex must be matched.
  3. Each letter in the regex must occur in the test string exactly as many times as it is repeated in the regex.
  4. If a letter is repeated more than once in the regex, its occurrences in the test string can be consecutive or separated by other letters.
  5. The letters in the test string don't have to be in the same order as in the regex.

So if:

TEST STRING = polymmorphiiic

the following regexes would match:

ppy
ooyc
ciii

whereas the following would fail:

pmy
iy
yrz

Your help would be appreciated, as I can't get all of the above satisfied at once.

I have tried breaking down the requirements and creating a regex for each one, but I can't come up with a single expression that covers them all.

Answer 1

Score: 2

You can use re.findall() to pull the matching characters out of the string. Comparing the sorted result with the sorted pattern then handles both the differences in order and the character counts.

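import re

testString = "polymmorphiiic"

for p in ("ppy", "ooyc", "ciii", "pmy", "iy", "yrz"):
    # [p] is a character class: findall returns every character of testString that appears in p
    found = re.findall(f"[{p}]", testString)
    print(p, "matches:", sorted(p) == sorted(found))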

ppy matches: True
ooyc matches: True
ciii matches: True
pmy matches: False
iy matches: False
yrz matches: False
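
The same check can be wrapped in a reusable function, shown here as a minimal sketch. The name `letters_match` is hypothetical and not part of the original answer. The call to `re.escape` is an added precaution, assuming a pattern might contain characters such as `]` or `^` that are special inside a character class. Treating an empty pattern as a trivial match is likewise an assumption.

```python
import re

def letters_match(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if every character of `pattern` occurs in
    `text` exactly as many times as it occurs in `pattern`."""
    if not pattern:
        # Assumption: an empty pattern imposes no constraints ("[]" is not a valid regex).
        return True
    # Escape the pattern so that characters like ']' or '^' are taken literally.
    char_class = "[" + re.escape(pattern) + "]"
    found = re.findall(char_class, text)
    # Sorting both sides ignores order but keeps the per-character counts.
    return sorted(pattern) == sorted(found)

if __name__ == "__main__":
    for p in ("ppy", "ooyc", "ciii", "pmy", "iy", "yrz"):
        print(p, "matches:", letters_match(p, "polymmorphiiic"))
```

For the letter-only patterns in the question, this behaves the same as the loop above.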

Note that you don't need a regular expression to implement this process; a set is sufficient to filter the matching characters:

for p in ("ppy", "ooyc", "ciii", "pmy", "iy", "yrz"):
    # matches(ch) returns a non-empty (truthy) set only when ch is one of the characters in p
    matches = set(p).intersection
    print(p, "matches:", sorted(p) == sorted(filter(matches, testString)))

Even a sequential search would work:

for p in ("ppy", "ooyc", "ciii", "pmy", "iy", "yrz"):
    # p.__contains__(ch) is the same test as `ch in p`, done by scanning the pattern string
    print(p, "matches:", sorted(p) == sorted(filter(p.__contains__, testString)))
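
All three variants boil down to the same condition: each letter of the pattern must occur in the test string exactly as often as it occurs in the pattern. As an alternative sketch (not from the original answer), that condition can be checked directly with `collections.Counter`, without sorting. The function name `counts_match` is hypothetical.

```python
from collections import Counter

def counts_match(pattern: str, text: str) -> bool:
    """Hypothetical helper: compare per-letter counts of pattern and text."""
    wanted = Counter(pattern)  # how often each letter appears in the pattern
    found = Counter(text)      # how often each letter appears in the test string
    # A Counter returns 0 for missing keys, so absent letters fail the comparison.
    return all(found[ch] == n for ch, n in wanted.items())

if __name__ == "__main__":
    for p in ("ppy", "ooyc", "ciii", "pmy", "iy", "yrz"):
        print(p, "matches:", counts_match(p, "polymmorphiiic"))
```

Letters of the test string that do not appear in the pattern are simply ignored, which matches the behaviour of the filtered comparisons above.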
