Regex expression to find number sequence?
Question
I am trying to use regex to return the ticket number from the following body of text:
TICKET #IM40135514 OPENED
In this case, it should return
IM40135514
I am not sure what the proper regex expression would be. I tried
number=re.findall("TICKET (\w{2}d{7}+))", filetext)
but keep getting an error.
Answer 1
Score: 1
Using re.findall should be fine here:
import re

inp = "TICKET #IM40135514 OPENED"
nums = re.findall(r'\bTICKET #(\S+)', inp)
print(nums)  # ['IM40135514']
Note that I am using a raw string for the regex pattern, which is indicated by the r prefix.
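As a small illustration of why the raw string matters, here is a sketch under the assumption that the input may contain several ticket lines (the second line of `text` is a made-up sample, not from the question). Without the r prefix, Python reads "\b" as a backspace character rather than a regex word boundary.

```python
import re

# A raw string keeps each backslash literally, so these spell the same pattern.
assert r"\bTICKET #(\S+)" == "\\bTICKET #(\\S+)"

# Without the r prefix, "\b" is the backspace character, not a word boundary.
assert "\b" == "\x08"

# Hypothetical multi-line input; only the first line comes from the question.
text = "TICKET #IM40135514 OPENED\nTICKET #IM40135515 CLOSED"

# With one capture group, findall returns the captured text of every match.
tickets = re.findall(r"\bTICKET #(\S+)", text)
print(tickets)  # ['IM40135514', 'IM40135515']
```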
Answer 2
Score: 1
Regex is not required here. You can just split() the text on whitespace, grab the middle string, and remove the "#".
s = "TICKET #IM40135514 OPENED"
ticket = s.split()[1].replace("#", "")
print(ticket)
and the ticket number is returned:
IM40135514
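The split() approach relies on the ticket being the second word. If that position is not guaranteed, one possible variation is to look for the first token that starts with "#". This is only a sketch: `extract_ticket` is a hypothetical helper name, and the second input is a made-up example.

```python
def extract_ticket(line):
    """Return the first whitespace-separated token starting with '#', without the '#'."""
    for token in line.split():
        if token.startswith("#"):
            return token[1:]
    return None  # no token starting with '#' was found


print(extract_ticket("TICKET #IM40135514 OPENED"))  # IM40135514
print(extract_ticket("no ticket here"))             # None
```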
Answer 3
Score: 0
There are a few things wrong with your expression.
- Unless you use a raw string, you'll need to double the backslashes so that the regex engine gets the full escape sequence. r"\w" is equivalent to "\\w".
- You're missing a # in your expression.
- You need to escape the d in \w{2}\d{7}.
- The + means "repeat one or more times", and it doesn't compile after {7} like that.
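To see the failure directly, the pattern from the question can be handed to re.compile inside a try block. This is a minimal sketch: the pattern is written here as a raw string, and the exact error message may differ between Python versions, so it is printed rather than assumed. Note that the pattern also ends with an extra closing parenthesis, which is an error on its own.

```python
import re

# The pattern from the question, written as a raw string for clarity.
broken = r"TICKET (\w{2}d{7}+))"

try:
    re.compile(broken)
    compiled = True
except re.error as exc:
    # The message text depends on the Python version, so just report it.
    compiled = False
    print("re.error:", exc)

print(compiled)  # False
```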
I recommend using regex101.com to build your expressions, as it breaks them down and explains them to you.
The expression r"TICKET #(\w{2}\d{7})" closely matches what you had and might work for you. Note that \w matches digits as well, so if you specifically want letters, you can use [a-zA-Z]{2}.
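One detail worth checking: the sample ticket IM40135514 has eight digits after the two letters, so \d{7} captures only the first seven. A minimal sketch comparing that pattern with an assumed variant that uses \d+ (the variable names are illustrative, and whether ticket numbers always have the same length is not stated in the question):

```python
import re

text = "TICKET #IM40135514 OPENED"

# \d{7} stops after exactly seven digits, so the last digit is dropped.
seven_digits = re.findall(r"TICKET #(\w{2}\d{7})", text)
print(seven_digits)  # ['IM4013551']

# Assumed variant: two letters followed by one or more digits.
all_digits = re.findall(r"TICKET #([a-zA-Z]{2}\d+)", text)
print(all_digits)  # ['IM40135514']
```

If the ticket format is known to be fixed at eight digits, \d{8} would be the stricter choice.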