Python regex with optional Groups, one match at least

Question

So I thought I could just quickly do a re.match() on my given string, but I'm stuck.

Given a list of strings:

  • The Time is 12H 3M 12S
  • The Time is 3M 12S
  • It is 12H 3M
  • Ready in 12S
  • The Time is 6H

I would like to extract the values into 3 groups, H, M and S, with something like:
> (?: (\d{1,2})H)?
>
> (?: (\d{1,2})M)?
>
> (?: (\d{1,2})S)?

I could then easily access the H, M and S components via group(1) to group(3). I would just like to restrict the match so that at least one of the optional groups has to be triggered, or else there is no match. Otherwise, I guess, every part of the expression is optional, so it can match the empty string and therefore matches everything.
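To illustrate that concern, here is a minimal sketch that joins the three optional groups from above into one pattern with nothing forcing a match. The names ALL_OPTIONAL and text are chosen only for this illustration.

```python
import re

# The three optional groups from the question, with nothing requiring any of them.
ALL_OPTIONAL = re.compile(r'(?: (\d{1,2})H)?(?: (\d{1,2})M)?(?: (\d{1,2})S)?')

text = 'The Time is 12H 3M 12S'

# search() succeeds right away at index 0 with an empty match,
# because every group is allowed to match nothing.
m = ALL_OPTIONAL.search(text)
print(m.span())    # (0, 0)
print(m.groups())  # (None, None, None)
```

The match is found, but it is empty and sits at the very start of the string, so none of the numbers are captured.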

Here's a link to the example:
https://regex101.com/r/LKAKbx/5

How can I get only the numbers as groups from the match, e.g.:

> The Time is 12H 3M 12S
>
> group(1) = 12, group(2) = 3, group(3) = 12

Or

> Ready in 12S
>
> group(1) = None, group(2) = None, group(3) = 12

Answer 1

Score: 2

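Use a positive lookahead to make sure at least one of H, M or S is present. The lookahead (?= \d{1,2}[HMS]) asserts that a space, one or two digits and one of the letters H, M or S follow at the current position, so the otherwise all-optional pattern can no longer succeed with an empty match.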

import re

strings = [
    'The Time is 12H 3M 12S',
    'The Time is 3M 12S',
    'It is 12H 3M',
    'Ready in 12S',
    'The Time is 6H',
]

for s in strings:
    # re.search() rather than re.match(): the time is not at the start of the string
    res = re.search(r'(?= \d{1,2}[HMS])(?: (\d{1,2})H)?(?: (\d{1,2})M)?(?: (\d{1,2})S)?', s)
    #          here __^^^^^^^^^^^^^^^^^
    print(res.groups())

Output:

('12', '3', '12')
(None, '3', '12')
('12', '3', None)
(None, None, '12')
('6', None, None)
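If the input may contain strings without any time at all, re.search() returns None and res.groups() would raise an AttributeError. The following is a sketch of one way to wrap the same lookahead pattern, not part of the original answer. The helper name parse_hms, the group names h, m and s, and the conversion to int are all assumptions made for illustration.

```python
import re

# Same idea as above: the lookahead demands at least one " <digits>[HMS]" token.
HMS_PATTERN = re.compile(
    r'(?= \d{1,2}[HMS])'
    r'(?: (?P<h>\d{1,2})H)?'
    r'(?: (?P<m>\d{1,2})M)?'
    r'(?: (?P<s>\d{1,2})S)?'
)


def parse_hms(text):
    """Return (hours, minutes, seconds) as ints or None, or None if nothing matches."""
    match = HMS_PATTERN.search(text)
    if match is None:
        return None  # no H, M or S token found anywhere in the string
    return tuple(
        int(value) if value is not None else None
        for value in match.group('h', 'm', 's')
    )


print(parse_hms('The Time is 12H 3M 12S'))  # (12, 3, 12)
print(parse_hms('Ready in 12S'))            # (None, None, 12)
print(parse_hms('No time here'))            # None
```

Named groups are optional here; match.groups() with the numbered groups from the answer works equally well.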

Posted by huangapple on 2020-01-04 00:21:10
Source: https://go.coder-hub.com/59581903.html