Problem capturing a specific group in a regular expression

Question

I'm trying to develop a regular expression that matches data with this format:

430::1820-07-27::Vitorino Pinheiro Lacerda::Rodrigo Pinheiro Lacerda::Custodia Maria Alvares::::
430::1873-05-12::Vitorino Teixeira Pires::Jose Teixeira::Ana Martins Pires::Doc.danificado.::
425::1724-09-06::Xavier Araujo Costa::Bernardo Araujo::Angela Costa::::
425::1714-07-30::Xavier Araujo Ferreira::Geraldo Araujo::Ana Ferreira::Jose Araujo Ferreira,Irmao. Proc.21011.::
425::1689-11-02::Xisto Magalhaes Cunha::Francisco Fernandes::Maria Francisca::Doc.danificado.::
426::1898-11-18::Zacarias Rodrigues Mano::Manuel Rodrigues Mano::Felicidade Jesus Tarrio::::
426::1900-11-12::Zacarias Silva Mariz::Luis Silva Mariz::Felicidade Correia Santos::::
426::1785-10-20::Zeferino Antonio Pereira Nobre::Antonio Pereira Nobre::Maria Josefa Garcia::::
425::1809-01-27::Zeferino Antonio Vassalo::Simao Vassalo::Maria Jose::::

So far I have managed to capture most of the groups, but I am struggling to write an expression for a group that captures the content between a "," and the next "." that follows it. For example, in the fourth line above, the content that should be captured is "Irmao".

Here's the regular expression I obtained so far:

(?P<folder>\d*)::(?P<date>\d{4}-\d{2}-\d{2})::(?P<name>.+?)::(?P<father>.+?)::(?P<mother>.+?)::(?P<observations>.*((?:,)?(?P<family>[^\.]*)?))(?:::)?

Answer 1

Score: 1

Inside the named group observations, you can optionally match a comma followed by any characters except a comma or dot (captured as family), up to the first dot that follows.

(?P<observations>[^,\n]*(?:,(?P<family>[^,.\n]*)\.)?.*)::
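To see how this sub-pattern behaves on its own, here is a minimal sketch using Python's re module (an assumption, based on the (?P<name>...) syntax in the question). The helper name family_of and the three test strings are only illustrative; the strings are the trailing parts of the sample lines.

```python
import re

# The observations sub-pattern from the answer, on its own.
OBSERVATIONS = re.compile(
    r"(?P<observations>[^,\n]*(?:,(?P<family>[^,.\n]*)\.)?.*)::"
)


def family_of(tail):
    """Return the text between the first ',' and the next '.', or None.

    `tail` is the part of a record after the mother field, including the
    closing '::'. `family_of` is a hypothetical helper, not part of the answer.
    """
    m = OBSERVATIONS.match(tail)
    return m.group("family") if m else None


if __name__ == "__main__":
    print(family_of("Jose Araujo Ferreira,Irmao. Proc.21011.::"))  # Irmao
    print(family_of("Doc.danificado.::"))  # None (no comma present)
    print(family_of("::"))  # None (empty observations)
```

Because the comma part is wrapped in an optional non-capturing group, family is None whenever the observations contain no comma, so the rest of the record still matches.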

The full pattern:

(?P<folder>\d*)::(?P<date>\d{4}-\d{2}-\d{2})::(?P<name>.+?)::(?P<father>.*?)::(?P<mother>.*?)::(?P<observations>[^,\n]*(?:,(?P<family>[^,.\n]*)\.)?.*)::
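As a rough illustration, the full pattern can be applied line by line to a few of the sample records from the question. This sketch again assumes Python's re module; the names RECORD, SAMPLE and parse_records are made up for the example.

```python
import re

# Full pattern from the answer, split across lines for readability only.
RECORD = re.compile(
    r"(?P<folder>\d*)::(?P<date>\d{4}-\d{2}-\d{2})::(?P<name>.+?)::"
    r"(?P<father>.*?)::(?P<mother>.*?)::"
    r"(?P<observations>[^,\n]*(?:,(?P<family>[^,.\n]*)\.)?.*)::"
)

# Three of the sample lines from the question.
SAMPLE = """\
430::1820-07-27::Vitorino Pinheiro Lacerda::Rodrigo Pinheiro Lacerda::Custodia Maria Alvares::::
425::1714-07-30::Xavier Araujo Ferreira::Geraldo Araujo::Ana Ferreira::Jose Araujo Ferreira,Irmao. Proc.21011.::
425::1689-11-02::Xisto Magalhaes Cunha::Francisco Fernandes::Maria Francisca::Doc.danificado.::
"""


def parse_records(text):
    """Return one dict of named groups per line that matches RECORD."""
    records = []
    for line in text.splitlines():
        m = RECORD.match(line)
        if m:
            records.append(m.groupdict())
    return records


if __name__ == "__main__":
    for rec in parse_records(SAMPLE):
        print(rec["name"], "|", rec["observations"], "|", rec["family"])
```

For the second sample line the family group holds "Irmao"; for the other two it is None, while observations is either empty or "Doc.danificado.".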


huangapple
  • Posted on 2023-03-04 09:43:12