Regex: How to get coordinates without using sub-groups?

Question

I've been trying to extract sets of coordinates in the following format:

[-34.0, 23, 0.555] , [3, 4, 5], ....

For the first set, I wish to extract "-34.0", "23", and "0.555". For the second set, "3", "4", and "5".

I've found a way to do so on Stack Overflow and through my own experiments on <https://regexr.com>, but it also extracts ".0" and ".555" as subgroups, which I do not want.

\[([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?)\]

However, my initial alternatives are not working. Why are they not valid, and how can I create a regex that meets my requirements?

a: Does not register the left bracket on [\d] as a special character and thus associates the right bracket to the [\. component's left bracket

\[([-]?\d+[\.[\d]+]?),\s([-]?\d+[\.[\d]+]?),\s([-]?\d+[\.[\d]+]?)\]

b: Does not register the + sign as a special character

\[([-]?\d+[\.\d+]?),\s([-]?\d+[\.\d+]?),\s([-]?\d+[\.\d+]?)\]

Thank you for your time!

Update:

I have now been made aware of the non-capturing group feature.

First of all - thank you! It did the job I needed.

Second of all - I'm still curious as to why the other options didn't work, so I'll leave this up for the next 24 hours or so, at least.

Update v2:

Questions fully answered. Thank you so much, everyone!

Answer 1

Score: 1

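Your pattern does not match because, in most regex flavors, \d+[\.[\d]+]? is read as one or more digits \d+, followed by the character class [\.[\d] repeated one or more times (matching any of the listed characters: a dot, a literal [ or a digit), and then an optional literal ]. Because the repeated character class is not optional, a single-digit value such as 3 cannot match.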
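
To see that parsing in action, here is a minimal sketch using Python's re module (an assumption, since the question does not name a language). It compares the single-number part of alternative (a) with a spelled-out version that escapes the literal brackets. The sample strings are made up for illustration.

```python
import re

# Single-number part of alternative (a), exactly as written in the question.
pattern_a = re.compile(r"-?\d+[\.[\d]+]?")

# The same pattern written the way the engine reads it:
# digits, then one or more of '.', '[' or a digit, then an optional ']'.
spelled_out = re.compile(r"-?\d+[.\[\d]+\]?")

# Hypothetical test strings (not from the question).
samples = ["3", "23", "0.555", "1[[2]"]

results_a = [bool(pattern_a.fullmatch(s)) for s in samples]
results_spelled = [bool(spelled_out.fullmatch(s)) for s in samples]

print(results_a)        # [False, True, True, True]
print(results_spelled)  # [False, True, True, True]
```

Both patterns behave identically: a lone "3" fails because the character class must match at least one character, while a nonsense string like "1[[2]" is accepted.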

You could write the pattern using 3 capture groups, each containing an optional non-capturing group (?:...)? for the decimal part:

\[(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?)]
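
As a rough illustration, the pattern above can be tried in Python (chosen here as an assumption; the question itself only mentions regexr.com). The input string is adapted from the question's example, and the names COORD and WITH_SUBGROUPS are made up for this sketch.

```python
import re

# Pattern from the answer: three capture groups, with the decimal
# parts wrapped in non-capturing groups (?:...).
COORD = re.compile(
    r"\[(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?)]"
)

# Original pattern from the question, kept for comparison.
WITH_SUBGROUPS = re.compile(
    r"\[([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?)\]"
)

# Sample input in the format described in the question.
text = "[-34.0, 23, 0.555],[3, 4, 5]"

# findall returns one tuple per match, one entry per capture group.
triples = COORD.findall(text)
print(triples)  # [('-34.0', '23', '0.555'), ('3', '4', '5')]

# The non-capturing version exposes 3 groups instead of 6.
print(COORD.groups, WITH_SUBGROUPS.groups)  # 3 6
print(WITH_SUBGROUPS.findall(text)[0])
# ('-34.0', '.0', '23', '', '0.555', '.555')
```

The last line shows the unwanted ".0" and ".555" subgroups that the non-capturing groups avoid.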

See a regex demo.

Some notation notes:

  • [-]? ---> -?
  • [\.] ---> [.] or \.
  • \d+[\.[\d]+]? ---> I think you meant \d+[.\d]*, but note that [.\d]* can also match only dots (for example 1...), because the character class allows any of the listed characters to repeat zero or more times.

For the notation, see character classes
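
The notation notes above can be checked with a short sketch, again assuming Python's re module. The sample strings are hypothetical. The last check applies the same character-class rule to alternative (b) from the question: inside [...] the + is a literal plus sign, not a quantifier.

```python
import re

# [\.], [.] and \. all match exactly one literal dot.
dot_forms = [r"[\.]", r"[.]", r"\."]
dot_ok = all(
    re.fullmatch(p, ".") and not re.fullmatch(p, "a") for p in dot_forms
)

# \d+[.\d]* is looser than \d+(?:\.\d+)?: it also accepts stray dots.
loose = re.compile(r"\d+[.\d]*")
strict = re.compile(r"\d+(?:\.\d+)?")

samples = ["3", "0.555", "1...", "1.2.3"]
loose_hits = [bool(loose.fullmatch(s)) for s in samples]
strict_hits = [bool(strict.fullmatch(s)) for s in samples]

print(dot_ok)       # True
print(loose_hits)   # [True, True, True, True]
print(strict_hits)  # [True, True, False, False]

# Alternative (b): [\.\d+]? is ONE optional character that may be
# a dot, a digit or a literal '+'.
pattern_b = re.compile(r"\d+[\.\d+]?")
b_hits = [bool(pattern_b.fullmatch(s)) for s in ["3+", "0.555"]]
print(b_hits)       # [True, False]
```

This is why the optional group (?:\.\d+)? is the safer choice when each number should contain at most one decimal point.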

huangapple
  • Posted on August 4, 2023 at 00:46:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76830103.html