Regex: How to get coordinates without using sub-groups?
Question
I've been trying to extract sets of coordinates in the following format:
[-34.0, 23, 0.555] , [3, 4, 5], ....
For the first set, I wish to extract "-34.0", "23", and "0.555". For the second set, "3", "4", and "5".
I've found a way to do so on Stack Overflow and through my own experiments on <https://regexr.com>, but it means that ".0" and ".555" are also extracted as subgroups, which I do not want.
\[([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?)\]
However, my initial alternatives are not working. Why are these not valid, and how can I create a regex that meets my requirements?
a: Does not register the left bracket of [\d] as a special character, and thus pairs the right bracket with the left bracket of the [\. component
\[([-]?\d+[\.[\d]+]?),\s([-]?\d+[\.[\d]+]?),\s([-]?\d+[\.[\d]+]?)\]
b: Does not register the + sign as a special character
\[([-]?\d+[\.\d+]?),\s([-]?\d+[\.\d+]?),\s([-]?\d+[\.\d+]?)\]
Thank you for your time!
Update:
I have now been made aware of the non-capturing group feature.
First of all - thank you! It did the job I needed.
Second of all - I'm still curious as to why the other options didn't work, so I'll leave this up for the next 24 hours or so, at least.
Update v2:
Questions fully answered. Thank you so much, everyone!
Answer 1
Score: 1
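Your pattern does not match because \d+[\.[\d]+]? matches one or more digits \d+ followed by a character class [\.[\d]+ that repeatedly matches one of the listed characters (a dot, a left bracket, or a digit), and then an optional ]. As a result, a single-digit number such as 3 cannot match, because the character class needs at least one more character after the first digit.
You could write the pattern using 3 capture groups, each with an optional non-capturing group (?:...)? for the fractional part: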
\[(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?)]
See a regex demo.
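As a rough illustration of the difference, here is a minimal Python sketch. Python's re module is an assumption here, since the question does not name a regex flavor, and the variable names are made up for the example. It compares the group count of the original pattern with the non-capturing version and extracts the coordinates from the sample input:

```python
import re

# Original pattern from the question: the fractional parts are capturing groups.
nested = re.compile(r"\[([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?),\s([-]?\d+(\.\d+)?)\]")

# Pattern from this answer: fractional parts use non-capturing groups (?:...).
coords = re.compile(r"\[(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?),\s(-?\d+(?:\.\d+)?)\]")

text = "[-34.0, 23, 0.555],[3, 4, 5]"  # sample input from the question

print(nested.groups)  # 6 capturing groups
print(coords.groups)  # 3 capturing groups

# findall returns one tuple per match, holding only the capturing groups.
triples = coords.findall(text)
print(triples)  # [('-34.0', '23', '0.555'), ('3', '4', '5')]
```

Both patterns match the same text; only the number of reported groups changes.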
Some notation notes:
[-]? ---> -?
[\.] ---> [.] or \.
\d+[\.[\d]+]? ---> I think you meant \d+[.\d]*, where [.\d]* can also match only dots, as the character class allows optional repetition of the listed characters.
For the notation, see character classes.
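To see how the two attempts from the question behave on single numbers, the following sketch may help. It again assumes Python's re module (other flavors, such as Java, treat nested brackets differently), and the helper name full is hypothetical:

```python
import re

# Attempt (a): "[\.[\d]+" is ONE character class (dot, "[", digit) repeated
# one or more times, followed by an optional literal "]".
attempt_a = re.compile(r"-?\d+[\.[\d]+]?")

# Attempt (b): "[\.\d+]?" is ONE optional character: a dot, a digit, or a "+".
attempt_b = re.compile(r"-?\d+[\.\d+]?")

# Suggested rewrite from the notes above; note that it also accepts bare dots.
loose = re.compile(r"\d+[.\d]*")

def full(pattern, s):
    """Return True if the whole string s matches the pattern."""
    return pattern.fullmatch(s) is not None

print(full(attempt_a, "-34.0"))  # True
print(full(attempt_a, "3"))      # False: the class needs one more character
print(full(attempt_a, "3[[]"))   # True: "[" is a literal inside the class

print(full(attempt_b, "3"))      # True
print(full(attempt_b, "-34.0"))  # False: only one character may follow the digits
print(full(attempt_b, "3+"))     # True: "+" is a literal inside the class

print(full(loose, "3..."))       # True: the class can match only dots
```

Inside a character class, [ and + lose their special meaning, which is why neither attempt groups or repeats the way the question intended.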