How to match characters between two occurrences of the same but random string
Question
The base string looks like:
repeatedRandomStr ABCXYZ /an/arbitrary/@#-~/sequence/of_characters=I+WANT+TO+MATCH/repeatedRandomStr/the/rest/of/strings.etc
The things I know about this base string are:
- ABCXYZ is constant and always present.
- repeatedRandomStr is random, but its first occurrence is always at the beginning and before ABCXYZ.
So far I have looked at regex context matching, recursion and subroutines, but couldn't come up with a solution myself.
My currently working solution is to first determine what repeatedRandomStr is with:
^(.*)\sABCXYZ
and then use:
repeatedRandomStr\sABCXYZ\s(.*)\srepeatedRandomStr
to match what I want in $1. But this requires two separate regex queries. I want to know if this can be done in a single execution.
Answer 1
Score: 1
In Go, whose regexp package follows the RE2 syntax, there is no way other than yours: first extract the value before ABCXYZ, then build a second regex that matches the text between the two occurrences of that value. RE2 does not, and will not, support backreferences.
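The two-step approach can be sketched in Go roughly as follows. This is a minimal sketch, not a definitive implementation: the helper name extractBetween is hypothetical, and the patterns assume the layout from the question (random value, whitespace, ABCXYZ, whitespace, payload). regexp.QuoteMeta is used so that regex metacharacters in the random value are treated literally.

```go
package main

import (
	"fmt"
	"regexp"
)

// prefixRe captures everything before the constant marker ABCXYZ (step 1).
var prefixRe = regexp.MustCompile(`^(.*?)\s+ABCXYZ\s`)

// extractBetween is a hypothetical helper: it returns the text between
// "<random> ABCXYZ " and the last later occurrence of <random>.
func extractBetween(s string) (string, bool) {
	m := prefixRe.FindStringSubmatch(s)
	if m == nil || m[1] == "" {
		return "", false // marker missing or nothing in front of it
	}
	// Step 2: build the second pattern from the captured value.
	// QuoteMeta escapes any regex metacharacters in the random string.
	random := regexp.QuoteMeta(m[1])
	betweenRe, err := regexp.Compile(`^` + random + `\s+ABCXYZ\s(.*)` + random)
	if err != nil {
		return "", false
	}
	m2 := betweenRe.FindStringSubmatch(s)
	if m2 == nil {
		return "", false // the random value never repeats
	}
	return m2[1], true
}

func main() {
	s := "repeatedRandomStr ABCXYZ /an/arbitrary/@#-~/sequence/of_characters=I+WANT+TO+MATCH/repeatedRandomStr/the/rest/of/strings.etc"
	got, ok := extractBetween(s)
	fmt.Printf("%q %v\n", got, ok)
}
```

With the sample string, the captured text keeps the trailing / that precedes the second repeatedRandomStr. Changing (.*) to (.*?) in the second pattern would stop at the first later occurrence instead of the last.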
If the regex flavor can be switched to PCRE or a compatible one, you can use
^(.*?)\s+ABCXYZ\s(.*)\1
or, to stop at the first repeated occurrence rather than the last,
^(.*?)\s+ABCXYZ\s(.*?)\1
See the regex demo.
Details:
- ^ - start of string
- (.*?) - Group 1: zero or more chars other than line break chars, as few as possible
- \s+ - one or more whitespaces
- ABCXYZ - some constant string
- \s - a whitespace
- (.*) - Group 2: zero or more chars other than line break chars, as many as possible
- \1 - the same value as in Group 1.
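For context, here is a small check of why the flavor switch is needed. It is only a sketch, with a hypothetical helper named compiles: Go's standard regexp package accepts the pattern without the backreference but returns a compile error once the trailing \1 is added.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Same pattern, without and with the backreference to Group 1.
	fmt.Println(compiles(`^(.*?)\s+ABCXYZ\s(.*)`))
	fmt.Println(compiles(`^(.*?)\s+ABCXYZ\s(.*)\1`))
}
```

The first call reports true and the second false, so in Go the value of Group 1 has to be carried into a second pattern by hand.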