How to match characters between two occurrences of the same but random string
Question
Base string looks like:
repeatedRandomStr ABCXYZ /an/arbitrary/@#-~/sequence/of_characters=I+WANT+TO+MATCH/repeatedRandomStr/the/rest/of/strings.etc
The things I know about this base string are:
- ABCXYZ is constant and always present.
- repeatedRandomStr is random, but its first occurrence is always at the beginning and before ABCXYZ.
So far I have looked at regex context matching, recursion, and subroutines, but couldn't come up with a solution myself.
My current working solution is to first determine what repeatedRandomStr is with:
^(.*)\sABCXYZ
and then use:
repeatedRandomStr\sABCXYZ\s(.*)\srepeatedRandomStr
to match what I want in $1. But this requires two separate regex queries. I want to know if this can be done in a single execution.
Answer 1
Score: 1
In Go, which uses the RE2 library, there is no way other than yours: first extract the value before ABCXYZ, then use a second regex to match the text between the two strings. RE2 does not support backreferences and will not support them.
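The two-step approach could be sketched in Go roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the helper name extractBetween is hypothetical, and the first pattern is assumed to have the shape ^(.*?)\s+ABCXYZ\s. The captured value is escaped with regexp.QuoteMeta before it is spliced into the second pattern, since a random string may contain regex metacharacters.

```go
package main

import (
	"fmt"
	"regexp"
)

// prefixRe is step 1: capture whatever precedes the constant marker ABCXYZ.
var prefixRe = regexp.MustCompile(`^(.*?)\s+ABCXYZ\s`)

// extractBetween (hypothetical helper) returns the text between
// "<value> ABCXYZ " and the last later occurrence of <value>.
func extractBetween(s string) (string, bool) {
	first := prefixRe.FindStringSubmatch(s)
	if first == nil || first[1] == "" {
		return "", false
	}
	// Step 2: build a second pattern from the escaped captured value.
	quoted := regexp.QuoteMeta(first[1])
	secondRe, err := regexp.Compile(`^` + quoted + `\s+ABCXYZ\s(.*)` + quoted)
	if err != nil {
		return "", false
	}
	second := secondRe.FindStringSubmatch(s)
	if second == nil {
		return "", false
	}
	return second[1], true
}

func main() {
	s := "repeatedRandomStr ABCXYZ /an/arbitrary/@#-~/sequence/of_characters=I+WANT+TO+MATCH/repeatedRandomStr/the/rest/of/strings.etc"
	got, ok := extractBetween(s)
	fmt.Printf("%q %v\n", got, ok)
	// prints "/an/arbitrary/@#-~/sequence/of_characters=I+WANT+TO+MATCH/" true
}
```

Note that the second pattern here does not require whitespace before the repeated value, because in the sample string it is preceded by a slash.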
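If the regex flavor can be switched to PCRE or a compatible one, you can use
^(.*?)\s+ABCXYZ\s(.*)\1
^(.*?)\s+ABCXYZ\s(.*?)\1
The first pattern matches up to the last occurrence of the repeated string (greedy Group 2), the second up to its first occurrence (lazy Group 2).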
See the regex demo.
Details:
- ^ - start of string
- (.*?) - Group 1: zero or more chars other than line break chars, as few as possible
- \s+ - one or more whitespace chars
- ABCXYZ - a constant string
- \s - a whitespace char
- (.*) - Group 2: zero or more chars other than line break chars, as many as possible
- \1 - the same value as in Group 1.
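Since Go's regexp package cannot run a pattern containing \1, the "as many as possible" versus "as few as possible" behaviour of Group 2 can be approximated with one regex plus a substring search. The sketch below is only an illustration: the helper name between and its greedy flag are hypothetical, and it does not reproduce the backtracking a PCRE engine would do on Group 1.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// headRe covers the part RE2 can handle: Group 1 plus the constant marker.
var headRe = regexp.MustCompile(`^(.*?)\s+ABCXYZ\s`)

// between (hypothetical helper) cuts at the last later occurrence of the
// Group 1 value when greedy is true, and at the first one otherwise.
func between(s string, greedy bool) (string, bool) {
	loc := headRe.FindStringSubmatchIndex(s)
	if loc == nil {
		return "", false
	}
	key := s[loc[2]:loc[3]] // value of Group 1
	rest := s[loc[1]:]      // text after "ABCXYZ" and one whitespace char
	if key == "" {
		return "", false
	}
	// "." does not cross line breaks, so only search the current line.
	if nl := strings.IndexByte(rest, '\n'); nl >= 0 {
		rest = rest[:nl]
	}
	var end int
	if greedy {
		end = strings.LastIndex(rest, key)
	} else {
		end = strings.Index(rest, key)
	}
	if end < 0 {
		return "", false
	}
	return rest[:end], true
}

func main() {
	s := "key ABCXYZ a/key/b/key/c"
	g, _ := between(s, true)
	l, _ := between(s, false)
	fmt.Printf("%q %q\n", g, l) // prints "a/key/b/" "a/"
}
```

The two modes only differ when the repeated value occurs more than once after ABCXYZ.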