How to match characters between two occurrences of the same but random string

Question

The base string looks like this:

repeatedRandomStr ABCXYZ /an/arbitrary/@#-~/sequence/of_characters=I+WANT+TO+MATCH/repeatedRandomStr/the/rest/of/strings.etc

The things I know about this base string are:

  • ABCXYZ is constant and always present.
  • repeatedRandomStr is random, but its first occurrence is always at the beginning, before ABCXYZ.

So far I have looked at regex context matching, recursion, and subroutines, but I couldn't come up with a solution myself.

My current working solution is to first determine what repeatedRandomStr is with:

^(.*)\sABCXYZ

and then use:

repeatedRandomStr\sABCXYZ\s(.*)\srepeatedRandomStr

to capture what I want in $1. However, this requires two separate regex queries. I want to know whether this can be done in a single execution.

Answer 1

Score: 1

In Go, whose regexp package implements RE2 syntax, there is no way other than yours: first extract the value before ABCXYZ, then build a second regex that matches the text between the two occurrences of that value. RE2 does not support backreferences and is not going to.
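
The two-step approach could look roughly like the sketch below. It uses only the standard regexp package; the function name betweenRepeats is made up for illustration, and the second pattern assumes you want everything up to the last later occurrence of the prefix. regexp.QuoteMeta is there because the random prefix may itself contain regex metacharacters.

```go
package main

import (
	"fmt"
	"regexp"
)

// prefixRe captures everything before the constant marker ABCXYZ.
var prefixRe = regexp.MustCompile(`^(.*?)\s+ABCXYZ\s`)

// betweenRepeats (hypothetical helper) returns the text between
// "<prefix> ABCXYZ " and the last later occurrence of <prefix>.
// ok is false when the input does not have that shape.
func betweenRepeats(s string) (middle string, ok bool) {
	m := prefixRe.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	// Step 2: build a second pattern from the captured prefix.
	// QuoteMeta escapes any regex metacharacters in the random prefix.
	quoted := regexp.QuoteMeta(m[1])
	second, err := regexp.Compile(`^` + quoted + `\s+ABCXYZ\s(.*)` + quoted)
	if err != nil {
		return "", false
	}
	m2 := second.FindStringSubmatch(s)
	if m2 == nil {
		return "", false
	}
	return m2[1], true
}

func main() {
	s := "repeatedRandomStr ABCXYZ /an/arbitrary/@#-~/sequence/of_characters=I+WANT+TO+MATCH/repeatedRandomStr/the/rest/of/strings.etc"
	middle, ok := betweenRepeats(s)
	fmt.Printf("%q %v\n", middle, ok)
}
```

It is still two regex executions, but wrapped in one call, so the calling code only sees a single step.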

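If the regex flavor can be switched to PCRE or a compatible one, you can use a backreference:

^(.*?)\s+ABCXYZ\s(.*)\1
^(.*?)\s+ABCXYZ\s(.*?)\1

The first variant captures up to the last repetition of Group 1; the second, with the lazy (.*?), stops at the first repetition.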

See the regex demo.

Details:

  • ^ - start of string
  • (.*?) - Group 1: zero or more characters other than line break characters, as few as possible
  • \s+ - one or more whitespace characters
  • ABCXYZ - the constant string
  • \s - a single whitespace character
  • (.*) - Group 2: zero or more characters other than line break characters, as many as possible
  • \1 - the same value as captured in Group 1.
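
If you have to stay in Go, the same grouping can be imitated in one pass with plain string functions instead of a regex. This is a different technique from the pattern above, not a translation of it: the sketch assumes exactly one space on each side of ABCXYZ (the pattern allows \s+ before it), and matchBetween is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// matchBetween (hypothetical helper) mimics Groups 1 and 2 without a regex.
// With greedy=true, Group 2 runs to the last repeat of Group 1, like (.*);
// with greedy=false it stops at the first repeat, like (.*?).
func matchBetween(s string, greedy bool) (group1, group2 string, ok bool) {
	// Assumption: the separator is exactly " ABCXYZ " (single spaces).
	prefix, rest, found := strings.Cut(s, " ABCXYZ ")
	if !found {
		return "", "", false
	}
	var end int
	if greedy {
		end = strings.LastIndex(rest, prefix)
	} else {
		end = strings.Index(rest, prefix)
	}
	if end < 0 {
		// The prefix never repeats after the marker.
		return "", "", false
	}
	return prefix, rest[:end], true
}

func main() {
	s := "id ABCXYZ a/id/b/id/c"
	_, greedy, _ := matchBetween(s, true)
	_, lazy, _ := matchBetween(s, false)
	fmt.Println(greedy) // a/id/b/
	fmt.Println(lazy)   // a/
}
```

strings.Cut requires Go 1.18 or later; on older versions strings.Index with manual slicing does the same job.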

huangapple
  • Posted on January 30, 2023, 16:59:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75282106.html