Golang regexp.ReplaceAllString ignores the replacement string "$X_"

Question

I'm trying to convert CamelCase to snake_case using the regex I found here. Here's a snippet of the code I'm using:

in := "camelCase"
var re1 = regexp.MustCompile(`(.)([A-Z][a-z]+)`)
out := re1.ReplaceAllString(in, "$1_$2")

The regex will match lCase. $1 here is l and $2 is Case, so using the replacement string "$1_$2" should result in camel_Case. Instead, it results in cameCase.

Changing the replacement string to "$1_" results in came. If I change it to "$1+$2", the result will be camel+Case as expected (see playground).

Right now, my workaround is to use "$1+$2" as the replacement string and then use strings.Replace to change the plus sign to an underscore. Is this a bug, or am I doing something wrong here?

Answer 1

Score: 5

The fix is to use ${1}_$2 (or ${1}_${2} for symmetry).

Per https://golang.org/pkg/regexp/#Regexp.Expand (my emphasis):

> In the template, a variable is denoted by a substring of the form
> $name or ${name}, where name is a non-empty sequence of letters,
> digits, and underscores.
>
> ...
>
> In the $name form, name is taken to
> be as long as possible: $1x is equivalent to ${1x}, not ${1}x, and,
> $10 is equivalent to ${10}, not ${1}0.

So in $1_$2, you're actually looking for a group named 1_ and then another group named 2 and putting them together.
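The difference between the braced and unbraced forms can be checked with a short program. This is only a minimal sketch using the pattern and input from the question; the helper name `replace` is made up for illustration and is not part of the regexp package.

```go
package main

import (
	"fmt"
	"regexp"
)

// re is the pattern from the question.
var re = regexp.MustCompile(`(.)([A-Z][a-z]+)`)

// replace is a hypothetical helper, named here only for illustration.
func replace(in, tmpl string) string {
	return re.ReplaceAllString(in, tmpl)
}

func main() {
	// Unbraced: the name after $ is taken as long as possible, so it is "1_".
	fmt.Println(replace("camelCase", "$1_$2")) // cameCase

	// Braced: the name ends at the closing brace, so it is "1".
	fmt.Println(replace("camelCase", "${1}_$2"))   // camel_Case
	fmt.Println(replace("camelCase", "${1}_${2}")) // camel_Case
}
```

Bracing every reference, as in the last call, avoids having to think about which character follows the group number.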

As to why using $1_$2 (or $foo$2 for that matter) results in "cameCase," that same documentation says:

> A reference to an out of range or unmatched index or a name that is
> not present in the regular expression is replaced with an empty slice.

So replacing with "$1_$2" is equivalent to replacing with just "$2".
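To see that equivalence directly, the sketch below runs the templates mentioned above against the question's input. It assumes the same pattern as the question; the helper name `replace` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// re is the pattern from the question; it has two unnamed groups only.
var re = regexp.MustCompile(`(.)([A-Z][a-z]+)`)

// replace is a hypothetical helper, named here only for illustration.
func replace(in, tmpl string) string {
	return re.ReplaceAllString(in, tmpl)
}

func main() {
	// "1_" and "foo" are not groups in the pattern, so each expands to nothing
	// and only $2 ("Case") is written in place of the match "lCase".
	for _, tmpl := range []string{"$1_$2", "$foo$2", "$2"} {
		fmt.Printf("%-8q -> %s\n", tmpl, replace("camelCase", tmpl))
	}

	// With only the unknown name "1_", the match is replaced by nothing.
	fmt.Println(replace("camelCase", "$1_")) // came
}
```

All three templates in the loop produce cameCase, which matches the behavior described in the question.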

huangapple
  • Posted on July 20, 2016 at 13:18:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/38472840.html