Golang regexp.ReplaceAllString ignores the replacement string "$X_"
Question
I'm trying to convert CamelCase to snake_case using the regex I found here. Here's a snippet of the code I'm using:

    in := "camelCase"
    var re1 = regexp.MustCompile(`(.)([A-Z][a-z]+)`)
    out := re1.ReplaceAllString(in, "$1_$2")

The regex will match `lCase`. `$1` here is `l` and `$2` is `Case`, so using the replacement string `"$1_$2"` should result in `camel_Case`. Instead, it results in `cameCase`.

Changing the replacement string to `"$1_"` results in `came`. If I change it to `"$1+$2"`, the result will be `camel+Case` as expected (see playground).

Right now, my workaround is to use `"$1+$2"` as the replacement string and then use `strings.Replace` to change the plus sign to an underscore. Is this a bug, or am I doing something wrong here?
Answer 1
Score: 5
The fix is to use `${1}_$2` (or `${1}_${2}` for symmetry).

Per https://golang.org/pkg/regexp/#Regexp.Expand (my emphasis):

> In the template, a variable is denoted by a substring of the form
> $name or ${name}, where name is a non-empty sequence of letters,
> digits, and underscores.
>
> ...
>
> In the $name form, name is taken to
> be as long as possible: $1x is equivalent to ${1x}, not ${1}x, and,
> $10 is equivalent to ${10}, not ${1}0.

So in `$1_$2`, you're actually looking for a group named `1_` and then another group named `2` and putting them together.

As to why using `$1_$2` (or `$foo$2` for that matter) results in "cameCase", that same documentation says:

> A reference to an out of range or unmatched index or a name that is
> not present in the regular expression is replaced with an empty slice.

So replacing with `"$1_$2"` is equivalent to replacing with just `"$2"`.
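
To see the difference side by side, here is a minimal sketch that runs the regex from the question against several replacement templates. The helper name `expand` is hypothetical and exists only for this illustration; the pattern and input come from the question.

```go
package main

import (
	"fmt"
	"regexp"
)

// re is the pattern from the question.
var re = regexp.MustCompile(`(.)([A-Z][a-z]+)`)

// expand is a hypothetical helper: it applies the replacement
// template tmpl to every match of re in s.
func expand(s, tmpl string) string {
	return re.ReplaceAllString(s, tmpl)
}

func main() {
	in := "camelCase"

	// "$1_" is parsed as a reference to a group named "1_", which does
	// not exist, so it expands to the empty string.
	fmt.Println(expand(in, "$1_$2")) // cameCase

	// Same result as dropping the first reference entirely.
	fmt.Println(expand(in, "$2")) // cameCase

	// Braces end the name explicitly, so the underscore is literal.
	fmt.Println(expand(in, "${1}_$2"))   // camel_Case
	fmt.Println(expand(in, "${1}_${2}")) // camel_Case
}
```

With the braces in place, the `strings.Replace` workaround from the question should no longer be needed.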