Why are numbers removed in a regex.ReplaceAllString()
Question
This Go Playground example clearly demonstrates my predicament.
Ultimately, I'm trying to split an unruly string into words. To me, "2015" is a word and so is "$100", but if the input is "One. 2wo, (three)" I want [One 2wo three]. Because Go doesn't allow a Unicode-aware regex, I thought I'd first remove all "junk characters" and then use strings.Fields().
The problem is that all digits are stripped as well:
reg := regexp.MustCompile(`[\[\](){}"?!,-:;,']`)
fmt.Println(reg.ReplaceAllString("one 1 zer0", ""))
// outputs "one zer" when I'd expect "one 1 zer0" :(
Answer 1
Score: 4
[,-:] matches all characters in the range from "," to ":". This range happens to contain all ASCII digits (see ascii(7)). Put the "-" at the end of the character class instead:
reg := regexp.MustCompile(`[\[\](){}"?!,:;,'-]`)