Ruby string replace without capture group

Question

Ruby's String#sub(pat, txt) family of methods performs substitution and interprets backreferences in txt even when pat is a plain string (at least \0 is supported, which inserts the whole match into the replacement text). I don't know whether this is intended, but it is unsafe if txt is, for example, a variable taken as input from an API. Is there an alternative library function that performs string replacement without this drawback? Alternatively, is there some way to escape txt?

Answer 1

Score: 2

You mean like this:

>> "mytext".sub("text", " \\0 replaced by \\` \\0")
=> "my text replaced by my text"

You can escape backslashes:

>> "mytext".sub("text", " \\\\0 replaced")
=> "my \\0 replaced"

def b_esc(str) = str.gsub("\\", "\\\\\\\\")

>> "mytext".sub("text", b_esc(" \\0 replaced"))
=> "my \\0 replaced"

# how many backslashes?
>> "\\0".chars        #=> ["\\", "0"] # uhh.. yes
>> "\\0".bytes        #=> [92, 48]
#                      one ^
>> b_esc("\\0").bytes #=> [92, 92, 48]
#                      two ^   ^

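To check that doubling the backslashes covers more than \0, here is a minimal sketch. The helper name escape_replacement and the sample inputs are made up for illustration; the helper is meant to have the same effect as b_esc, written with a block so the backslash counting is easier to follow.

```ruby
# Hypothetical helper (same effect as b_esc): double every backslash so
# that sub/gsub treat the replacement string as literal text.
def escape_replacement(str)
  # Block form: the two-backslash string is inserted as-is for each match.
  str.gsub("\\") { "\\\\" }
end

# Sample inputs that would otherwise be read as backreferences.
samples = ["\\0", "\\&", "\\`", "\\'", "\\1", "\\\\0", "plain text"]

results = samples.map do |input|
  "mytext".sub("text", escape_replacement(input))
end

# Each result should be "my" followed by the input, unchanged.
samples.zip(results).each do |input, output|
  puts "#{input.inspect} -> #{output.inspect}"
end
```
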
Or use a block:

>> "mytext".sub("text") { " \\0 replaced" }
=> "my \\0 replaced"

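The block form works the same way with gsub. A minimal sketch, assuming a hypothetical helper name literal_gsub (not a standard library method), that treats both the pattern and the replacement as plain text:

```ruby
# Hypothetical helper: replace every occurrence of the plain string `pat`
# in `str` with `replacement`, taken literally.
def literal_gsub(str, pat, replacement)
  # The block's return value is inserted as-is; sequences such as \0 or \`
  # are not interpreted in block form.
  str.gsub(pat) { replacement }
end

untrusted = "\\0 and \\` stay literal" # e.g. text received from an API
result = literal_gsub("a-b-a", "a", untrusted)
puts result
```
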
Or maybe []=:

>> t = "mytext"
=> "mytext"
>> t["text"] = " \\0 replaced"
=> " \\0 replaced"
>> t
=> "my \\0 replaced"

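Two things to keep in mind with []=: it modifies the receiver in place, and it raises IndexError when the string does not occur. A small sketch (the helper name replace_first is made up for illustration) that works on a copy and returns the text unchanged when there is no match:

```ruby
# Hypothetical helper: replace the first occurrence of the plain string
# `pat` with `replacement`, without touching the original string.
def replace_first(str, pat, replacement)
  copy = str.dup
  copy[pat] = replacement # raises IndexError if pat is not found
  copy
rescue IndexError
  str.dup
end

original = "mytext"
changed = replace_first(original, "text", " \\0 replaced")
unchanged = replace_first(original, "nope", "x")

p changed
p unchanged
p original
```
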
> Note that within the string replacement, a character combination such as $& is treated as ordinary text, and not as a special match variable. However, you may refer to some special match variables using these combinations:
>
> \& and \0 correspond to $&, which contains the complete matched text.
> \' corresponds to $', which contains string after match.
> \` corresponds to $`, which contains string before match.
> \+ corresponds to $+, which contains last capture group.

https://docs.ruby-lang.org/en/3.2/String.html#class-String-label-Substitution+Methods
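
To see several of these combinations in action, here is a small sketch with a made-up input string and pattern. It covers \0, \&, the backtick and apostrophe forms, and numbered groups; \+ is left out. The apostrophe form is written in a double-quoted string because \' inside a single-quoted Ruby literal is just an escaped quote.

```ruby
text = "a key=val z"
pattern = /(\w+)=(\w+)/ # matches "key=val"

whole_zero = text.sub(pattern, '[\0]')  # whole match via \0
whole_amp  = text.sub(pattern, '[\&]')  # whole match via \&
before     = text.sub(pattern, '[\`]')  # text before the match
after      = text.sub(pattern, "[\\']") # text after the match
swapped    = text.sub(pattern, '\2=\1') # numbered capture groups
dollar     = text.sub(pattern, '[$&]')  # $& is ordinary text here

p whole_zero, whole_amp, before, after, swapped, dollar
```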

Answer 2

Score: 0

I ended up going with index + insert in my case, since the word always ended with the pattern. If that weren't the case, []= would have been the solution I was looking for.

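def prepend(txt, pat, word)
  # Find where the pattern starts; nil means it does not occur.
  index = txt.index(pat)
  return if index.nil?

  # Insert word just before the pattern, modifying txt in place.
  # String#insert takes word literally, so backslash sequences are not interpreted.
  txt.insert(index, word)
  true
end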
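
A short usage sketch for the method above, with made-up sample values. The definition is repeated so the snippet runs on its own. Note that it mutates the string it is given and returns true or nil, not the new string.

```ruby
# Repeated from the answer so that this snippet is self-contained.
def prepend(txt, pat, word)
  index = txt.index(pat)
  return if index.nil?

  txt.insert(index, word)
  true
end

txt = +"mytext"                       # unary plus gives a mutable string
found = prepend(txt, "text", "\\0 ")  # word contains a literal backslash and 0
missing = prepend(txt, "nope", "x")   # pattern absent: returns nil, txt untouched

p found
p missing
p txt
```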

huangapple
  • Posted on May 28, 2023, 01:36:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76348209.html