"Kotlin正则表达式中的冗余字符转义"

"Redundant character escape" in Kotlin Regular Expression

Question

I'm learning Kotlin and tried some regular expression things.

What I want to do is remove every quoted "comment" from the string, including ones with an escaped quote (\") in the middle.
For example, normal "commen\"t" should become normal.

So I wrote this code:

val comments = fileContents.replace(Regex("\"(.?(\\\")?)*?\""), "")

and I got this warning: Redundant character escape '\\\"' in RegExp.

I tried the IDE quick fix, and it changed '\\\"' to '\"', which is not the result I want.

Also, the same regular expression worked without any warning or error in Python.
Here is the Python code:

re.sub(r'"(.?(\\")?)*?"', "", _f.read())

How can I remove this warning?

Answer 1

Score: 2

tl;dr

Use raw Strings for creating Regex:

""""(.?(\\")?)*?"""".toRegex()

As you've written yourself, you need to escape some characters to actually get the regular expression you're looking for.
Disregarding any characters that need escaping, I assume the pattern you are aiming for is "(.?(\")?)*?".

To get an actual backslash as a literal character in your regular expression, you have to write four backslashes in an ordinary (escaped) string literal, just like in Java.
This is because the backslash is an escape character both in regular Strings and in regular expressions.

The expression "\\" yields a string containing a single backslash. However, to get a literal backslash character in a regular expression, you have to escape it with another backslash character.
That is:
The expression "\\\\" turns into a String consisting of two backslash characters, that is \\.
That two-character String, turned into a Regex, becomes a regular expression that matches a single literal backslash \.

You can see this more clearly by executing the following code:

println("\\\\")            // String with two backslash characters
println("\\\\".toRegex())  // Regex with single backslash literal

println("\\")              // String with a single backslash character
println("\\".toRegex())    // Exception in thread "main" java.util.regex.PatternSyntaxException: Unexpected internal error near index 1 \
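
The same two layers can also be checked without running into the exception. This is a minimal sketch, assuming a JVM target (where an invalid pattern throws PatternSyntaxException); the helper name compiles is made up for illustration and is not part of any library:

```kotlin
import java.util.regex.PatternSyntaxException

// Hypothetical helper: reports whether a string is a valid regex pattern.
fun compiles(pattern: String): Boolean =
    try {
        pattern.toRegex()
        true
    } catch (e: PatternSyntaxException) {
        false
    }

fun main() {
    println("\\\\".length)                    // 2: the String holds two backslash characters
    println("\\\\".toRegex().matches("\\"))   // true: the regex matches one literal backslash
    println(compiles("\\"))                   // false: a lone backslash is an incomplete escape
}
```

The length check covers the String layer, and the matches call covers the regex layer.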

In general, I'd recommend using raw strings whenever you create a regular expression in Kotlin. They're delimited by triple quotes (""") instead of a single double quote (").
In raw string literals, the backslash is not an escape character, so you have to escape neither the backslash (for the String) nor the double quotes. Only the escaping required by the regular expression itself remains.

Instead of "\"(.?(\\\\\")?)*?\"" you can write """"(.?(\\")?)*?"""".
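
To see that these two literals describe the same pattern, and what the three-backslash version from the question produces instead, the pattern text can be compared directly. This is a small sketch; the variable names are only illustrative:

```kotlin
// Escaped literal with five backslashes before the inner quote, as quoted in this answer.
val escaped = "\"(.?(\\\\\")?)*?\""
// Raw literal: only the regex-level escape (\\) remains.
val raw = """"(.?(\\")?)*?""""
// Escaped literal from the question, with three backslashes before the inner quote.
val fromQuestion = "\"(.?(\\\")?)*?\""

fun main() {
    println(escaped == raw)   // true
    println(raw)              // "(.?(\\")?)*?"
    println(fromQuestion)     // "(.?(\")?)*?"
}
```

In the question's version the regex engine receives \" rather than \\", that is, an escaped quote instead of a literal backslash followed by a quote. Since a quote needs no escaping in a regular expression, this is presumably what the IDE inspection reports as redundant.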

Additionally, you can use the extension function fun String.toRegex(): Regex to convert your String to a Regex object instead of calling the Regex constructor, but that's just a matter of preference.

All in all, your code could look like:

val fileContents = """normal "commen\"t" end"""

val regex = """"(.?(\\")?)*?"""".toRegex()
val comments = fileContents.replace(regex, "")

println(comments)   // prints: normal t" end (the lazy match already ends at the escaped quote)
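
If \" has to be treated strictly as part of the quoted text, a commonly used alternative consumes every backslash escape as one unit instead of relying on the lazy quantifier. This is a sketch of that swapped-in technique, not the pattern from the question; quotedText and stripQuoted are hypothetical names:

```kotlin
// A quote, then any run of "backslash + one character" or
// "any character except quote and backslash", then the closing quote.
val quotedText = """"(?:\\.|[^"\\])*"""".toRegex()

// Hypothetical helper: removes every quoted section from the text.
fun stripQuoted(text: String): String = text.replace(quotedText, "")

fun main() {
    val fileContents = """normal "commen\"t" end"""
    println(stripQuoted(fileContents))   // prints: normal  end (with two spaces)
}
```

Because a backslash can only be consumed together with the character after it, an escaped quote can never be taken for the closing quote.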

huangapple
  • Published on 2023-05-06 17:37:54
  • Link to this article: https://go.coder-hub.com/76188171.html