Replace a backslash followed by a double quote in a text file in Python

Question

I have a text file, and its content is like this:

"good to know it \" so nice \" "

I use Python to read its contents and want to replace every \" (a backslash followed by a double quote) with an empty string.

The code I am using is:

import re

file_path = "backslash_double_quotation.txt"
with open(file_path, "r") as input_file:
    raw_text = input_file.read()
processed_text = re.sub(r'\"', "", raw_text)
print(raw_text)
print(processed_text)

and I expect processed_text like this:

"good to know it  so nice  "

However, the actual output is:

good to know it \ so nice \

All the double quotes are removed, including the enclosing ones, while the backslashes are left behind.
How can I fix this?

Answer 1

Score: 1

With strings you can use .replace() to replace specific characters or words in a string.

For example:

text = r'good to know it \" so nice \"'  # raw string, so the backslashes are kept
print(text.replace('\\"', " "))

The output for this is:

good to know it   so nice  

With your code:

import re
file_path = "backslash_double_quotation.txt"
with open(file_path, "r") as input_file:
    raw_text = input_file.read()
# '\\"' is a backslash followed by a double quote
processed_text = raw_text.replace('\\"', "")
print(raw_text)
print(processed_text)

If you want to use re then:

processed_text = re.sub(r"\\\"", "", raw_text)
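
As a quick check of the two approaches in this answer, here is a minimal sketch that applies both to the sample line from the question. The names sample, via_replace, and via_regex are illustrative only, and the text is hard-coded here instead of being read from the file.

```python
import re

# The line from the question, written as a raw string so the backslashes survive.
sample = r'"good to know it \" so nice \" "'

# str.replace works on literal text: '\\"' is a backslash followed by a quote.
via_replace = sample.replace('\\"', "")

# For the regex engine the backslash must itself be escaped: r'\\"' matches \ then ".
via_regex = re.sub(r'\\"', "", sample)

print(via_replace)
print(via_regex)
```

Both calls should leave the enclosing quotes in place and remove only the backslash-quote pairs.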

Answer 2

Score: 1

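You don't get the expected result because of how the pattern is escaped. In the raw string r'\"' the backslash reaches the regex engine, which reads \" as an escaped double quote, so the pattern matches only the quote and never the backslash in the file. Dropping the "r" does not help on its own, because '\"' is then just a plain double quote.

To match a literal backslash followed by a quote, escape the backslash itself inside the raw string:

processed_text = re.sub(r'\\"', "", raw_text)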

Reference:

Raw String Notation
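
To see what the regex engine actually receives with and without the r prefix, a small check like the following may help. The names plain, raw, and sample are made up for this illustration.

```python
import re

plain = '\"'   # Python resolves the escape: one character, "
raw = r'\"'    # the raw string keeps both characters: \ and "

print(len(plain), len(raw))

sample = r'say \"hi\"'

# Both patterns match only the quote, so the backslashes stay behind.
only_quotes_plain = re.sub(plain, "", sample)
only_quotes_raw = re.sub(raw, "", sample)

# Escaping the backslash for the regex engine matches the two-character sequence.
both_removed = re.sub(r'\\"', "", sample)

print(only_quotes_plain)
print(only_quotes_raw)
print(both_removed)
```

In this sketch the first two substitutions behave identically, which is why switching between raw and non-raw notation alone does not change the result for this pattern.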

Answer 3

Score: 0

Eliminate them one at a time:

# removes every double quote, including the enclosing ones
processed_text = raw_text.replace('"', '')
# removes every backslash
processed_text = processed_text.replace('\\', '')

Answer 4

Score: 0

It's hard to imagine that an escaped double quote \" means anything other than a literal quote included in a string delimited by double quotes. By the same logic, an escaped escape \\ has to stand for a literal backslash, so that a backslash inside the string can be told apart from one that escapes the following double quote (if any).

This seems to be an unambiguous way to tell the difference:

https://regex101.com/r/FH2Dfp/1

Find (as a raw string, i.e. wrapped in r'...'):

(?<!\\)((?:\\\\)*)\\"

Replace with:

\1
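
The find/replace pair above can be tried in Python roughly as follows. This is a minimal sketch: the names ESCAPED_QUOTE and strip_escaped_quotes are hypothetical, and only the pattern and the replacement come from this answer.

```python
import re

# Pattern from this answer: an optional run of escaped backslashes (captured as
# group 1), followed by an escaped quote, not preceded by a stray backslash.
ESCAPED_QUOTE = re.compile(r'(?<!\\)((?:\\\\)*)\\"')


def strip_escaped_quotes(text):
    """Remove each backslash-quote pair, keeping escaped backslashes before it."""
    return ESCAPED_QUOTE.sub(r'\1', text)


print(strip_escaped_quotes(r'"good to know it \" so nice \" "'))
print(strip_escaped_quotes(r'a \\" b'))    # escaped backslash, then a plain quote
print(strip_escaped_quotes(r'a \\\" b'))   # escaped backslash, then an escaped quote
```

The design choice here is that a quote preceded by an even number of backslashes is treated as unescaped and left alone, while the simpler pattern r'\\"' would remove the last backslash and the quote regardless of what comes before.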

Answer 5

Score: 0

I found that this works:

processed_text = re.sub(r'\\"', "", raw_text)
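
For an end-to-end check, here is a minimal sketch that writes the content from the question to a temporary file, reads it back, and applies the pattern above. The helper name remove_escaped_quotes and the UTF-8 encoding are assumptions made for this example; the file name is the one used in the question.

```python
import re
import tempfile
from pathlib import Path


def remove_escaped_quotes(path):
    # Read the whole file, then drop every backslash-quote pair.
    raw_text = Path(path).read_text(encoding="utf-8")
    return re.sub(r'\\"', "", raw_text)


# Demo on a throwaway file holding the content from the question.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "backslash_double_quotation.txt"
    demo_file.write_text(r'"good to know it \" so nice \" "', encoding="utf-8")
    result = remove_escaped_quotes(demo_file)

print(result)
```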

huangapple
  • Posted on February 24, 2023, 02:05:09
  • Please keep this link when reposting: https://go.coder-hub.com/75548664.html