Searching text in a file, replacing text and writing to new file in subdir, getting a doubling of replacement text when iterating
Question
When I search for text in a single file and write out just that one file, it works as expected: it creates a new file in the "output" subdirectory containing the existing text "This", with the text "AndThat" added on the next line.
However, when I iterate over all the files in a directory, the new text is doubled, and I don't understand why. Here is the code:
import os
import shutil
import pathlib

def replace_text_in_multiple_files(input_path, output_path):
    search_text = "This"
    new_text = "This\nAndThat"
    shutil.rmtree(output_path)
    os.mkdir(output_path)
    for subdir, dirs, files in os.walk(input_path):
        for file in files:
            input_file_path = subdir + os.sep + file
            output_file_path = output_path + os.sep + file
            if input_file_path.endswith(".txt"):
                s = pathlib.Path(input_file_path).read_text()
                s = s.replace(search_text, new_text)
                with open(output_file_path, "w") as f:
                    f.write(s)

def replace_text_in_a_single_files(input_file_path, output_file_path):
    search_text = "This"
    new_text = "This\nAndThat"
    s = pathlib.Path(input_file_path).read_text()
    s = s.replace(search_text, new_text)
    with open(output_file_path, "w") as f:
        f.write(s)

replace_text_in_multiple_files("D:\\Test\\", "D:\\Test\\output\\")
#replace_text_in_a_single_files("D:\\Test\\File1.txt", "D:\\Test\\output\\File1.txt")
In the directory 'D:\Test' I have 3 text files. Each of the text files contains the following text:
This
is
a
test
If I run 'replace_text_in_a_single_files' from the code above, it opens File1.txt, searches for the text, replaces it with the same text plus the value 'AndThat', and writes the result to a new file in the output subdirectory, which gives the following:
This
AndThat
is
a
test
However, when I run replace_text_in_multiple_files, which does the same thing for a set of files instead of just one, each new file gets the replacement text twice:
This
AndThat
AndThat
is
a
test
So it looks as if the replacement code is being executed twice. But why? And why only when iterating?
I expected each file to contain just the following text:
This
AndThat
is
a
test
Answer 1
Score: 0
You're iterating over the input files as well as your own output files:
import os
import shutil
import pathlib

def replace_text_in_multiple_files(input_path, output_path):
    search_text = "This"
    new_text = "This\nAndThat"
    shutil.rmtree(output_path)
    os.mkdir(output_path)
    for subdir, dirs, files in os.walk(input_path):
        for file in files:
            # Show every file os.walk visits, including its directory
            print(subdir, file)
            input_file_path = subdir + os.sep + file
            output_file_path = output_path + os.sep + file
            if input_file_path.endswith(".txt"):
                s = pathlib.Path(input_file_path).read_text()
                s = s.replace(search_text, new_text)
                with open(output_file_path, "w") as f:
                    f.write(s)

def replace_text_in_a_single_files(input_file_path, output_file_path):
    search_text = "This"
    new_text = "This\nAndThat"
    s = pathlib.Path(input_file_path).read_text()
    s = s.replace(search_text, new_text)
    with open(output_file_path, "w") as f:
        f.write(s)

replace_text_in_multiple_files("./Test", "./Test/output/")
This prints:
./Test File3.txt
./Test File2.txt
./Test File1.txt
./Test/output File3.txt
./Test/output File2.txt
./Test/output File1.txt
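Your script writes each output file as soon as it "sees" the corresponding file in the input folder. Because the output folder is a subdirectory of the input folder, os.walk then descends into it, "discovers" the files with the same names there, and iterates over those too. Each of those files already contains "This\nAndThat", so the second replacement turns that into "This\nAndThat\nAndThat" and writes it back to the same output path. The single-file function never reads its own output, which is why the doubling only shows up when iterating.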
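One possible fix, offered as a sketch rather than the only way to do it, is to stop os.walk from descending into the output folder. With the default top-down traversal, os.walk honours in-place changes to the `dirs` list, so removing the output directory from it prunes that branch. The keyword parameters `search_text` and `new_text` are my additions for illustration; they default to the values from the question. Unlike the original, this version also tolerates a missing output folder on the first run.

```python
import os
import shutil
from pathlib import Path


def replace_text_in_multiple_files(input_path, output_path,
                                   search_text="This",
                                   new_text="This\nAndThat"):
    input_dir = Path(input_path).resolve()
    output_dir = Path(output_path).resolve()

    # Start from an empty output folder, as in the question's code.
    shutil.rmtree(output_dir, ignore_errors=True)
    output_dir.mkdir(parents=True)

    for subdir, dirs, files in os.walk(input_dir):
        # Prune the output folder in place so os.walk never visits it.
        dirs[:] = [d for d in dirs
                   if (Path(subdir) / d).resolve() != output_dir]
        for name in files:
            if not name.endswith(".txt"):
                continue
            text = (Path(subdir) / name).read_text()
            (output_dir / name).write_text(text.replace(search_text, new_text))
```

Calling `replace_text_in_multiple_files("./Test", "./Test/output/")` should then visit only the three input files. Placing the output folder outside the input folder altogether would avoid the problem without any pruning.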