Python reading text file every second line skipped
Question
f = open("sample_failing_file.txt", encoding="ISO-8859-1")
readfile = f.read()
filelines = readfile.split("\n")

def remove_irrelevant_lines(filecontent: list[str]) -> list[str]:
    filecontent_copy = filecontent.copy()
    for line in filecontent_copy:
        if drop_line_appropriate(line):
            filecontent.remove(line)
    return filecontent

def drop_line_appropriate(line: str) -> bool:
    if line.startswith("#"):
        return True
    # some more conditions, omitted here
    return False

filelines = remove_irrelevant_lines(filelines)
f.close()
I am processing a shell script in Python. My first step is to comb through the file and save only the important lines in a list (of strings). However, I have isolated a problem where every second line is ignored. Why is the second, fourth, etc. line skipped in the following code?
f = open("sample_failing_file.txt", encoding="ISO-8859-1")
readfile = f.read()
filelines = readfile.split("\n")

def remove_irrelevant_lines(filecontent: list[str]) -> list[str]:
    for line in filecontent:
        if drop_line_appropriate(line):
            filecontent.remove(line)
    return filecontent

def drop_line_appropriate(line: str) -> bool:
    if line.startswith("#"):
        return True
    # some more conditions, omitted here
    return False

filelines = remove_irrelevant_lines(filelines)
f.close()
When I run this code, I can see that filecontent is complete. However, when I inspect line inside the loop, I can see that, for example, "# some line 3" is never read. Here is a simplified version of the shell script on which my Python script fails (sample_failing_file.txt):
#!/bin/sh
#
# some line 1
#
# some line 2
# some line 3
Answer 1
Score: 0
As was pointed out in the comments, you shouldn't remove elements from a list while iterating over it: the loop advances by index, so after each removal the next element shifts into the vacated position and is skipped. Additionally, you don't want to use list.remove() to delete lines, since it searches the list for the first matching element on every call, which makes the whole pass much slower than it needs to be.
The following should fix your problem and also run much faster:
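def remove_irrelevant_lines(filecontent: list[str]) -> list[str]:
    # Build a new list instead of mutating the one being iterated over.
    return [line for line in filecontent if not drop_line_appropriate(line)]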
This creates and returns a new list, filtering out the lines for which drop_line_appropriate returns True.
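To make the skipping visible, here is a minimal sketch that runs both approaches on the sample lines from the question. The function names remove_while_iterating and remove_with_comprehension, and the extra "echo done" line, are made up for this illustration and are not part of the original script; drop_line_appropriate is reduced to the single condition shown in the question.

```python
def drop_line_appropriate(line: str) -> bool:
    # Only the condition shown in the question; the omitted ones are left out.
    return line.startswith("#")


def remove_while_iterating(lines: list[str]) -> list[str]:
    # Pattern from the question: the list shrinks while the loop's
    # internal index keeps advancing, so the element after each
    # removed one is never visited.
    for line in lines:
        if drop_line_appropriate(line):
            lines.remove(line)
    return lines


def remove_with_comprehension(lines: list[str]) -> list[str]:
    # Builds a new list; the input is never mutated during iteration.
    return [line for line in lines if not drop_line_appropriate(line)]


# Lines of sample_failing_file.txt, plus one hypothetical non-comment line.
sample = [
    "#!/bin/sh",
    "#",
    "# some line 1",
    "#",
    "# some line 2",
    "# some line 3",
    "echo done",
]

buggy = remove_while_iterating(sample.copy())
fixed = remove_with_comprehension(sample)

print(buggy)  # ['#', '#', '# some line 3', 'echo done']
print(fixed)  # ['echo done']
```

In the buggy run, the second, fourth and sixth lines survive because each removal moves them into the position the loop has already passed.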