2023-03-07 20:22:48

Python reading text file every second line skipped

Question

I am processing a shell script in Python. My first step is to comb through the file and save only the important lines in a list (of strings). However, I have isolated a problem where every second line is ignored. Why are the second, fourth, etc. lines skipped in the following code?

f = open("sample_failing_file.txt", encoding="ISO-8859-1")
readfile = f.read()
filelines = readfile.split("\n")
def remove_irrelevant_lines(filecontent: list[str]) -> list[str]:
    for line in filecontent:
        if drop_line_appropriate(line):
            filecontent.remove(line)
    return filecontent
def drop_line_appropriate(line: str) -> bool:
    if line.startswith("#"):
        return True
    # some more conditions, omitted here
    return False
filelines = remove_irrelevant_lines(filelines)
f.close()

When I run this code, I can see that filecontent is complete. However, when I look at line, I can see that, for example, "# some line 3" is never read. Here is a simplified version of the shell script on which my Python script fails (sample_failing_file.txt):

#!/bin/sh
#
# some line 1
#
# some line 2
# some line 3

Answer 1

Score: 0

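As was pointed out in the comments, you shouldn't remove elements from a list while iterating over it: the for loop advances by index, so after each removal the next element shifts into the slot that was just visited and is skipped. Additionally, you don't want to use list.remove() here, since it searches the list from the start for the first equal element on every call, which makes the loop much slower than it needs to be.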
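
To make the skipping concrete, here is a minimal sketch that replays the loop from the question on the lines of the sample file. The list literal stands in for the result of reading sample_failing_file.txt, and the visited list is a hypothetical helper added only to record which lines the loop actually sees.

```python
# Lines of the sample shell script from the question, as a plain list.
lines = ["#!/bin/sh", "#", "# some line 1", "#", "# some line 2", "# some line 3"]

visited = []  # hypothetical helper: records every line the loop looks at
for line in lines:              # the loop walks the list by position
    visited.append(line)
    if line.startswith("#"):
        lines.remove(line)      # removes the first equal element; the rest shift left

print(visited)  # ['#!/bin/sh', '# some line 1', '# some line 2']
print(lines)    # ['#', '#', '# some line 3']
```

Only every other line is visited, and the lines that were skipped are the ones still left in the list at the end.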

The following should fix your problem and also run vastly faster:

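def remove_irrelevant_lines(filecontent: list[str]) -> list[str]:
    # Build a new list containing only the lines that should be kept
    return [line for line in filecontent if not drop_line_appropriate(line)]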

This creates and returns a new list, filtering out the lines indicated by drop_line_appropriate.
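
As a quick check, the sketch below runs a filtering version of the function on lines like those in the sample file. The "echo hello" entry is a hypothetical non-comment line added for illustration, and drop_line_appropriate is reduced to the single condition shown in the question.

```python
def drop_line_appropriate(line: str) -> bool:
    # Only the condition shown in the question; the others were omitted there.
    return line.startswith("#")


def remove_irrelevant_lines(filecontent: list[str]) -> list[str]:
    # A list comprehension visits every element exactly once and builds a new list.
    return [line for line in filecontent if not drop_line_appropriate(line)]


# "echo hello" is a made-up line so that something survives the filter.
sample = ["#!/bin/sh", "#", "# some line 1", "echo hello", "# some line 2", "# some line 3"]
kept = remove_irrelevant_lines(sample)

print(kept)         # ['echo hello']
print(len(sample))  # 6, the input list is left untouched
```

Because nothing is removed from the list being iterated, no line is skipped, and each line is checked once rather than searched for again by list.remove().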
