Looping through n number of CSV files and deleting columns in python

I have a program that I need to add a functionality to, which is to strip out the second column of each Event CSV file which it processes. I've tried the solutions at this thread, but I've been unsuccessful in employing any of them.

My CSV files look like this

Time/Date,Event #,Event Desc
05/19/2020 20:12:30,29,Advance Drive ON
05/19/2020 20:32:23,29,Advance Drive ON
05/19/2020 20:35:13,29,Advance Drive ON
05/19/2020 20:39:50,37,Discharge 1 Plug Chute Fault
05/19/2020 20:47:40,68,LMI is in OFF Mode

And here is my function:

# A function to clean the Event Files of raw data
def CleanEventFiles(EF_files, eventHeader, EFmachineID):
    logging.debug(f'Cleaning Event files...')                       # Write to program logger
    for f in EF_files:                                              # FOR ALL FILES IN EVENT FILES
        IsFileReadOnly(f)                                           # check to see if the file is READ ONLY
        print(f'\nCleaning file: {f}')                              # tell user which file is being cleaned
        print('\tReplacing new MachineIDs & File Headers...')       # print stuff to the user
        logging.debug(f'\tReplacing headers for file {f}')          # write to program logger
        with open(f, newline='', encoding='latin-1') as g:          # open file as read
            r = csv.reader((line.replace('\0', '') for line in g))  # declare read variable while removing NULLs
            next(r)                                                 # remove old machineID
            data = [row for row in r]                               # set list to all data in file
            data[0] = eventHeader                                   # replace first line with new header
            data.insert(0, EFmachineID)                             # add line before header for machine ID
            WriteData(f, data)                                      # write data to the file

I know it's got to be something as simple as putting del r[1] into a loop somewhere, but for the life of me I can't seem to figure it out. The best I seem to get is to remove the Event # header on each file, but the data in data[1] remains after the file processes.

What would be the best way to go about removing the second column of data from these files?
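For context, a standalone version of what I'm picturing (a rough sketch with a placeholder function name, not code from my actual program) would be something like:

```python
import csv

def drop_second_column(filename):
    # Read every row, stripping NULL bytes the same way my function does.
    with open(filename, newline="", encoding="latin-1") as g:
        rows = list(csv.reader(line.replace("\0", "") for line in g))
    for row in rows:
        if len(row) > 1:
            del row[1]          # remove the second field from every row, header included
    # Write the trimmed rows back over the original file.
    with open(filename, "w", newline="", encoding="latin-1") as g:
        csv.writer(g).writerows(rows)
```

but I haven't been able to make the equivalent work inside my function.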

Answer 1

Score: 0


If you can read all the rows into a list via csv.DictReader, then this is a fairly straightforward solution. Note that this answer processes one file at a time, so you will want to call it for each file. Note as well that it is currently destructive, in that the original file is overwritten.

import csv

def clean_event_file(filename, column_to_remove):

    ##--------------------
    ## read in all the rows at once.
    ## note that this will also get us the headers.
    ##--------------------
    with open(filename, "r") as file_in:
        rows = list(csv.DictReader(file_in))
    headers = [col for col in rows[0].keys() if col != column_to_remove]
    ##--------------------

    ##--------------------
    ## Write out the results again absent the given header
    ##--------------------
    with open(filename, "w", newline="") as file_out:
        writer = csv.DictWriter(file_out, fieldnames=headers, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    ##--------------------

clean_event_file("in.csv", "Event #")
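Since the question involves n files, the per-file helper would be called once per file. A minimal sketch of that driver loop (the `event_files/*.csv` glob pattern is an assumption for illustration, not a path from the question):

```python
import csv
import glob

def clean_event_file(filename, column_to_remove):
    # Read every row; DictReader keys each field by the header line.
    with open(filename, "r", newline="") as file_in:
        rows = list(csv.DictReader(file_in))
    headers = [col for col in rows[0].keys() if col != column_to_remove]
    # Rewrite the file with the chosen column dropped.
    with open(filename, "w", newline="") as file_out:
        writer = csv.DictWriter(file_out, fieldnames=headers, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)

# Hypothetical driver loop: apply the cleanup to every CSV in a folder.
for path in glob.glob("event_files/*.csv"):
    clean_event_file(path, "Event #")
```

The `extrasaction="ignore"` argument is what lets DictWriter silently skip the removed column when writing each row back out.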

huangapple
  • Posted on 2023-02-24 00:37:46
  • Please retain this link when republishing: https://go.coder-hub.com/75547739.html