Execute multiple files from a folder and later calculate the average using Python

Question

I have multiple .csv files in a folder, each containing a few columns of integer values. I want to take one column (e.g. the first) and compute the sum of that column for each file. At the end, I want the average over all files, i.e. 'total sum of all files / number of files'.
I started as follows:

import os

folder_path = r'folder\path'
total_sum = 0
file_count = 0
for file_name in os.listdir(folder_path):
    if file_name.endswith('.csv'):
        file_path = os.path.join(folder_path, file_name)
        with open(file_path, 'r') as file:
            df=file.readlines()   #not sure
            df.columns=['A', 'B', 'C']
            df1 = df['A'].sum()   #not sure
            total_sum += df1
            file_count += 1
            print(f'Sum for {file_name}: {df1}')

average = total_sum / file_count
print(f'Average of all accuracy: {average}')

However, I am unable to read these .csv files, so I cannot run the rest of the code. Any hint or help would be appreciated.

Answer 1

Score: 1

You're almost there, but you just need to use pandas to simplify dealing with CSV files. Here's a code snippet to get you started.

import os
import pandas as pd

folder_path = ... # Enter your absolute/relative folder path here
total_sum = 0
file_count = 0

# Loop through all files in the directory
for file_name in os.listdir(folder_path):
    # Skip anything that is not a CSV file, as the question's own loop does;
    # otherwise pd.read_csv may fail or misparse the file.
    if not file_name.endswith('.csv'):
        continue

    # Read the file into a pandas DataFrame
    # (by default the first line is treated as the header row)
    df = pd.read_csv(os.path.join(folder_path, file_name))

    # NOTE: you should add an error check for column_name in the df
    column_name = ... # Enter the column you want to sum over
    file_sum = df[column_name].sum()

    total_sum += file_sum
    file_count += 1

    print(f"Sum for {file_name}: {file_sum}")

if file_count == 0:
    average = 0.
else:
    average = total_sum / file_count
print(f"Average of all accuracy: {average}")
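
One thing to watch: the question assigns the names 'A', 'B', 'C' itself, which suggests the files may have no header row. If that is the case (an assumption, since the files are not shown), a variant along these lines might help. The function name `average_column_sum` and the sample files `a.csv` and `b.csv` are made up for illustration; `header=None` and `names=` are regular `pd.read_csv` parameters.

```python
import os
import tempfile

import pandas as pd


def average_column_sum(folder_path, column_names, target_column):
    """Return (per-file sums, average of those sums) for one column.

    Assumes each CSV has no header row and as many columns as
    column_names; both are assumptions, not facts from the question.
    """
    sums = {}
    for file_name in sorted(os.listdir(folder_path)):
        if not file_name.endswith(".csv"):
            continue  # skip non-CSV files
        df = pd.read_csv(
            os.path.join(folder_path, file_name),
            header=None,         # the first line is data, not column names
            names=column_names,  # assign the column names ourselves
        )
        # The question says the values are integers, so cast to a plain int.
        sums[file_name] = int(df[target_column].sum())
    # Guard against an empty folder, as the snippet above does.
    average = sum(sums.values()) / len(sums) if sums else 0.0
    return sums, average


if __name__ == "__main__":
    # Build two tiny sample files in a temporary folder for demonstration.
    with tempfile.TemporaryDirectory() as folder:
        with open(os.path.join(folder, "a.csv"), "w") as f:
            f.write("1,10,100\n2,20,200\n")
        with open(os.path.join(folder, "b.csv"), "w") as f:
            f.write("3,30,300\n4,40,400\n")
        sums, average = average_column_sum(folder, ["A", "B", "C"], "A")
        print(sums)     # per-file sums of column A
        print(average)  # total sum divided by number of files
```

If your files do have a header row, drop `header=None` and `names=` and pass the real column name instead.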

Answer 2

Score: 0

I would recommend using the pandas library. It can read .csv files more "elegantly" and makes them easier to work with.
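
For comparison, here is a rough sketch of the same computation using only the standard library `csv` module, to show what pandas saves you from writing by hand. It assumes the files have no header row and that the chosen column holds integers; the function name `column_sums_stdlib` and the sample files are hypothetical.

```python
import csv
import os
import tempfile


def column_sums_stdlib(folder_path, column_index=0):
    """Sum one column (selected by position) in every .csv file of a folder."""
    sums = {}
    for file_name in sorted(os.listdir(folder_path)):
        if not file_name.endswith(".csv"):
            continue  # skip non-CSV files
        with open(os.path.join(folder_path, file_name), newline="") as f:
            # Assumes no header row; blank lines are skipped via "if row".
            sums[file_name] = sum(
                int(row[column_index]) for row in csv.reader(f) if row
            )
    return sums


if __name__ == "__main__":
    # Build two tiny sample files in a temporary folder for demonstration.
    with tempfile.TemporaryDirectory() as folder:
        with open(os.path.join(folder, "a.csv"), "w", newline="") as f:
            f.write("1,10,100\n2,20,200\n")
        with open(os.path.join(folder, "b.csv"), "w", newline="") as f:
            f.write("3,30,300\n4,40,400\n")
        sums = column_sums_stdlib(folder, column_index=0)
        # Average = total sum of all files / number of files
        average = sum(sums.values()) / len(sums) if sums else 0.0
        print(sums)
        print(average)
```

With pandas, the parsing, type conversion, and column selection in the inner loop collapse into `pd.read_csv(...)` followed by `df[column].sum()`.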

huangapple
  • Posted on June 6, 2023, 06:20:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76410330.html