Process multiple files in a folder and then calculate the average using Python
Question
I have multiple .csv files in a folder, and each file contains a few columns of integer values. I want to take one column (e.g. the first) and compute its sum for each file. Finally, I want the average over all files, i.e. 'total sum of all files / number of files'.
I started with the following:
import os

folder_path = r'folder\path'
total_sum = 0
file_count = 0

for file_name in os.listdir(folder_path):
    if file_name.endswith('.csv'):
        file_path = os.path.join(folder_path, file_name)
        with open(file_path, 'r') as file:
            df = file.readlines()  # not sure
            df.columns = ['A', 'B', 'C']
            df1 = df['A'].sum()  # not sure
            total_sum += df1
            file_count += 1
        print(f'Sum for {file_name}: {df1}')

average = total_sum / file_count
print(f'Average of all accuracy: {average}')
However, I am unable to read these .csv files, so I cannot run the rest of the code. Any hint or help would be appreciated.
Answer 1
Score: 1
You're almost there; you just need to use pandas to simplify dealing with the CSV files. Here's a code snippet to get you started.
import os
import pandas as pd

folder_path = ...  # Enter your absolute/relative folder path here
total_sum = 0
file_count = 0

# Loop through all files in the directory
for file_name in os.listdir(folder_path):
    # NOTE: if some files in the folder are not CSV, you can add an error check.
    # Read the file into a pandas DataFrame
    df = pd.read_csv(os.path.join(folder_path, file_name))
    # NOTE: you should add an error check for column_name in the df
    column_name = ...  # Enter the column you want to sum over
    file_sum = df[column_name].sum()
    total_sum += file_sum
    file_count += 1
    print(f"Sum for {file_name}: {file_sum}")

if file_count == 0:
    average = 0.
else:
    average = total_sum / file_count
print(f"Average of all accuracy: {average}")
Replace the folder_path and column_name placeholders and adapt the snippet to your needs.
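The two NOTE comments in the snippet above could be turned into explicit checks roughly as follows. This is only a sketch: the function name average_column_sum is made up for illustration, and it assumes every CSV file has a header row so that the column can be looked up by name.

```python
import os

import pandas as pd


def average_column_sum(folder_path, column_name):
    """Return (per-file sums, average of those sums) for the CSV files in a folder.

    Hypothetical helper; assumes each CSV has a header row.
    """
    sums = {}
    for file_name in sorted(os.listdir(folder_path)):
        # Skip anything that is not a CSV file
        if not file_name.lower().endswith(".csv"):
            continue
        df = pd.read_csv(os.path.join(folder_path, file_name))
        # Skip files that do not contain the requested column
        if column_name not in df.columns:
            print(f"Skipping {file_name}: no column {column_name!r}")
            continue
        sums[file_name] = df[column_name].sum()
    # Average = total of the per-file sums / number of files that were summed
    average = sum(sums.values()) / len(sums) if sums else 0.0
    return sums, average
```

A call such as sums, average = average_column_sum("data", "A") would then give the per-file sums and their average, where "data" and "A" are placeholder values. Files that are skipped do not count towards the number of files, which may or may not be what you want.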
Answer 2
Score: 0
I would recommend using the pandas library. It can read .csv files more "elegantly" and makes them easier to work with.
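As a rough illustration of what that might look like for the files in the question, here is a sketch that assumes the CSV files have no header row and exactly three columns, which is what the df.columns = ['A', 'B', 'C'] line in the question suggests. The function name average_first_column_sum is hypothetical.

```python
from pathlib import Path

import pandas as pd


def average_first_column_sum(folder_path, names=("A", "B", "C")):
    """Average of the per-file sums of the first column.

    Hypothetical helper; assumes header-less CSV files with len(names) columns.
    """
    file_sums = []
    for csv_path in sorted(Path(folder_path).glob("*.csv")):
        # header=None treats the first row as data; names supplies the column labels
        df = pd.read_csv(csv_path, header=None, names=list(names))
        file_sums.append(df[names[0]].sum())
    # Return 0.0 when no CSV files were found, to avoid dividing by zero
    return sum(file_sums) / len(file_sums) if file_sums else 0.0
```

If the files do have a header row, dropping header=None and names and selecting the column by its real name is probably the simpler choice.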