在Python中,使用多个CSV文件的数据将新行附加到现有的Excel表格中。

huangapple go评论132阅读模式
英文:

Attaching new lines to existing excel in python with data from multiple csvs

问题

我尝试从文件夹中的多个CSV文件中读取数据,选择特定列,将文件名作为附加列,最后将数据写入现有的Excel文件。

打印输出实现了我想要的效果,但似乎无法让数据附加到Excel表格中。

import requests
import os
import pandas as pd
import time
import xlwt
import glob

files = glob.glob("*.csv")

writer = pd.ExcelWriter('output.xls', engine='xlsxwriter')
writer.close()

for i in files:

    df = pd.read_csv(i, usecols=['column1', 'column2'])
    df['Filename Column'] = i.split(".")[0]
    df.to_csv(i.split(".")[0] + ".csv")
    print(df)
    df_combined = pd.concat([df])
    reader = pd.read_excel(r'output.xls')
    df_combined.to_excel(writer, index=False, startrow=len(reader) + 1)

使用这种方式Excel 仍然为空。

英文:

I'm trying to read out data from multiple csv files in a folder, selecting specific column, adding the file name as an additional column and finally writing this to an existing excel.

The print output does what I'd like to do, but I don't seem to get the part working, where the data are attached to the excel sheet.

import requests
import os
import pandas as pd
import time
import xlwt
import glob


files = glob.glob("*.csv") 

writer = pd.ExcelWriter('output.xls', engine='xlsxwriter')
writer.close()

for i in files:
    
    df=pd.read_csv(i, usecols = ['column1', 'column2'])
    df['Filename Column'] = i.split(".")[0]
    df.to_csv(i.split(".")[0]+".csv")
    print(df)
    df_combined = pd.concat([df])
    reader = pd.read_excel(r'output.xls')
    df_combined.to_excel(writer, index=False, startrow=len(reader)+1)

The Excel remains empty, when I do it like that.

答案1

得分: 0

以下是您提供的代码的翻译部分:

import glob
import pandas as pd
from typing import List

if __name__ == "__main__":
    final_df: pd.DataFrame = pd.read_excel("dataset.xlsx", engine="openpyxl")

    files: List = glob.glob("*.csv")
    for file in files:
        added_df: pd.DataFrame = pd.read_csv(file, usecols=["column1", "column3"])
        added_df["Filename"] = file.split(".")[0]
        final_df: pd.DataFrame = pd.concat([final_df, added_df])

    final_df.to_excel("dataset.xlsx", index=False)

希望这对您有所帮助。

英文:

This code is what I've come up with.

import glob
import pandas as pd
from typing import List


if __name__ == "__main__":
    final_df: pd.DataFrame = pd.read_excel("dataset.xlsx", engine="openpyxl")

    files: List = glob.glob("*.csv")
    for file in files:
        added_df: pd.DataFrame = pd.read_csv(file, usecols=["column1", "column3"])
        added_df["Filename"] = file.split(".")[0]
        final_df: pd.DataFrame = pd.concat([final_df, added_df])

    final_df.to_excel("dataset.xlsx", index=False)

huangapple
  • 本文由 发表于 2023年3月7日 16:39:40
  • 转载请务必保留本文链接:https://go.coder-hub.com/75659645.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定