在Python中,使用多个CSV文件的数据将新行附加到现有的Excel表格中。

huangapple go评论163阅读模式
英文:

Attaching new lines to existing excel in python with data from multiple csvs

问题

我尝试从文件夹中的多个CSV文件中读取数据,选择特定列,将文件名作为附加列,最后将数据写入现有的Excel文件。

打印输出实现了我想要的效果,但似乎无法让数据附加到Excel表格中。

  1. import requests
  2. import os
  3. import pandas as pd
  4. import time
  5. import xlwt
  6. import glob
  7. files = glob.glob("*.csv")
  8. writer = pd.ExcelWriter('output.xls', engine='xlsxwriter')
  9. writer.close()
  10. for i in files:
  11. df = pd.read_csv(i, usecols=['column1', 'column2'])
  12. df['Filename Column'] = i.split(".")[0]
  13. df.to_csv(i.split(".")[0] + ".csv")
  14. print(df)
  15. df_combined = pd.concat([df])
  16. reader = pd.read_excel(r'output.xls')
  17. df_combined.to_excel(writer, index=False, startrow=len(reader) + 1)

使用这种方式Excel 仍然为空。

英文:

I'm trying to read out data from multiple csv files in a folder, selecting specific column, adding the file name as an additional column and finally writing this to an existing excel.

The print output does what I'd like to do, but I don't seem to get the part working, where the data are attached to the excel sheet.

  1. import requests
  2. import os
  3. import pandas as pd
  4. import time
  5. import xlwt
  6. import glob
  7. files = glob.glob("*.csv")
  8. writer = pd.ExcelWriter('output.xls', engine='xlsxwriter')
  9. writer.close()
  10. for i in files:
  11. df=pd.read_csv(i, usecols = ['column1', 'column2'])
  12. df['Filename Column'] = i.split(".")[0]
  13. df.to_csv(i.split(".")[0]+".csv")
  14. print(df)
  15. df_combined = pd.concat([df])
  16. reader = pd.read_excel(r'output.xls')
  17. df_combined.to_excel(writer, index=False, startrow=len(reader)+1)

The Excel remains empty, when I do it like that.

答案1

得分: 0

以下是您提供的代码的翻译部分:

  1. import glob
  2. import pandas as pd
  3. from typing import List
  4. if __name__ == "__main__":
  5. final_df: pd.DataFrame = pd.read_excel("dataset.xlsx", engine="openpyxl")
  6. files: List = glob.glob("*.csv")
  7. for file in files:
  8. added_df: pd.DataFrame = pd.read_csv(file, usecols=["column1", "column3"])
  9. added_df["Filename"] = file.split(".")[0]
  10. final_df: pd.DataFrame = pd.concat([final_df, added_df])
  11. final_df.to_excel("dataset.xlsx", index=False)

希望这对您有所帮助。

英文:

This code is what I've come up with.

  1. import glob
  2. import pandas as pd
  3. from typing import List
  4. if __name__ == "__main__":
  5. final_df: pd.DataFrame = pd.read_excel("dataset.xlsx", engine="openpyxl")
  6. files: List = glob.glob("*.csv")
  7. for file in files:
  8. added_df: pd.DataFrame = pd.read_csv(file, usecols=["column1", "column3"])
  9. added_df["Filename"] = file.split(".")[0]
  10. final_df: pd.DataFrame = pd.concat([final_df, added_df])
  11. final_df.to_excel("dataset.xlsx", index=False)

huangapple
  • 本文由 发表于 2023年3月7日 16:39:40
  • 转载请务必保留本文链接:https://go.coder-hub.com/75659645.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定