Converting xlsx to csv with pattern matching in Python not working
Question
I have several xlsx files in the directory, but I want to search for an xlsx file with the filename 20230406115500.001.A0.XZI.INVOICING_ES101_Anlage_DISCO_Split_20230405_114751.xlsx.
I want to search for the above file using pattern matching, something like *ES101*.xlsx, and then convert this xlsx file into a csv. There is only one file with this filename pattern. I am trying the following Python code:
import pandas as pd
read_file = pd.read_excel('/home/repo/*ES101*.xlsx')
read_file.to_csv('/home/repo/ES101.csv', index=None, header=True)
However, it throws a "No such file or directory" error. I am running this code on a Linux server with Python 3.9.
Answer 1
Score: 1
You can just search with the glob function and then read the file:
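import os
from glob import glob

import pandas as pd

# glob returns every path matching the pattern; only one file matches here, so take the first
file_name = glob('/home/repo/*ES101*.xlsx')[0]
# replace 'sheet_name' with the real sheet name, or omit the argument to read the first sheet
dataframe = pd.read_excel(file_name, sheet_name='sheet_name')
dataframe.to_csv("file_name.csv", index=False)
# to reuse the source file's name, strip only the final extension
# (the name contains several dots, so split('.')[0] would cut it off too early)
dataframe.to_csv(os.path.splitext(file_name)[0] + '.csv', index=False)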
Reading .xlsx files with pandas requires the openpyxl package to be installed; xlsxwriter is only used for writing Excel files.
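As a slightly more defensive variant, the lookup and the output naming can be split into small helpers, so that a missing or ambiguous match gives a clear error instead of an IndexError. This is only a sketch: the function names find_single_match, csv_path_for and convert_to_csv are made up for illustration, and it assumes the first sheet of the workbook is the one to export.

```python
from glob import glob
from pathlib import Path


def find_single_match(pattern: str) -> Path:
    """Return the one path matching a glob pattern, or raise if there is not exactly one."""
    matches = sorted(glob(pattern))
    if not matches:
        raise FileNotFoundError(f"no file matches {pattern!r}")
    if len(matches) > 1:
        raise ValueError(f"expected one match for {pattern!r}, found {len(matches)}")
    return Path(matches[0])


def csv_path_for(xlsx_path: Path) -> Path:
    """Swap only the final extension, keeping the other dots in the file name."""
    return xlsx_path.with_suffix(".csv")


def convert_to_csv(pattern: str) -> Path:
    """Find the workbook matching the pattern and write its first sheet as CSV."""
    import pandas as pd  # imported here so the helpers above work without pandas

    source = find_single_match(pattern)
    target = csv_path_for(source)
    # Assumption: the first sheet is the one wanted (read_excel's default).
    pd.read_excel(source).to_csv(target, index=False)
    return target
```

Calling convert_to_csv('/home/repo/*ES101*.xlsx') would then write the CSV next to the source file, with the same name apart from the extension.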