Converting xlsx to csv with pattern matching in Python not working

Question

I have several xlsx files in the directory, but I want to search for an xlsx file with the filename 20230406115500.001.A0.XZI.INVOICING_ES101_Anlage_DISCO_Split_20230405_114751.xlsx.

I want to search for the above file using pattern matching, something like *ES101*.xlsx, and then convert this xlsx file into a csv. There is only one file with this filename pattern. I am trying the following Python code:

import pandas as pd

read_file = pd.read_excel('/home/repo/*ES101*.xlsx')
read_file.to_csv('/home/repo/ES101.csv', index=None, header=True)

However, it fails with a "No such file or directory" error. I am running this code on a Linux server with Python 3.9.

Answer 1

Score: 1

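You can search for the file with the glob function and then read it:

import pandas as pd
from glob import glob
from pathlib import Path

# glob returns every path matching the pattern; here I take the first (and only) match
file_name = glob('/home/repo/*ES101*.xlsx')[0]

# replace 'sheet_name' with the actual sheet name, or omit the argument to read the first sheet
dataframe = pd.read_excel(file_name, sheet_name='sheet_name')
dataframe.to_csv("file_name.csv", index=False)

# to reuse the original file name, swap only the final .xlsx suffix for .csv
# (file_name.split('.')[0] would cut this name at its first dot)
dataframe.to_csv(Path(file_name).with_suffix('.csv'), index=False)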
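The file name in the question contains several dots, so it may help to compare two ways of deriving the CSV name from it. The snippet below is only an illustrative sketch: the variable names first_dot_name and last_suffix_name are made up for this example, and PurePosixPath is used so that the path is handled the same way on any platform.

```python
from pathlib import PurePosixPath

# File name taken from the question; note that it contains several dots.
file_name = (
    "/home/repo/20230406115500.001.A0.XZI."
    "INVOICING_ES101_Anlage_DISCO_Split_20230405_114751.xlsx"
)

# Splitting on every dot keeps only the text before the first one.
first_dot_name = file_name.split(".")[0] + ".csv"

# with_suffix replaces only the final extension (.xlsx here).
last_suffix_name = str(PurePosixPath(file_name).with_suffix(".csv"))

print(first_dot_name)
print(last_suffix_name)
```

The first result drops everything after the timestamp, while the second keeps the full name and changes only the extension.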

You need to have the openpyxl engine installed, since pandas uses it to read .xlsx files (xlsxwriter can only write them).
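For context, pd.read_excel opens the path it is given literally and does not expand wildcards, which is why the pattern has to be resolved first. Taking index [0] of the glob result raises an IndexError when nothing matches and silently picks one file when several do. A more defensive variant could look like the sketch below; find_single_match and convert_to_csv are hypothetical helper names introduced here, not part of pandas or of the original answer, and the conversion assumes the first sheet is the one wanted.

```python
from pathlib import Path


def find_single_match(directory, pattern):
    """Return the one path in `directory` matching `pattern`, or raise."""
    matches = sorted(Path(directory).glob(pattern))
    if not matches:
        raise FileNotFoundError(f"no file matching {pattern!r} in {directory}")
    if len(matches) > 1:
        raise ValueError(f"expected one match for {pattern!r}, found {len(matches)}")
    return matches[0]


def convert_to_csv(xlsx_path):
    """Write the first sheet of `xlsx_path` to a .csv file next to it."""
    import pandas as pd  # imported here so the lookup helper works without pandas

    csv_path = Path(xlsx_path).with_suffix(".csv")
    # read_excel reads the first sheet when no sheet_name is given
    pd.read_excel(xlsx_path).to_csv(csv_path, index=False)
    return csv_path
```

With these helpers, convert_to_csv(find_single_match('/home/repo', '*ES101*.xlsx')) would write the CSV next to the source file and return its path, or fail with an explicit message if the pattern matches zero or several files.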

Posted by huangapple on May 26, 2023 at 16:42:49
When reposting, please keep the link to this article: https://go.coder-hub.com/76339124.html