英文:
OpenPyXL - Formatted date columns set themselves as 'Custom' in output Excel file
问题
我需要将一个数据框插入到一个Excel文件中,其中每个日期列的格式为DD/MM/YYYY
。
我使用dataframe.to_excel
函数将数据框发送到输出文件中,使用正确设置的date_format
和datetime_format
参数的ExcelWriter
。
该格式被正确插入到Excel文件中的datetime
对象,但我的输出文件决定该对象是自定义
并将其放入YYYY-MM-DD
格式。
奇怪的是,当我编辑相应的单元格时,它们的格式正确,当设置为短日期
对象时保持不变。如何使这成为默认行为,以便在打开文件时首次看到它,而无需将列设置为短日期
?
请注意,我需要这个功能适用于列,而不是单元格。我使用这个程序处理大量数据,不希望迭代每个datetime
类型的每个列的每个单元格。
以下是我的代码的非常简化版本:
from pandas import DataFrame, ExcelWriter, to_datetime
from pandas._libs.tslibs.timestamps import Timestamp
def Send_Dataframe_To_Excel(dataframe):
columns_of_type_datetime = list(dataframe.select_dtypes(include=["datetime", "datetime64", "datetime64[ns]", "datetimetz"]).columns)
# 找到所有datetime类型的列
for column in columns_of_type_datetime:
dataframe[column] = to_datetime(dataframe[column], format='%Y-%m-%d %H:%M:%S', errors='coerce').dt.date
# 将它们转换为pandas datetime
with ExcelWriter("output_file.xlsx", engine="openpyxl", date_format='DD/MM/YYYY', datetime_format='DD/MM/YYYY') as writer:
dataframe.to_excel(writer, sheet_name="Sheet1", index=False)
dataframe = DataFrame({'ID': [1, 2, 3, 4, 5, 6, 7],
'text_data': ['text1', 'text2', 'text3', 'text4', 'text5', 'text6', 'text7'],
'number_data': [11, 12, 13, 14, 15, 16, 17,],
'date_data': [Timestamp('2011-01-01 00:20:00'), Timestamp('2012-02-02 00:00:00'), Timestamp('2013-03-03 00:00:00'), Timestamp('2014-04-04 00:00:00'), Timestamp('2015-05-05 00:00:00'), Timestamp('2016-06-06 00:00:00'), Timestamp('2017-07-07 00:00:00')]})
Send_Dataframe_To_Excel(dataframe)
以下是我在输出文件中期望的内容:
以下是实际输出文件的样子(正如您所见,我可以永久地将单元格类型设置为短日期
,从而“修复”问题,但这不是默认行为。此外,“Personnalisé”表示“自定义”,“Date courte”表示“短日期”):
英文:
I need to insert a dataframe into an Excel file, with every date column in format DD/MM/YYYY
.
I use the dataframe.to_excel
function to send the dataframe into the output file, using an ExcelWriter
with correctly set date_format
and datetime_format
parameters.
The format is correctly inserted into the excel file for datetime
objects, but my output file decides the object is Custom
and puts it in YYYY-MM-DD
format.
Oddly enough, when I edit the corresponding cells, they format correctly, and they stay as such when set to be Short Date
objects. How can I make this the default behavior, so that it's the first thing I see when opening the file, without having to set the column as a Short Date
?
Please note that I need this to work with columns, not cells. I handle big chunks of data using this program, and iterating through each cell of each column that is of type datetime
is not desired.
Here is a very simplified version of my code.
from pandas import DataFrame, ExcelWriter, to_datetime
from pandas._libs.tslibs.timestamps import Timestamp
def Send_Dataframe_To_Excel(dataframe):
columns_of_type_datetime = list(dataframe.select_dtypes(include=["datetime", "datetime64", "datetime64[ns]", "datetimetz"]).columns)
#find all columns of type datetime
for column in columns_of_type_datetime :
dataframe[column] = to_datetime(dataframe[column], format='%Y-%m-%d %H:%M:%S', errors='coerce').dt.date
#convert them to pandas datetime
with ExcelWriter("output_file.xlsx", engine="openpyxl", date_format='DD/MM/YYYY', datetime_format='DD/MM/YYYY') as writer:
dataframe.to_excel(writer, sheet_name="Sheet1", index=False)
dataframe = DataFrame({'ID': [1, 2, 3, 4, 5, 6, 7],
'text_data': ['text1', 'text2', 'text3', 'text4', 'text5', 'text6', 'text7'],
'number_data': [11, 12, 13, 14, 15, 16, 17,],
'date_data': [Timestamp('2011-01-01 00:20:00'), Timestamp('2012-02-02 00:00:00'), Timestamp('2013-03-03 00:00:00'), Timestamp('2014-04-04 00:00:00'), Timestamp('2015-05-05 00:00:00'), Timestamp('2016-06-06 00:00:00'), Timestamp('2017-07-07 00:00:00')]})
Send_Dataframe_To_Excel(dataframe)
Here is what I expected in the output file :
Here is what the output file actually ends up looking like (as you can see, I can set the cell type to Short Date
permanently, "fixing" the problem, but it's not the default behavior. Also, "Personnalisé" means "Custom" and "Date courte" means "Short date"):
答案1
得分: 1
这是一个已经公开的问题 (请参见,GH44284)。
暂时,如果可能的话,您需要切换到默认的引擎 (如果已安装的话,它就是 xlxwriter
):
with ExcelWriter(
'output_file.xlsx',
engine='xlsxwriter', # <- 这里
date_format='DD/MM/YYYY',
datetime_format='DD/MM/YYYY') \
as writer:
dataframe.to_excel(writer, sheet_name="Sheet1", index=False)
输出:
英文:
That's an open issue (see, GH44284).
For now, you need to switch (if possible) to the default engine (which is xlxwriter
if installed) :
with ExcelWriter(
'output_file.xlsx',
engine='xlsxwriter', # <- here
date_format='DD/MM/YYYY',
datetime_format='DD/MM/YYYY') \
as writer:
dataframe.to_excel(writer, sheet_name="Sheet1", index=False)
Output :
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论