OpenPyXL – 格式化的日期列在输出的Excel文件中被设置为’自定义’

huangapple go评论70阅读模式
英文:

OpenPyXL - Formatted date columns set themselves as 'Custom' in output Excel file

问题

我需要将一个数据框插入到一个Excel文件中,其中每个日期列的格式为DD/MM/YYYY

我使用dataframe.to_excel函数将数据框发送到输出文件中,使用正确设置的date_formatdatetime_format参数的ExcelWriter

该格式被正确插入到Excel文件中的datetime对象,但我的输出文件决定该对象是自定义并将其放入YYYY-MM-DD格式。

奇怪的是,当我编辑相应的单元格时,它们的格式正确,当设置为短日期对象时保持不变。如何使这成为默认行为,以便在打开文件时首次看到它,而无需将列设置为短日期

请注意,我需要这个功能适用于列,而不是单元格。我使用这个程序处理大量数据,不希望迭代每个datetime类型的每个列的每个单元格。

以下是我的代码的非常简化版本:

from pandas import DataFrame, ExcelWriter, to_datetime
from pandas._libs.tslibs.timestamps import Timestamp

def Send_Dataframe_To_Excel(dataframe):
    columns_of_type_datetime = list(dataframe.select_dtypes(include=["datetime", "datetime64", "datetime64[ns]", "datetimetz"]).columns)
    # 找到所有datetime类型的列
    
    for column in columns_of_type_datetime:
        dataframe[column] = to_datetime(dataframe[column], format='%Y-%m-%d %H:%M:%S', errors='coerce').dt.date
        # 将它们转换为pandas datetime
    
    with ExcelWriter("output_file.xlsx", engine="openpyxl", date_format='DD/MM/YYYY', datetime_format='DD/MM/YYYY') as writer:
        dataframe.to_excel(writer, sheet_name="Sheet1", index=False)

dataframe = DataFrame({'ID': [1, 2, 3, 4, 5, 6, 7],
                       'text_data': ['text1', 'text2', 'text3', 'text4', 'text5', 'text6', 'text7'],
                       'number_data': [11, 12, 13, 14, 15, 16, 17,],
                       'date_data': [Timestamp('2011-01-01 00:20:00'), Timestamp('2012-02-02 00:00:00'), Timestamp('2013-03-03 00:00:00'), Timestamp('2014-04-04 00:00:00'), Timestamp('2015-05-05 00:00:00'), Timestamp('2016-06-06 00:00:00'), Timestamp('2017-07-07 00:00:00')]})

Send_Dataframe_To_Excel(dataframe)

以下是我在输出文件中期望的内容:

OpenPyXL – 格式化的日期列在输出的Excel文件中被设置为’自定义’

以下是实际输出文件的样子(正如您所见,我可以永久地将单元格类型设置为短日期,从而“修复”问题,但这不是默认行为。此外,“Personnalisé”表示“自定义”,“Date courte”表示“短日期”):

OpenPyXL – 格式化的日期列在输出的Excel文件中被设置为’自定义’

英文:

I need to insert a dataframe into an Excel file, with every date column in format DD/MM/YYYY.

I use the dataframe.to_excel function to send the dataframe into the output file, using an ExcelWriter with correctly set date_format and datetime_format parameters.

The format is correctly inserted into the excel file for datetime objects, but my output file decides the object is Custom and puts it in YYYY-MM-DD format.

Oddly enough, when I edit the corresponding cells, they format correctly, and they stay as such when set to be Short Date objects. How can I make this the default behavior, so that it's the first thing I see when opening the file, without having to set the column as a Short Date ?

Please note that I need this to work with columns, not cells. I handle big chunks of data using this program, and iterating through each cell of each column that is of type datetime is not desired.

Here is a very simplified version of my code.

from pandas import DataFrame, ExcelWriter, to_datetime
from pandas._libs.tslibs.timestamps import Timestamp

def Send_Dataframe_To_Excel(dataframe):
		columns_of_type_datetime = list(dataframe.select_dtypes(include=["datetime", "datetime64", "datetime64[ns]", "datetimetz"]).columns)
			#find all columns of type datetime
			
		for column in columns_of_type_datetime :
			dataframe[column] = to_datetime(dataframe[column], format='%Y-%m-%d %H:%M:%S', errors='coerce').dt.date
				#convert them to pandas datetime
				
		with ExcelWriter("output_file.xlsx", engine="openpyxl", date_format='DD/MM/YYYY', datetime_format='DD/MM/YYYY') as writer:
			dataframe.to_excel(writer, sheet_name="Sheet1", index=False)

dataframe = DataFrame({'ID': [1, 2, 3, 4, 5, 6, 7],
			'text_data': ['text1', 'text2', 'text3', 'text4', 'text5', 'text6', 'text7'],
			'number_data': [11, 12, 13, 14, 15, 16, 17,],
			'date_data': [Timestamp('2011-01-01 00:20:00'), Timestamp('2012-02-02 00:00:00'), Timestamp('2013-03-03 00:00:00'), Timestamp('2014-04-04 00:00:00'), Timestamp('2015-05-05 00:00:00'), Timestamp('2016-06-06 00:00:00'), Timestamp('2017-07-07 00:00:00')]})
			
			
Send_Dataframe_To_Excel(dataframe)

Here is what I expected in the output file :

OpenPyXL – 格式化的日期列在输出的Excel文件中被设置为’自定义’

Here is what the output file actually ends up looking like (as you can see, I can set the cell type to Short Date permanently, "fixing" the problem, but it's not the default behavior. Also, "Personnalisé" means "Custom" and "Date courte" means "Short date"):

OpenPyXL – 格式化的日期列在输出的Excel文件中被设置为’自定义’

答案1

得分: 1

这是一个已经公开的问题 (请参见,GH44284)。

暂时,如果可能的话,您需要切换到默认的引擎 (如果已安装的话,它就是 xlxwriter):

with ExcelWriter(
    'output_file.xlsx',
    engine='xlsxwriter', # <- 这里
    date_format='DD/MM/YYYY',
    datetime_format='DD/MM/YYYY') \
as writer:
    dataframe.to_excel(writer, sheet_name="Sheet1", index=False)

输出:

OpenPyXL – 格式化的日期列在输出的Excel文件中被设置为’自定义’

英文:

That's an open issue (see, GH44284).

For now, you need to switch (if possible) to the default engine (which is xlxwriter if installed) :

with ExcelWriter(
    &#39;output_file.xlsx&#39;,
    engine=&#39;xlsxwriter&#39;, # &lt;- here
    date_format=&#39;DD/MM/YYYY&#39;,
    datetime_format=&#39;DD/MM/YYYY&#39;) \
as writer:
    dataframe.to_excel(writer, sheet_name=&quot;Sheet1&quot;, index=False)

Output :

OpenPyXL – 格式化的日期列在输出的Excel文件中被设置为’自定义’

huangapple
  • 本文由 发表于 2023年6月19日 17:58:29
  • 转载请务必保留本文链接:https://go.coder-hub.com/76505534.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定