如何处理我的 .xlxs 文件中的以下混乱日期。

huangapple go评论68阅读模式
英文:

How can I go about fixing the following messy dates in my .xlxs file

问题

我有以下的 .xlxs 文件,我已经将它上传到了谷歌云盘 这里,其中包含了 10 个不同的日期,我正在尝试使用 pandas.to_datetime 来清理混乱的日期。然而,我意识到这个函数对于第 2 行(1414/01/2019)和第 3 行(110/05/2019)的日期有一些限制。

显然,我想要修复第 2 行和第 3 行的日期分别为 (14/01/2019) 和 (11/05/2019)。但是如果这不能实现,是否有一种方法可以跳过具有无法使用 pandas.to_datetime 修复的日期格式的行?

import pandas as pd

test = pd.read_excel('Messy_Dates.xlsx', usecols=[0], header=None, skiprows=[0])
英文:

I have the following .xlxs file, I have uploaded it on google drive here, with 10 different dates and I was experimenting with pandas.to_datetime to clean the messy dates that I have. However I realised that the function has some limits specifically for the dates I have in row 2 (1414/01/2019) and 3 (110/05/2019).
Obviously I would like to fix the dates in row 2 and 3 to (14/01/2019) and (11/05/2019) respectively. However if this cannot be done is there a way on how I can skip over rows that have a date format that cannot be fixed with pandas.to_datetime?

import pandas as pd

test=pd.read_excel('Messy_Dates.xlsx', usecols=[0], header=None, skiprows = [0])

答案1

得分: 1

使用to_datetime,参数设置为dayfirst=Trueerrors='coerce',以便通过Series.dropna删除无法解析的日期时间:

test=pd.read_excel('Messy_Dates.xlsx', usecols=[0], header=None, skiprows=[0])

out = pd.to_datetime(test[0], dayfirst=True, errors='coerce').dropna()
print(out)
0   2018-05-16
1   2018-05-17
4   2019-05-22
5   2019-05-24
6   2019-05-25
7   2019-05-28
Name: 0, dtype: datetime64[ns]
英文:

Use to_datetime with dayfirst=True and errors='coerce', so possible remove not parseable datetimes by Series.dropna:

test=pd.read_excel('Messy_Dates.xlsx', usecols=[0], header=None, skiprows = [0])

out = pd.to_datetime(test[0], dayfirst=True, errors='coerce').dropna()
print (out)
0   2018-05-16
1   2018-05-17
4   2019-05-22
5   2019-05-24
6   2019-05-25
7   2019-05-28
Name: 0, dtype: datetime64[ns]

huangapple
  • 本文由 发表于 2023年5月17日 18:20:50
  • 转载请务必保留本文链接:https://go.coder-hub.com/76271041.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定