Question

I have an .xlsx file, uploaded to Google Drive, that contains 10 dates, and I have been experimenting with pandas.to_datetime to clean them up. However, the function cannot handle the dates in row 2 (1414/01/2019) and row 3 (110/05/2019).

Ideally I would like to fix the dates in rows 2 and 3 to 14/01/2019 and 11/05/2019 respectively. If that cannot be done, is there a way to skip rows whose date format pandas.to_datetime cannot parse?

import pandas as pd

test = pd.read_excel('Messy_Dates.xlsx', usecols=[0], header=None, skiprows=[0])

Answer 1

Score: 1

Use to_datetime with dayfirst=True and errors='coerce' so that values which cannot be parsed become NaT, then remove them with Series.dropna:

test = pd.read_excel('Messy_Dates.xlsx', usecols=[0], header=None, skiprows=[0])

out = pd.to_datetime(test[0], dayfirst=True, errors='coerce').dropna()
print(out)
0   2018-05-16
1   2018-05-17
4   2019-05-22
5   2019-05-24
6   2019-05-25
7   2019-05-28
Name: 0, dtype: datetime64[ns]
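
The answer above drops the malformed rows. As a rough sketch of the "fix it if possible" half of the question, the snippet below tries one narrow repair before parsing and keeps whatever still fails for manual review. Several things here are assumptions rather than part of the original post: the sample Series stands in for the spreadsheet column (only the two malformed strings come from the question), every cell is assumed to be a string, the helper undouble_day is hypothetical, and an explicit format='%d/%m/%Y' is used in place of dayfirst=True. The repair only covers a two-digit day typed twice (1414 to 14). A value like 110/05/2019 is left alone because it could equally be 10 or 11.

```python
import re

import pandas as pd

# Hypothetical sample standing in for column 0 of Messy_Dates.xlsx;
# only "1414/01/2019" and "110/05/2019" are taken from the question.
raw = pd.Series(
    ["16/05/2018", "17/05/2018", "1414/01/2019", "110/05/2019", "22/05/2019"]
)


def undouble_day(text):
    # Assumption: a four-digit "day" made of the same two digits twice
    # (e.g. "1414") is a two-digit day that was entered twice.
    return re.sub(r"^(\d{2})\1(?=/)", r"\1", text)


repaired = raw.map(undouble_day)

# Explicit day/month/year format; anything that still does not match becomes NaT.
parsed = pd.to_datetime(repaired, format="%d/%m/%Y", errors="coerce")

# Original strings that remain unparseable, kept for manual review.
needs_review = raw[parsed.isna()]

# Successfully parsed dates, with the original row labels preserved.
clean = parsed.dropna()

print(clean)
print(needs_review)
```

Keeping needs_review indexed by the original row labels makes it easy to go back to the spreadsheet and correct those cells by hand instead of silently losing them.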
