Cleaner way of getting date out of pandas timestamp

Question

Suppose I have a DataFrame like so:

import pandas as pd

foo = pd.DataFrame(
    {
        'a': [1, 2, 3],
        'b': ['2021-01-05 05:15', '2021-01-06 11:10', '2021-03-01 09:00']
    }
)

And I want to convert column b to datetime and extract only the date part. I can do something like:

foo['date'] = pd.to_datetime(foo.b).dt.date

But even though this yields Python datetime.date objects, pandas does not treat them as datetimes and assigns an object dtype to the column:

foo.dtypes

Out: 
a        int64
b       object
date    object
dtype: object
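
To see where the object dtype comes from, it can help to inspect the elements themselves. The following is a minimal sketch using the same frame as above; the names `dates` and `first` are only illustrative and not part of the original question.

```python
import datetime

import pandas as pd

foo = pd.DataFrame(
    {
        'a': [1, 2, 3],
        'b': ['2021-01-05 05:15', '2021-01-06 11:10', '2021-03-01 09:00']
    }
)

# .dt.date yields plain Python datetime.date objects, one per row
dates = pd.to_datetime(foo['b']).dt.date
first = dates.iloc[0]

print(type(first))   # <class 'datetime.date'>
print(dates.dtype)   # object
```

Because each element is a generic Python object rather than a datetime64 value, the `.dt` accessor is no longer available on the resulting column.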

I can of course get it to be a datetime by casting it to datetime again:

foo['date'] = pd.to_datetime(pd.to_datetime(foo.b).dt.date)

I can also get it with string slicing, since the date part is the first 10 characters:

foo['date2'] = pd.to_datetime(foo.b.str[:10])

But I feel like there must be a cleaner way of getting a date out of a datetime column.

Answer 1

Score: 3

You can use dt.normalize:

foo['date'] = pd.to_datetime(foo['b']).dt.normalize()

Output:

>>> foo
   a                 b       date
0  1  2021-01-05 05:15 2021-01-05
1  2  2021-01-06 11:10 2021-01-06
2  3  2021-03-01 09:00 2021-03-01

>>> foo.dtypes
a                int64
b               object
date    datetime64[ns]
dtype: object
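
For context, `dt.normalize` does not drop the time component; it resets it to midnight, which is why the column keeps a datetime64 dtype. The sketch below illustrates that reading. The comparison with `dt.floor('D')` is an addition for illustration and is not part of the answer; the variable names are hypothetical.

```python
import pandas as pd

ts = pd.to_datetime(
    pd.Series(['2021-01-05 05:15', '2021-01-06 11:10', '2021-03-01 09:00'])
)

normalized = ts.dt.normalize()   # time set to 00:00:00, dtype stays datetime64
floored = ts.dt.floor('D')       # rounding down to the day gives the same values

print(normalized.dtype)
print(normalized.equals(floored))          # True
print(normalized.dt.day_name().tolist())   # ['Tuesday', 'Wednesday', 'Monday']
```

Since the result is still a datetime column, accessors such as `dt.day_name()` keep working, which is not the case for the object column produced by `dt.date`.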

However, your last solution is also a good one: pd.to_datetime(foo.b.str[:10]).
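
As a quick sanity check, the three approaches discussed here can be compared directly. This is only a sketch on the sample data from the question; the names `via_normalize`, `via_recast`, and `via_slice` are illustrative, and the slice assumes every string starts with a 10-character `YYYY-MM-DD` date.

```python
import pandas as pd

foo = pd.DataFrame(
    {
        'a': [1, 2, 3],
        'b': ['2021-01-05 05:15', '2021-01-06 11:10', '2021-03-01 09:00']
    }
)

# 1) parse, then reset the time to midnight
via_normalize = pd.to_datetime(foo['b']).dt.normalize()
# 2) parse, take Python dates, then cast back to datetime
via_recast = pd.to_datetime(pd.to_datetime(foo['b']).dt.date)
# 3) keep only the 'YYYY-MM-DD' prefix of each string, then parse
via_slice = pd.to_datetime(foo['b'].str[:10])

print((via_normalize == via_recast).all())   # True
print((via_normalize == via_slice).all())    # True
```

All three give the same values on this data; `dt.normalize` avoids the round trip through Python objects and does not depend on the string layout.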

huangapple
  • Posted on 2023-07-11 12:41:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76658757.html