Cleaner way of getting date out of pandas timestamp

Question

Suppose I have a df like so:

    import pandas as pd

    foo = pd.DataFrame(
        {
            'a': [1, 2, 3],
            'b': ['2021-01-05 05:15', '2021-01-06 11:10', '2021-03-01 09:00']
        }
    )

And I want to convert column b to datetime and extract only the date part. I can do something like:

    foo['date'] = pd.to_datetime(foo.b).dt.date

But even though this returns Python datetime.date objects, pandas doesn't recognise them as a datetime type and assigns the object dtype to the column:

    foo.dtypes
    Out:
    a        int64
    b       object
    date    object
    dtype: object
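
To make the dtype issue concrete, here is a minimal sketch that inspects what `.dt.date` produces. The variable names `parsed` and `dates` are only illustrative and are not part of the original post.

```python
import datetime

import pandas as pd

foo = pd.DataFrame(
    {
        'a': [1, 2, 3],
        'b': ['2021-01-05 05:15', '2021-01-06 11:10', '2021-03-01 09:00'],
    }
)

# Parsing the strings gives a datetime64 Series.
parsed = pd.to_datetime(foo['b'])

# .dt.date converts every element to a plain Python datetime.date,
# which pandas can only hold in a generic object column.
dates = parsed.dt.date

print(parsed.dtype.kind)    # 'M' is NumPy's kind code for datetime64
print(dates.dtype)          # object
print(type(dates.iloc[0]))  # <class 'datetime.date'>
```

Because the elements are ordinary Python objects, the `.dt` accessor is no longer available on such a column until it is converted back with `pd.to_datetime`.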

I can of course get it to be a datetime by casting it to datetime again:

    foo['date'] = pd.to_datetime(pd.to_datetime(foo.b).dt.date)

I can also get it with string slicing:

    foo['date2'] = pd.to_datetime(foo.b.str[:10])

But I feel like there must be a cleaner way of getting a date out of a datetime column.

Answer 1

Score: 3

You can use dt.normalize:

    foo['date'] = pd.to_datetime(foo['b']).dt.normalize()

Output:

    >>> foo
       a                 b       date
    0  1  2021-01-05 05:15 2021-01-05
    1  2  2021-01-06 11:10 2021-01-06
    2  3  2021-03-01 09:00 2021-03-01
    >>> foo.dtypes
    a                int64
    b               object
    date    datetime64[ns]
    dtype: object

That said, your last solution, slicing the string before passing it to pd.to_datetime, is also a good one.
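
As a rough sketch of what `dt.normalize` does, the snippet below compares it with `dt.floor('D')`, which is an alternative not mentioned in the answer and is included here only for comparison. The names `parsed`, `normalized`, and `floored` are illustrative.

```python
import pandas as pd

foo = pd.DataFrame(
    {
        'a': [1, 2, 3],
        'b': ['2021-01-05 05:15', '2021-01-06 11:10', '2021-03-01 09:00'],
    }
)

parsed = pd.to_datetime(foo['b'])

# normalize() resets the time of day to midnight and keeps the datetime64 dtype.
normalized = parsed.dt.normalize()

# Flooring to day resolution gives the same values for this timezone-naive data.
floored = parsed.dt.floor('D')

print(normalized.equals(floored))  # True
print(normalized.dtype.kind)       # 'M' (datetime64)

# Date-only strings can still be produced for display when needed.
print(normalized.dt.strftime('%Y-%m-%d').tolist())
# ['2021-01-05', '2021-01-06', '2021-03-01']
```

The values are still full timestamps at 00:00:00; pandas simply omits the time when every entry in the column falls on midnight.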

huangapple
  • Posted on July 11, 2023, 12:41:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76658757.html