Convert a digit code into datetime format in a Pandas DataFrame

Question

I have a pandas DataFrame with a column containing a 5-digit code that represents a day and a time. It works as follows:

1 - The first three digits represent the day;

2 - The last two digits represent the time of day, as a half-hour slot.

Example 1: The first row has the code 19501, so 195 represents the 1st of January 2009 and the 01 part represents the time from 00:00:00 to 00:29:59;

Example 2: The second row has the code 19502, which is the 1st of January 2009 from 00:30:00 to 00:59:59;

Example 3: Another example, 19711, would be the 3rd of January 2009 from 05:00:00 to 05:29:59;

Example 4: The last row has the code 73048, which represents the 20th of June 2010 from 23:30:00 to 23:59:59.

Any ideas on how I can convert this 5-digit code into a proper datetime format?

Answer 1

Score: 1

I'm assuming your column is numeric.

    import datetime as dt
    import pandas as pd

    df = pd.DataFrame({'code': [19501, 19502, 19711, 73048]})

    # Split the code: the first three digits are the day count, the last two the half-hour slot.
    df['days'] = pd.to_timedelta(df['code'] // 100, 'D')
    df['half-hours'] = df['code'] % 100
    df['hours'] = pd.to_timedelta(df['half-hours'] // 2, 'h')
    df['minutes'] = pd.to_timedelta(df['half-hours'] % 2 * 30, 'm')

    # Day 195 is 2009-01-01, so day 0 falls 195 days earlier.
    base_day = dt.datetime(2009, 1, 1) - dt.timedelta(days=195)

    # dt0 is the start of each period, dt1 its last second.
    df['dt0'] = base_day + df['days'] + df['hours'] + df['minutes'] - dt.timedelta(minutes=30)
    df['dt1'] = base_day + df['days'] + df['hours'] + df['minutes'] - dt.timedelta(seconds=1)
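
If the column holds digit strings rather than integers (for example because the codes were read from a file as text), the same arithmetic should apply after a numeric conversion. The sketch below wraps the steps in a helper; the function name `decode_codes` and the output column names `start` and `end` are illustrative assumptions, not part of the original answer, and the slot is assumed to run from 01 to 48.

```python
import pandas as pd

# Day 195 is 2009-01-01 in the question, so day 0 is 195 days earlier.
BASE_DAY = pd.Timestamp("2009-01-01") - pd.Timedelta(days=195)


def decode_codes(codes: pd.Series) -> pd.DataFrame:
    """Return the start and end of the half-hour period for each 5-digit code.

    Hypothetical helper: accepts integers or digit strings.
    """
    numeric = pd.to_numeric(codes).astype("int64")
    days, slot = numeric.divmod(100)  # slot is assumed to run from 1 to 48
    start = (
        BASE_DAY
        + pd.to_timedelta(days, unit="D")
        + pd.to_timedelta((slot - 1) * 30, unit="min")
    )
    # The period ends one second before the next half-hour begins.
    end = start + pd.Timedelta(minutes=30) - pd.Timedelta(seconds=1)
    return pd.DataFrame({"code": codes, "start": start, "end": end})


result = decode_codes(pd.Series(["19501", "19502", "19711", "73048"]))
print(result)
```

Keeping the base day as a single constant makes it easy to adjust if the encoding's reference date turns out to differ from the one inferred from the examples.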

Answer 2

Score: 1

A simple solution: add the days to 2008-06-20 (day 0, i.e. 195 days before 2009-01-01), then add (time-1)*30 minutes:

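    import pandas as pd

    df = pd.DataFrame({'code': [19501, 19502, 19711, 73048]})

    # d: day count (first three digits), t: half-hour slot (last two digits)
    d, t = df['code'].divmod(100)

    df['datetime'] = (
        pd.to_timedelta(d, unit='D')
          .add(pd.Timestamp('2008-06-20'))
          .add(pd.to_timedelta((t - 1) * 30, unit='min'))  # 'min' rather than the deprecated 'T' alias
    )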

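NB. This gives the start of each period. Replacing (t-1)*30 with t*30-1 gives the last minute of the period (e.g. 00:29:00); to get the last second (00:29:59), as in the question, compute the offset in seconds instead, e.g. t*1800-1 with unit='s'.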

Output:

        code            datetime
    0  19501 2009-01-01 00:00:00
    1  19502 2009-01-01 00:30:00
    2  19711 2009-01-03 05:00:00
    3  73048 2010-06-20 23:30:00
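
To make the remark about the end of the period concrete, here is a minimal sketch that computes both bounds in the style of this answer. It works in seconds so that the end lands on hh:mm:59, matching the question's examples; the column names `start` and `end` are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({"code": [19501, 19502, 19711, 73048]})
d, t = df["code"].divmod(100)

# Date of each row: the day count added to day 0 (2008-06-20,
# i.e. 195 days before 2009-01-01).
day = pd.to_timedelta(d, unit="D").add(pd.Timestamp("2008-06-20"))

# Each slot lasts 30 minutes = 1800 seconds.
df["start"] = day + pd.to_timedelta((t - 1) * 1800, unit="s")
df["end"] = day + pd.to_timedelta(t * 1800 - 1, unit="s")
print(df)
```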

huangapple
  • Posted on February 14, 2023, 04:37:55
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75440932.html