Convert a digit code into datetime format in a Pandas Dataframe

Question

I have a pandas dataframe with a column containing a 5-digit code that represents a day and time. It works as follows:

1 - The first three digits represent the day;

2 - The last two digits represent the hour:minute:second.

Example 1: The first row has the code 19501, so 195 represents the 1st of January 2009 and the 01 part represents the time from 00:00:00 to 00:29:59;

Example 2: In the second row I have the code 19502, which is the 1st of January 2009 from 00:30:00 to 00:59:59;

Example 3: Another example, 19711, would be the 3rd of January 2009 from 05:00:00 to 05:29:59;

Example 4: The last row has the code 73048, which represents the 20th of June 2010 from 23:30:00 to 23:59:59.

Any ideas on how I can convert this 5-digit code into a proper datetime format?

Answer 1

Score: 1

I'm assuming your column is numeric.

import datetime as dt
import pandas as pd

df = pd.DataFrame({'code': [19501, 19502, 19711, 73048]})
# First three digits: day number, as a timedelta in days
df['days'] = pd.to_timedelta(df['code']//100, 'D')
# Last two digits: half-hour slot (1 to 48)
df['half-hours'] = df['code']%100
df['hours'] = pd.to_timedelta(df['half-hours']//2, 'h')
df['minutes'] = pd.to_timedelta(df['half-hours']%2*30, 'min')

# Day 195 is 2009-01-01, so day 0 is 195 days earlier
base_day = dt.datetime(2009, 1, 1) - dt.timedelta(days = 195)

# dt0 is the start of each period, dt1 is its last second
df['dt0'] = base_day + df.days + df.hours + df.minutes - dt.timedelta(minutes = 30)
df['dt1'] = base_day + df.days + df.hours + df.minutes - dt.timedelta(seconds = 1)
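The answer above assumes a numeric column. If the codes are stored as strings instead, something like the following sketch might help. It converts the column first and also checks that every half-hour slot falls in 1 to 48. The string input, the names `codes`, `slot` and `day_zero`, and the validation step are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical input: the codes arrive as strings rather than integers.
df = pd.DataFrame({'code': ['19501', '19502', '19711', '73048']})

# Convert to integers; pd.to_numeric raises on values it cannot parse.
codes = pd.to_numeric(df['code'])

# Split into the day number and the half-hour slot.
day_number = codes // 100
slot = codes % 100

# Assumed validity rule: a day has 48 half-hour slots, numbered 1 to 48.
if not slot.between(1, 48).all():
    raise ValueError('half-hour slot outside the range 1..48')

# From the question: day 195 is 2009-01-01, so day 0 is 195 days earlier.
day_zero = pd.Timestamp('2009-01-01') - pd.Timedelta(days=195)

# Start of each half-hour period.
df['dt0'] = (
    day_zero
    + pd.to_timedelta(day_number, unit='D')
    + pd.to_timedelta((slot - 1) * 30, unit='min')
)
```

The range check is optional, but without it a malformed code such as 19500 would silently map to the previous day.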

Answer 2

Score: 1

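A simple solution: add the days to 2008-06-20, then add (time-1)*30 minutes:

import pandas as pd

df = pd.DataFrame({'code': [19501, 19502, 19711, 73048]})

# d is the day number, t is the half-hour slot (1 to 48)
d, t = df['code'].divmod(100)

df['datetime'] = (
   pd.to_timedelta(d, unit='D')
     .add(pd.Timestamp('2008-06-20'))
     .add(pd.to_timedelta((t-1)*30, unit='min'))
)

NB. This gives you the start of the period. For the end, work in seconds and replace the last term with pd.to_timedelta(t*30*60-1, unit='s'); using t*30-1 in minutes would give hh:29:00 rather than hh:29:59.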

Output:

    code            datetime
0  19501 2009-01-01 00:00:00
1  19502 2009-01-01 00:30:00
2  19711 2009-01-03 05:00:00
3  73048 2010-06-20 23:30:00
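As a sketch of how both ends of each period could be computed with the same approach, the block below adds a `start` and an `end` column. The column names and the `origin` variable are illustrative; the end is computed in seconds so that it lands on the last second of the half hour, matching the ranges given in the question.

```python
import pandas as pd

df = pd.DataFrame({'code': [19501, 19502, 19711, 73048]})

# d is the day number, t is the half-hour slot (1 to 48).
d, t = df['code'].divmod(100)

# Reference date from the answer above: day 0 corresponds to 2008-06-20.
origin = pd.Timestamp('2008-06-20')
days = pd.to_timedelta(d, unit='D')

# Start of the period: (t - 1) half hours after midnight.
df['start'] = origin + days + pd.to_timedelta((t - 1) * 30, unit='min')

# End of the period: t half hours after midnight, minus one second.
df['end'] = origin + days + pd.to_timedelta(t * 30 * 60 - 1, unit='s')
```

For code 19501 this gives a start of 2009-01-01 00:00:00 and an end of 2009-01-01 00:29:59.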

huangapple
  • Published on February 14, 2023, 04:37:55
  • When reposting, please keep this link: https://go.coder-hub.com/75440932.html