Convert a digit code into datetime format in a Pandas Dataframe
Question
I have a pandas DataFrame with a column containing a 5-digit code that represents a day and a time. It works as follows:
1 - The first three digits represent the day;
2 - The last two digits represent the time of day (hour:minute:second), as a half-hour slot.
Example 1: The first row has the code 19501, so 195 represents the 1st of January 2009 and the 01 part represents the time from 00:00:00 to 00:29:59;
Example 2: The second row has the code 19502, which is the 1st of January 2009 from 00:30:00 to 00:59:59;
Example 3: As another example, 19711 would be the 3rd of January 2009 from 05:00:00 to 05:29:59;
Example 4: The last row has the code 73048, which represents the 20th of June 2010 from 23:30:00 to 23:59:59.
Any ideas on how I can convert this 5-digit code into a proper datetime format?
Answer 1
Score: 1
I'm assuming your column is numeric.
import datetime as dt
import pandas as pd

df = pd.DataFrame({'code': [19501, 19502, 19711, 73048]})
# First three digits: the day number, as a timedelta in days
df['days'] = pd.to_timedelta(df['code']//100, 'D')
# Last two digits: the half-hour slot of the day (1 to 48)
df['half-hours'] = df['code']%100
df['hours'] = pd.to_timedelta(df['half-hours']//2, 'h')
df['minutes'] = pd.to_timedelta(df['half-hours']%2*30, 'm')
# Day 195 is 2009-01-01, so day 0 is 195 days earlier
base_day = dt.datetime(2009, 1, 1) - dt.timedelta(days = 195)
# dt0 is the start of the slot, dt1 is its last second
df['dt0'] = base_day + df.days + df.hours + df.minutes - dt.timedelta(minutes = 30)
df['dt1'] = base_day + df.days + df.hours + df.minutes - dt.timedelta(seconds = 1)
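If the column holds digit strings rather than numbers, it can be converted before the integer arithmetic is applied. A minimal sketch, assuming every entry is a clean five-digit string (the names day_part and slot_part are illustrative, not part of the answer):

```python
import pandas as pd

# Assumption: the codes arrive as strings, e.g. read from a CSV with dtype=str
df = pd.DataFrame({'code': ['19501', '19502', '19711', '73048']})

# pd.to_numeric raises on malformed entries by default (errors='raise')
df['code'] = pd.to_numeric(df['code'])

# Split each code into its day number and its half-hour slot
day_part = df['code'] // 100
slot_part = df['code'] % 100
print(day_part.tolist(), slot_part.tolist())
```

After the conversion the column is an integer column, so the floor-division and modulo steps behave as in the numeric case.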
Answer 2
Score: 1
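A simple solution: add the days to 2008-06-20, then add (time-1)*30 minutes:
import pandas as pd

df = pd.DataFrame({'code': [19501, 19502, 19711, 73048]})
# d is the day number, t is the half-hour slot (1 to 48)
d, t = df['code'].divmod(100)
df['datetime'] = (
    pd.to_timedelta(d, unit='D')
      .add(pd.Timestamp('2008-06-20'))
      .add(pd.to_timedelta((t-1)*30, unit='min'))  # 'T' is a deprecated alias for minutes
)
NB. This gives you the start of the period. For the end, replacing (t-1)*30 with t*30-1 gives the last minute of the period (00:29:00 for the first row); to reach 00:29:59, work in seconds instead, for example pd.to_timedelta(t*1800-1, unit='s').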
Output:
code datetime
0 19501 2009-01-01 00:00:00
1 19502 2009-01-01 00:30:00
2 19711 2009-01-03 05:00:00
3 73048 2010-06-20 23:30:00
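The note about the end of the period works in minutes, so t*30-1 lands on 00:29:00 rather than 00:29:59. A minimal sketch that returns both ends of each slot at one-second resolution, assuming the same 2008-06-20 origin (the column names start and end and the variable origin are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'code': [19501, 19502, 19711, 73048]})

# d is the day number, t is the half-hour slot (1 to 48)
d, t = df['code'].divmod(100)

# Assumption: day 0 is 2008-06-20, i.e. 195 days before 2009-01-01
origin = pd.Timestamp('2008-06-20')
day_offset = pd.to_timedelta(d, unit='D')

# Each slot lasts 1800 seconds; the end is one second before the next slot
df['start'] = origin + day_offset + pd.to_timedelta((t - 1) * 1800, unit='s')
df['end'] = origin + day_offset + pd.to_timedelta(t * 1800 - 1, unit='s')
print(df)
```

For the first row this gives a start of 2009-01-01 00:00:00 and an end of 2009-01-01 00:29:59, matching Example 1 in the question.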