Converting Timezone In Python
Question
I have a dataframe with the columns (store_id, day, start_time_local, end_time_local), which looks like this:
1481966498820158979 0 "00:00:00" "00:10:00"
1481966498820158979 0 "11:00:00" "23:59:00"
1481966498820158979 1 "00:00:00" "00:10:00"
1481966498820158979 1 "11:00:00" "23:59:00"
1481966498820158979 2 "00:00:00" "00:10:00"
1481966498820158979 2 "11:00:00" "23:59:00"
1481966498820158979 3 "00:00:00" "00:10:00"
1481966498820158979 3 "11:00:00" "23:59:00"
1481966498820158979 4 "00:00:00" "00:10:00"
1481966498820158979 4 "11:00:00" "23:59:00"
1481966498820158979 5 "00:00:00" "00:10:00"
1481966498820158979 5 "11:00:00" "23:59:00"
1481966498820158979 6 "00:00:00" "00:10:00"
1481966498820158979 6 "11:00:00" "23:59:00"
The times in this dataset are in local time, and I want to convert them to UTC.
This is the function I wrote, but it does not properly handle times that overflow into a different day:
import pytz

# Define a function to convert local time to UTC and handle day change
def local_to_utc_with_day(row, timezone_string):
    timezone_obj = pytz.timezone(timezone_string)
    local_start_time = timezone_obj.localize(row['start_time_local'])
    local_end_time = timezone_obj.localize(row['end_time_local'])
    utc_start_time = local_start_time.astimezone(pytz.UTC)
    utc_end_time = local_end_time.astimezone(pytz.UTC)
    # Check if the day has changed due to timezone shift
    if utc_start_time.date() != row['start_time_local'].date():
        row['day'] += 1
    row['start_time_utc_formatted'] = utc_start_time.strftime('%H:%M:%S')  # UTC time in HH:MM:SS format
    row['end_time_utc_formatted'] = utc_end_time.strftime('%H:%M:%S')
    row['start_time_local'] = local_start_time.strftime('%H:%M:%S')  # Keep HH:MM:SS format
    row['end_time_local'] = local_end_time.strftime('%H:%M:%S')  # Keep HH:MM:SS format
    return row
Answer 1
Score: 1
IIUC (if I understand correctly):
Fake Data:
import pandas as pd
import pendulum
data = {
    "store_id": [1, 2, 3, 4],
    "day": [0, 0, 1, 1],
    "start_time_local": ["00:00:00", "11:00:00", "00:00:00", "11:00:00"],
    "end_time_local": ["00:10:00", "23:59:00", "00:10:00", "23:59:00"]
}
df = pd.DataFrame(data=data)
Code:
cols = ["start_time_local", "end_time_local"]
tz = pendulum.now().timezone_name
for col in cols:
    df[col] = df.apply(
        func=lambda x: pd.to_datetime(arg=x[col], format="mixed").tz_localize(tz).tz_convert("UTC").tz_localize(None),
        axis=1
    )
print(df)
Output (my local time):
   store_id  day    start_time_local      end_time_local
0         1    0 2023-08-08 07:00:00 2023-08-08 07:10:00
1         2    0 2023-08-08 18:00:00 2023-08-09 06:59:00
2         3    1 2023-08-08 07:00:00 2023-08-08 07:10:00
3         4    1 2023-08-08 18:00:00 2023-08-09 06:59:00
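Note that the output above attaches today's date to every row and leaves `day` untouched, so the day overflow from the question still has to be handled separately. Here is a minimal sketch of one way to do that, not a definitive implementation. It assumes that `day` 0 is a Monday, that the timezone is passed explicitly as an IANA name (rather than taken from the machine running the code), and that the UTC offset of an arbitrary reference week is acceptable. The function name `schedule_to_utc`, the `week_start` parameter, and the `*_day_utc` / `*_time_utc` column names are hypothetical and do not come from the question.

```python
import pandas as pd

def schedule_to_utc(df, tz_name, week_start="2023-08-07"):
    """Convert weekly local hours to UTC, shifting the day index when needed.

    week_start is an assumed reference Monday (day 0). The UTC offset used is
    the one in effect during that week, so DST changes are not modelled.
    """
    out = df.copy()
    # Calendar date of each row's local day within the reference week.
    local_date = pd.Timestamp(week_start) + pd.to_timedelta(out["day"], unit="D")
    for side in ("start", "end"):
        # Attach the date to the "HH:MM:SS" string, then localize and convert.
        local = local_date + pd.to_timedelta(out[f"{side}_time_local"])
        utc = local.dt.tz_localize(tz_name).dt.tz_convert("UTC")
        # Signed whole-day difference between the UTC date and the local date
        # (-1, 0 or +1 depending on the offset and the time of day).
        shift = (utc.dt.tz_localize(None).dt.normalize() - local_date).dt.days
        # Wrap around the week so day 6 + 1 becomes day 0, and day 0 - 1 becomes day 6.
        out[f"{side}_day_utc"] = (out["day"] + shift) % 7
        out[f"{side}_time_utc"] = utc.dt.strftime("%H:%M:%S")
    return out

# Example with a hypothetical timezone for the store.
sample = pd.DataFrame({
    "store_id": [1, 1],
    "day": [0, 6],
    "start_time_local": ["00:00:00", "11:00:00"],
    "end_time_local": ["00:10:00", "23:59:00"],
})
print(schedule_to_utc(sample, "America/Los_Angeles"))
```

The start and end of an interval get separate day columns because an interval such as 11:00 to 23:59 local can end on a different UTC day than it starts. Using the signed date difference instead of `+= 1` also covers timezones east of UTC, where the UTC day can be the previous one.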