英文:
How to combine columns and change timezone?
问题
import pandas as pd
# 读取CSV文件
df = pd.read_csv('ES.csv')
# 选择最后10行数据
df_small = df[-10:]
# 合并'DATE'和'TIME'列并创建新列'dateandtime'
df_small['dateandtime'] = df['DATE'].astype(str) + ' ' + df['TIME'].astype(str)
# 打印数据信息
df_small.info()
# 将合并后的数据保存到新CSV文件'new.csv'
df_small.to_csv('new.csv', index=False)
在这段代码中,我们读取了CSV文件,并选择了最后10行数据。然后,我们创建了一个新列'dateandtime',将'DATE'和'TIME'列合并到其中。最后,我们将处理后的数据保存到'new.csv'文件中(不包括索引列)。
英文:
I want to merge columns DATE
and TIME
into one "date&time" column and change timezone from Brussel to New York. I tried:
import pandas as pd
#df = pd.read_csv('ES.csv', parse_dates=True)
df = pd.read_csv('ES.csv')
df_small = df[-10:]
df_small['dateandtime'] = df['DATE'].astype(str) + df['TIME'].astype(str)
df_small.info()
df_small.to_frame(name='new')
df_small.to_frame(name='new').to_csv('new.csv')
Errors and warnings:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
df_small.to_frame(name='new')
AttributeError: 'DataFrame' object has no attribute 'to_frame'
If you get it working you can have the full CSV file which I think is worth 500$-3000$ (10 years of financial 1 minute interval S&P500 futures data).
答案1
得分: 0
你可以使用 .cat 来合并列,.pop() 来移除旧列,最后使用 .to_localize() 转换为布鲁塞尔时区,然后使用 .tz_convert() 转换为纽约时区。将DatetimeIndex转换为字符串将比保留为datetime需要更多时间:
df_small["dateandtime"] = (pd
.to_datetime(df_small.pop("DATE").str.cat(df_small.pop("TIME"), sep=" "))
.dt.tz_localize("Europe/Brussels")
.dt.tz_convert("America/New_York")
)
df_small = df_small.set_index(pd.DatetimeIndex(df_small.pop("dateandtime")))
df_small.index = df_small.index.strftime("%Y-%m-%d %H:%M")
输出:
OPEN HIGH LOW CLOSE VOLUME
dateandtime
2010-01-03 18:01 1113.75 1114.75 1113.25 1114.25 2185
2010-01-03 18:02 1114.25 1115.25 1114.25 1114.75 437
2010-01-03 18:03 1114.75 1114.75 1114.25 1114.5 596
2010-01-03 18:04 1114.25 1114.75 1114.25 1114.5 131
2010-01-03 18:05 1114.5 1114.5 1114.25 1114.25 166
英文:
You can use .cat to combine columns .pop() to remove old columns finally .to_localize() as Brussels tz and .tz_convert() to New York tz. Converting DatetimeIndex to str will take much more time than leaving it as datetime:
df_small["dateandtime"] = (pd
.to_datetime(df_small.pop("DATE").str.cat(df_small.pop("TIME"), sep=" "))
.dt.tz_localize("Europe/Brussels")
.dt.tz_convert("America/New_York")
)
df_small = df_small.set_index(pd.DatetimeIndex(df_small.pop("dateandtime")))
df_small.index = df_small.index.strftime("%Y-%m-%d %H:%M")
Output:
OPEN HIGH LOW CLOSE VOLUME
dateandtime
2010-01-03 18:01 1113.75 1114.75 1113.25 1114.25 2185
2010-01-03 18:02 1114.25 1115.25 1114.25 1114.75 437
2010-01-03 18:03 1114.75 1114.75 1114.25 1114.5 596
2010-01-03 18:04 1114.25 1114.75 1114.25 1114.5 131
2010-01-03 18:05 1114.5 1114.5 1114.25 1114.25 166
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论