如何合并列并更改时区?

huangapple go评论96阅读模式
英文:

How to combine columns and change timezone?

问题

  1. import pandas as pd
  2. # 读取CSV文件
  3. df = pd.read_csv('ES.csv')
  4. # 选择最后10行数据
  5. df_small = df[-10:]
  6. # 合并'DATE'和'TIME'列并创建新列'dateandtime'
  7. df_small['dateandtime'] = df['DATE'].astype(str) + ' ' + df['TIME'].astype(str)
  8. # 打印数据信息
  9. df_small.info()
  10. # 将合并后的数据保存到新CSV文件'new.csv'
  11. df_small.to_csv('new.csv', index=False)

在这段代码中,我们读取了CSV文件,并选择了最后10行数据。然后,我们创建了一个新列'dateandtime',将'DATE'和'TIME'列合并到其中。最后,我们将处理后的数据保存到'new.csv'文件中(不包括索引列)。

英文:

如何合并列并更改时区?

I want to merge columns DATE and TIME into one "date&time" column and change timezone from Brussel to New York. I tried:

  1. import pandas as pd
  2. #df = pd.read_csv('ES.csv', parse_dates=True)
  3. df = pd.read_csv('ES.csv')
  4. df_small = df[-10:]
  5. df_small['dateandtime'] = df['DATE'].astype(str) + df['TIME'].astype(str)
  6. df_small.info()
  7. df_small.to_frame(name='new')
  8. df_small.to_frame(name='new').to_csv('new.csv')

Errors and warnings:

  1. SettingWithCopyWarning:
  2. A value is trying to be set on a copy of a slice from a DataFrame.
  3. Try using .loc[row_indexer,col_indexer] = value instead
  4. df_small.to_frame(name='new')
  5. AttributeError: 'DataFrame' object has no attribute 'to_frame'

If you get it working you can have the full CSV file which I think is worth 500$-3000$ (10 years of financial 1 minute interval S&P500 futures data).

答案1

得分: 0

你可以使用 .cat 来合并列,.pop() 来移除旧列,最后使用 .to_localize() 转换为布鲁塞尔时区,然后使用 .tz_convert() 转换为纽约时区。将DatetimeIndex转换为字符串将比保留为datetime需要更多时间:

  1. df_small["dateandtime"] = (pd
  2. .to_datetime(df_small.pop("DATE").str.cat(df_small.pop("TIME"), sep=" "))
  3. .dt.tz_localize("Europe/Brussels")
  4. .dt.tz_convert("America/New_York")
  5. )
  6. df_small = df_small.set_index(pd.DatetimeIndex(df_small.pop("dateandtime")))
  7. df_small.index = df_small.index.strftime("%Y-%m-%d %H:%M")

输出:

  1. OPEN HIGH LOW CLOSE VOLUME
  2. dateandtime
  3. 2010-01-03 18:01 1113.75 1114.75 1113.25 1114.25 2185
  4. 2010-01-03 18:02 1114.25 1115.25 1114.25 1114.75 437
  5. 2010-01-03 18:03 1114.75 1114.75 1114.25 1114.5 596
  6. 2010-01-03 18:04 1114.25 1114.75 1114.25 1114.5 131
  7. 2010-01-03 18:05 1114.5 1114.5 1114.25 1114.25 166
英文:

You can use .cat to combine columns .pop() to remove old columns finally .to_localize() as Brussels tz and .tz_convert() to New York tz. Converting DatetimeIndex to str will take much more time than leaving it as datetime:

  1. df_small["dateandtime"] = (pd
  2. .to_datetime(df_small.pop("DATE").str.cat(df_small.pop("TIME"), sep=" "))
  3. .dt.tz_localize("Europe/Brussels")
  4. .dt.tz_convert("America/New_York")
  5. )
  6. df_small = df_small.set_index(pd.DatetimeIndex(df_small.pop("dateandtime")))
  7. df_small.index = df_small.index.strftime("%Y-%m-%d %H:%M")

Output:

  1. OPEN HIGH LOW CLOSE VOLUME
  2. dateandtime
  3. 2010-01-03 18:01 1113.75 1114.75 1113.25 1114.25 2185
  4. 2010-01-03 18:02 1114.25 1115.25 1114.25 1114.75 437
  5. 2010-01-03 18:03 1114.75 1114.75 1114.25 1114.5 596
  6. 2010-01-03 18:04 1114.25 1114.75 1114.25 1114.5 131
  7. 2010-01-03 18:05 1114.5 1114.5 1114.25 1114.25 166

huangapple
  • 本文由 发表于 2023年7月23日 22:10:32
  • 转载请务必保留本文链接:https://go.coder-hub.com/76748663.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定