How can I filter between two datetime timestamps in a dataframe?

Question

I have the following dataframe:

         price_lvl_size_total  raw_end_of_event              receive_timestamp
    154                     0              True  2023-05-29 00:15:01.000138338
    160                     0              True  2023-05-29 00:15:01.138503551

And I would like to filter for rows between the above two timestamps.

I have tried:

    dateframe.between_time(2023-5-29 00:15:01.000138338, 2023-05-29 00:15:01.138503551)

But I get an invalid syntax error. I have also tried putting the datetimes inside strings, but then I get:

    TypeError: Index must be DatetimeIndex

Answer 1

Score: 1

Don't you need to parse the timestamps first? Anyway, you can try this:

    # (if needed) make sure the timestamps are cast to int64 (nanoseconds since the epoch)
    # file["receive_timestamp"] = file["receive_timestamp"].astype("int64")
    out = file.loc[file["receive_timestamp"].between(1685319301000138338, 1685319301138503551)]

    # variant? note that this compares only the microsecond component,
    # so the date and the second are ignored
    out = file.loc[pd.to_datetime(file["receive_timestamp"]).dt.microsecond.between(138, 138503)]

Output:

    print(out)

         price_lvl_size_total  raw_end_of_event    receive_timestamp
    154                     0              True  1685319301000138338
    160                     0              True  1685319301138503551
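The integer bounds used above are the two timestamps expressed as nanoseconds since the Unix epoch. As a minimal sketch, assuming the column really holds int64 nanosecond values, the bounds can be derived with `pd.Timestamp(...).value` instead of being typed by hand. The series `ts` and its third row (index 170) are made up for illustration.

```python
import pandas as pd

# The two bounds from the question, parsed as nanosecond-precision timestamps.
start = pd.Timestamp("2023-05-29 00:15:01.000138338")
end = pd.Timestamp("2023-05-29 00:15:01.138503551")

# .value is the number of nanoseconds since 1970-01-01 00:00:00.
start_ns, end_ns = start.value, end.value
print(start_ns, end_ns)

# Hypothetical int64 column; the row at index 170 lies outside the window.
ts = pd.Series(
    [1685319301000138338, 1685319301138503551, 1685319302000000000],
    index=[154, 160, 170],
    name="receive_timestamp",
)

# Series.between is inclusive on both ends by default.
mask = ts.between(start_ns, end_ns)
print(ts[mask])
```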

Update:

Regarding the updated question, you can use:

    # (if needed) parse the column into datetimes first
    # file["receive_timestamp"] = pd.to_datetime(file["receive_timestamp"])
    start, end = "2023-05-29 00:15:01.000138338", "2023-05-29 00:15:01.138503551"
    out = file.loc[file["receive_timestamp"].between(start, end)]
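For context, `DataFrame.between_time` selects rows by time of day and requires a `DatetimeIndex`, which is why the attempt in the question raised `TypeError`. Filtering on a column with `Series.between` avoids that requirement. The following is a self-contained sketch of that approach. The frame is rebuilt from the rows shown in the question, and the extra row at index 170 is a made-up value added only to show that out-of-range rows are dropped.

```python
import pandas as pd

# Hypothetical frame rebuilt from the question, plus one out-of-range row (170).
file = pd.DataFrame(
    {
        "price_lvl_size_total": [0, 0, 0],
        "raw_end_of_event": [True, True, True],
        "receive_timestamp": [
            "2023-05-29 00:15:01.000138338",
            "2023-05-29 00:15:01.138503551",
            "2023-05-29 00:15:02.000000000",
        ],
    },
    index=[154, 160, 170],
)

# Parse the strings into datetime64[ns] so comparisons are chronological.
file["receive_timestamp"] = pd.to_datetime(file["receive_timestamp"])

start = pd.Timestamp("2023-05-29 00:15:01.000138338")
end = pd.Timestamp("2023-05-29 00:15:01.138503551")

# Boolean mask on the column; both bounds are included by default.
out = file.loc[file["receive_timestamp"].between(start, end)]
print(out)
```

If only one end of the window should be included, `between` accepts an `inclusive` argument (`"both"`, `"neither"`, `"left"` or `"right"`) in recent pandas versions.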

huangapple
  • Posted on June 1, 2023, 17:23:42
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76380434.html