Condition check based on Pandas time duration

Question

The dataframe below has a timestamp index at 1-second resolution and holds values for a whole day. It has a frequency column and a power column, which was created by applying a condition to the frequency column.

Condition: if the frequency is less than 49.99, then the power is -1000, and if it is more than 50.01, the power is 1000. There are instances when the frequency stays outside the range 49.99 - 50.01 in one direction for longer than 5 minutes.

timestamp            frequency  Power[kW]
2021-01-01 00:00:00  49.92       1000
2021-01-01 00:00:01  49.96       1000
2021-01-01 00:00:02  50.01          0
2021-01-01 00:00:03  50.13       1000
2021-01-01 00:00:04  50.02       1000
2021-01-01 00:00:05  49.97      -1000
2021-01-01 00:00:06  49.90      -1000
2021-01-01 00:00:07  50.01          0

...

timestamp            frequency  Power[kW]
2021-01-01 00:05:00  50.11       1000
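For reference, the rule described above could be reproduced roughly as follows. This is only a sketch: the names `idx` and `df` and the eight sample frequencies are illustrative, and `numpy.select` is one possible way to express the rule, not necessarily how the original column was built. Applied literally, the rule gives -1000 for 49.92 and 49.96, whereas the sample table shows 1000 for those two rows.

```python
import numpy as np
import pandas as pd

# Illustrative sample mirroring the question's layout (1-second DatetimeIndex).
idx = pd.date_range("2021-01-01 00:00:00", periods=8, freq="1s")
df = pd.DataFrame(
    {"frequency": [49.92, 49.96, 50.01, 50.13, 50.02, 49.97, 49.90, 50.01]},
    index=idx,
)

# Below 49.99 -> -1000, above 50.01 -> 1000, anything in between -> 0.
df["Power[kW]"] = np.select(
    [df["frequency"] < 49.99, df["frequency"] > 50.01],
    [-1000, 1000],
    default=0,
)

print(df)
```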

Could someone please advise how to add another condition on the power column, so that its values cannot stay at 1000 or at -1000 continuously for more than 5 minutes?

I have tried using a rolling window, but the suggestions I have come across apply .mean() or .sum() on the 5-minute window to decide the value for a specific row. That ignores the fact that within this 5-minute interval the frequency may deviate in either direction from the range.

The issue I am struggling with is how to express that the power values should not stay the same continuously for 5 minutes. I would highly appreciate any help. Thanks in advance!

Answer 1

Score: 1

You don't need sum or mean here. Instead, you just need to check whether all of the values in the window are the same, which can be done with something like the line below (I'm assuming you have already created a rolling window on the power column):

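rolling_window = df['Power[kW]'].rolling('5min')  # assumed definition; a time-based window needs a DatetimeIndex
df['check'] = rolling_window.apply(lambda x: len(set(x)) == 1)  # 1.0 where every value in the window is identical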
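If the goal is to act on a streak once it has lasted five minutes, one alternative is to label consecutive runs of equal values and measure how long each run has been going. The sketch below is only one possible reading of the requirement: the function name `cap_streaks`, its `limit` parameter, and the choice to reset the value to 0 after the limit are assumptions, since the question does not say what should replace the value. It also assumes a unique, sorted DatetimeIndex. Unlike a time-based rolling window, which also evaluates the shorter windows at the start of the data, this only flags rows whose run has actually reached the limit.

```python
import pandas as pd


def cap_streaks(power: pd.Series, limit: str = "5min") -> pd.Series:
    """Set values to 0 once a run of identical non-zero values reaches `limit`."""
    # A new run starts wherever the value differs from the previous row.
    run_id = power.ne(power.shift()).cumsum()
    # For every row, look up the timestamp at which its run began.
    timestamps = power.index.to_series()
    run_start = timestamps.groupby(run_id).transform("min")
    elapsed = timestamps - run_start
    # Flag non-zero rows whose run has already lasted `limit` or longer.
    too_long = power.ne(0) & (elapsed >= pd.Timedelta(limit))
    return power.mask(too_long, 0)


# Illustrative data: 350 seconds at 1000, followed by 50 seconds at -1000.
idx = pd.date_range("2021-01-01", periods=400, freq="1s")
power = pd.Series(1000, index=idx)
power.iloc[350:] = -1000

capped = cap_streaks(power)
print(capped.iloc[298:302])
```

With 1-second data, the first 300 rows of a run cover five minutes, so the row at an elapsed time of exactly 5 minutes is the first one to be reset here. Change `>=` to `>` if that row should still keep its value.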

Posted by huangapple on 2023-07-10 22:20:49
Link to this article: https://go.coder-hub.com/76654671.html