Calculate time difference in seconds in pandas if row in other column matches.


Question


I'm working with 3 columns ('time', 'SEC', 'DeviceName') in a pandas DataFrame. I'm using the following code to calculate the differences between rows in the 'time' column and assign to the 'SEC' column:

df['SEC'] = df['time'].diff().dt.total_seconds()

The 'DeviceName' column can have several different devices, so I need to modify this to only perform the calculation if the device name matches the previous row, otherwise assign a 0 to 'SEC'.

For example:

time                    SEC       DeviceName
4/18/2023 2:43:00                 Applied_AA-12
4/18/2023 3:13:00       1800      Applied_AA-12  # calculate because the device name matches the previous row
4/18/2023 3:35:53       0         Applied_AA-14  # don't calculate because the device name doesn't match the previous row
4/18/2023 3:36:03       10        Applied_AA-14  # calculate because the device name matches the previous row
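
For reference, a minimal sketch that rebuilds the four example rows and shows what the unconditional diff returns. The frame is reconstructed from the table above, so the construction code is an assumption and not the original data-loading step.

```python
import pandas as pd

# Example rows reconstructed from the question's table (assumed, not the real data source).
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2023-04-18 02:43:00",
        "2023-04-18 03:13:00",
        "2023-04-18 03:35:53",
        "2023-04-18 03:36:03",
    ]),
    "DeviceName": ["Applied_AA-12", "Applied_AA-12", "Applied_AA-14", "Applied_AA-14"],
})

# A plain diff ignores DeviceName, so the third row receives the gap to the
# last Applied_AA-12 reading instead of the desired 0.
plain = df["time"].diff().dt.total_seconds()
print(plain.tolist())
```

The third value here is the 22 minute 53 second gap between the two devices, which is the number the question wants replaced by 0.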

Answer 1

Score: 1


You can use GroupBy.diff:

import numpy as np

df["SEC"] = df.groupby("DeviceName")["time"].diff().dt.total_seconds().fillna(0)

df.at[0, "SEC"] = np.nan  # optional: blanks the first row as in the question's example (assumes index label 0)

Output:

print(df)

                 time     DeviceName     SEC
0 2023-04-18 02:43:00  Applied_AA-12     NaN
1 2023-04-18 03:13:00  Applied_AA-12 1800.00
2 2023-04-18 03:35:53  Applied_AA-14    0.00
3 2023-04-18 03:36:03  Applied_AA-14   10.00
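
One caveat: GroupBy.diff compares each row with the previous row of the same device, which equals "the previous row" only when each device's rows are contiguous. If devices can interleave and the comparison must be against the immediately preceding row, one possible sketch masks a plain diff with a shift comparison. The sample frame, including the extra fifth row that returns to Applied_AA-12, is an assumption added for illustration.

```python
import pandas as pd

# Hypothetical sample: the question's four rows plus one interleaved row.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2023-04-18 02:43:00",
        "2023-04-18 03:13:00",
        "2023-04-18 03:35:53",
        "2023-04-18 03:36:03",
        "2023-04-18 03:40:03",
    ]),
    "DeviceName": [
        "Applied_AA-12", "Applied_AA-12",
        "Applied_AA-14", "Applied_AA-14",
        "Applied_AA-12",
    ],
})

# True only where the device name equals the one on the row directly above.
same_as_prev = df["DeviceName"].eq(df["DeviceName"].shift())

# Keep the row-to-row difference where the names match, otherwise use 0.
df["SEC"] = df["time"].diff().dt.total_seconds().where(same_as_prev, 0)

# For comparison: the per-device difference from the answer above.
grouped = df.groupby("DeviceName")["time"].diff().dt.total_seconds().fillna(0)

print(df.assign(SEC_grouped=grouped))
```

On the last row the two approaches disagree: the shift-based version gives 0 because the row above belongs to another device, while the grouped version measures the time since the last Applied_AA-12 reading. Which one is right depends on what "previous row" should mean for your data. Note that this variant also puts 0, not NaN, in the first row.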

huangapple
  • Posted on May 17, 2023, 23:54:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76273999.html