Calculate time difference in seconds in pandas if row in other column matches.

Question

I'm working with 3 columns ('time', 'SEC', 'DeviceName') in a pandas DataFrame. I'm using the following code to calculate the differences between rows in the 'time' column and assign to the 'SEC' column:

    df['SEC'] = df['time'].diff().dt.total_seconds()

The 'DeviceName' column can have several different devices, so I need to modify this to only perform the calculation if the device name matches the previous row, otherwise assign a 0 to 'SEC'.

For example:

    time                SEC   DeviceName
    4/18/2023 2:43:00         Applied_AA-12
    4/18/2023 3:13:00   1800  Applied_AA-12   # calculate because the device name matches the previous row
    4/18/2023 3:35:53   0     Applied_AA-14   # don't calculate because the device name doesn't match the previous row
    4/18/2023 3:36:03   10    Applied_AA-14   # calculate because the device name matches the previous row
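
To make the example reproducible, the sample rows can be built as a small DataFrame. This is only a sketch: the `format` string passed to `pd.to_datetime` is an assumption about how the timestamps are stored, and the 'time' column has to be a datetime column for `.diff().dt.total_seconds()` to work.

```python
import pandas as pd

# Sample rows from the example; the format string is an assumption
# about how the raw timestamps are written.
df = pd.DataFrame({
    "time": pd.to_datetime(
        ["4/18/2023 2:43:00", "4/18/2023 3:13:00",
         "4/18/2023 3:35:53", "4/18/2023 3:36:03"],
        format="%m/%d/%Y %H:%M:%S",
    ),
    "DeviceName": ["Applied_AA-12", "Applied_AA-12",
                   "Applied_AA-14", "Applied_AA-14"],
})

# Current approach: difference between every pair of consecutive rows,
# regardless of which device each row belongs to.
df["SEC"] = df["time"].diff().dt.total_seconds()
print(df)
```

With this one-liner the third row gets the gap between the two different devices rather than 0, which is the behaviour that needs to change.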

Answer 1

Score: 1

You can use GroupBy.diff, which computes the difference within each DeviceName group:

    import numpy as np
    df["SEC"] = df.groupby("DeviceName")["time"].diff().dt.total_seconds().fillna(0)
    df.at[0, "SEC"] = np.nan  # optional: restores NaN in the first row, as in the question's example

Output:

    print(df)
                     time     DeviceName      SEC
    0 2023-04-18 02:43:00  Applied_AA-12      NaN
    1 2023-04-18 03:13:00  Applied_AA-12  1800.00
    2 2023-04-18 03:35:53  Applied_AA-14     0.00
    3 2023-04-18 03:36:03  Applied_AA-14    10.00
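
One caveat: GroupBy.diff compares each row with the previous row of the same device, even when rows from another device sit in between. If the requirement is strictly "same device as the immediately preceding row", a possible variant is to compare DeviceName with its shifted copy and mask the plain diff. The sketch below assumes a hypothetical fifth row (Applied_AA-12 at 03:40:00, not in the original data) to show where the two approaches differ.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime([
        "2023-04-18 02:43:00", "2023-04-18 03:13:00",
        "2023-04-18 03:35:53", "2023-04-18 03:36:03",
        "2023-04-18 03:40:00",  # hypothetical extra row, not in the original data
    ]),
    "DeviceName": ["Applied_AA-12", "Applied_AA-12",
                   "Applied_AA-14", "Applied_AA-14",
                   "Applied_AA-12"],
})

# Per-device difference: the last row is measured against row 1,
# the previous Applied_AA-12 row, although two other rows sit in between.
per_device = (
    df.groupby("DeviceName")["time"].diff().dt.total_seconds().fillna(0)
)

# Strict previous-row rule: keep the plain diff only where the device name
# equals the one on the row directly above, otherwise use 0.
same_as_prev = df["DeviceName"].eq(df["DeviceName"].shift())
consecutive = df["time"].diff().dt.total_seconds().where(same_as_prev, 0)

print(pd.DataFrame({"per_device": per_device, "consecutive": consecutive}))
```

When each device's rows are contiguous, as in the question's sample, both approaches give the same values apart from the first row, so the choice only matters for interleaved data.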

huangapple
  • Published on May 17, 2023 at 23:54:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76273999.html