search for start date and ID in one dataframe and zero out data previous to the start date in another dataframe

huangapple go评论91阅读模式
英文:

search for start date and ID in one dataframe and zero out data previous to the start date in another dataframe

问题

我想使用查找表来查找开始日期,然后将查找表中开始日期之前的df数据归零。

查找开始日期与唯一ID

MNI_ID 开始日期
0 MNWIS-13 10/1/1967
1 MNWIS-12 12/1/1967
2 MDWR-1 1/1/1968
3 MW-15 2/1/1968

df

  1. MNWIS-13 MNWIS-12 MDWR-1 MW-15

10/1/1967 12.30 22.00 1.00 17.00
11/1/1967 11.00 33.00 23.00 80.00
12/1/1967 45.00 555.00 43.00 45.00
1/1/1968 56.60 405.00 69.00 67.00
2/1/1968 45.00 33.00 20.00 29.00

希望数据框看起来像这样:

  1. MNWIS-13 MNWIS-12 MDWR-1 MW-15

10/1/1967 12.30 0 0 0
11/1/1967 11.00 0 0 0
12/1/1967 45.00 555.00 0 0
1/1/1968 56.60 405.00 69.00 0
2/1/1968 45.00 33.00 20.00 29.00

英文:

I'd like to use a lookup table to search for a start date, then zero our the data from df prior to the start date in the lookup table

lookup start date with unique ID

  1. MNI_ID StartDate
  2. 0 MNWIS-13 10/1/1967
  3. 1 MNWIS-12 12/1/1967
  4. 2 MDWR-1 1/1/1968
  5. 3 MW-15 2/1/1968

df

  1. MNWIS-13 MNWIS-12 MDWR-1 MW-15
  2. 10/1/1967 12.30 22.00 1.00 17.00
  3. 11/1/1967 11.00 33.00 23.00 80.00
  4. 12/1/1967 45.00 555.00 43.00 45.00
  5. 1/1/1968 56.60 405.00 69.00 67.00
  6. 2/1/1968 45.00 33.00 20.00 29.00

Want dataframe to look like this:

  1. MNWIS-13 MNWIS-12 MDWR-1 MW-15
  2. 10/1/1967 12.30 0 0 0
  3. 11/1/1967 11.00 0 0 0
  4. 12/1/1967 45.00 555.00 0 0
  5. 1/1/1968 56.60 405.00 69.00 0
  6. 2/1/1968 45.00 33.00 20.00 29.00

答案1

得分: 1

解决方案

  1. mask = pd.crosstab(lookup['StartDate'], lookup['MNI_ID'])
  2. mask = mask.reindex_like(df).fillna(0)
  3. mask = mask.cummax()
  4. result = df * mask

解释

使用交叉表(crosstab)将lookup数据框重塑成宽格式。

  1. pd.crosstab(lookup['StartDate'], lookup['MNI_ID'])
  1. MNI_ID MDWR-1 MNWIS-12 MNWIS-13 MW-15
  2. StartDate
  3. 1/1/1968 1 0 0 0
  4. 10/1/1967 0 0 1 0
  5. 12/1/1967 0 1 0 0
  6. 2/1/1968 0 0 0 1

确保重塑后的lookup数据框的索引与给定的df匹配。

  1. mask.reindex_like(df).fillna(0)
  1. MNWIS-13 MNWIS-12 MDWR-1 MW-15
  2. 10/1/1967 1.0 0.0 0.0 0.0
  3. 11/1/1967 0.0 0.0 0.0 0.0
  4. 12/1/1967 0.0 1.0 0.0 0.0
  5. 1/1/1968 0.0 0.0 1.0 0.0
  6. 2/1/1968 0.0 0.0 0.0 1.0

沿着索引轴计算累积最大值,创建一个布尔掩码,该掩码可用于将匹配的StartDate之前的值置零。

  1. mask.cummax()
  1. MNWIS-13 MNWIS-12 MDWR-1 MW-15
  2. 10/1/1967 1.0 0.0 0.0 0.0
  3. 11/1/1967 1.0 0.0 0.0 0.0
  4. 12/1/1967 1.0 1.0 0.0 0.0
  5. 1/1/1968 1.0 1.0 1.0 0.0
  6. 2/1/1968 1.0 1.0 1.0 1.0

将布尔mask与给定的数据框相乘,以获得所需的结果。

  1. df * mask
  1. MNWIS-13 MNWIS-12 MDWR-1 MW-15
  2. 10/1/1967 12.3 0.0 0.0 0.0
  3. 11/1/1967 11.0 0.0 0.0 0.0
  4. 12/1/1967 45.0 555.0 0.0 0.0
  5. 1/1/1968 56.6 405.0 69.0 0.0
  6. 2/1/1968 45.0 33.0 20.0 29.0
  1. <details>
  2. <summary>英文:</summary>
  3. ### Solution
  4. mask = pd.crosstab(lookup[&#39;StartDate&#39;], lookup[&#39;MNI_ID&#39;])
  5. mask = mask.reindex_like(df).fillna(0)
  6. mask = mask.cummax()
  7. result = df * mask
  8. #### Explained
  9. Crosstab to reshape the `lookup` dataframe into wide format.
  10. pd.crosstab(lookup[&#39;StartDate&#39;], lookup[&#39;MNI_ID&#39;])
  11. MNI_ID MDWR-1 MNWIS-12 MNWIS-13 MW-15
  12. StartDate
  13. 1/1/1968 1 0 0 0
  14. 10/1/1967 0 0 1 0
  15. 12/1/1967 0 1 0 0
  16. 2/1/1968 0 0 0 1
  17. Ensure the index of reshaped `lookup` dataframe matches the given `df`
  18. mask.reindex_like(df).fillna(0)
  19. MNWIS-13 MNWIS-12 MDWR-1 MW-15
  20. 10/1/1967 1.0 0.0 0.0 0.0
  21. 11/1/1967 0.0 0.0 0.0 0.0
  22. 12/1/1967 0.0 1.0 0.0 0.0
  23. 1/1/1968 0.0 0.0 1.0 0.0
  24. 2/1/1968 0.0 0.0 0.0 1.0
  25. Calculate the cumulative maximum along index axis to create a boolean mask which can be used to zero the previous values before the matching `StartDate`
  26. mask.cummax()
  27. MNWIS-13 MNWIS-12 MDWR-1 MW-15
  28. 10/1/1967 1.0 0.0 0.0 0.0
  29. 11/1/1967 1.0 0.0 0.0 0.0
  30. 12/1/1967 1.0 1.0 0.0 0.0
  31. 1/1/1968 1.0 1.0 1.0 0.0
  32. 2/1/1968 1.0 1.0 1.0 1.0
  33. Multiply the boolean `mask` with the given dataframe to get the desired result
  34. df * mask
  35. MNWIS-13 MNWIS-12 MDWR-1 MW-15
  36. 10/1/1967 12.3 0.0 0.0 0.0
  37. 11/1/1967 11.0 0.0 0.0 0.0
  38. 12/1/1967 45.0 555.0 0.0 0.0
  39. 1/1/1968 56.6 405.0 69.0 0.0
  40. 2/1/1968 45.0 33.0 20.0 29.0
  41. </details>

huangapple
  • 本文由 发表于 2023年2月18日 07:59:35
  • 转载请务必保留本文链接:https://go.coder-hub.com/75490261.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定