Python dataframe: Select range of elements from one column matching data from another column

Question

    import pandas as pd

    diff = [10, 15, 20, 25, 20, 15, 10, 10, 15, 21, 24, 19, 15, 10, 10, 15, 20, 21, 26, 20, 10, 15, 20, 25, 20, 15, 10]
    df_data = pd.DataFrame(diff, columns=['data'])
    # Add a serial-number column that mirrors the default integer index
    df_data.insert(0, 'slno', list(range(len(df_data))))
    # Note: the name `max` shadows Python's built-in max() from here on
    max = {
        'pos': [3, 10, 18, 23],
        'val': [25, 24, 26, 25]
    }
    df_max = pd.DataFrame(max)

The DataFrames I have now:

df_data:

|    | slno | data |
|----|------|------|
| 0  | 0    | 10   |
| 1  | 1    | 15   |
| 2  | 2    | 20   |
| 3  | 3    | 25   |
| 4  | 4    | 20   |
| 5  | 5    | 15   |
| 6  | 6    | 10   |
| 7  | 7    | 10   |
| 8  | 8    | 15   |
| 9  | 9    | 21   |
| 10 | 10   | 24   |
| 11 | 11   | 19   |
| 12 | 12   | 15   |
| 13 | 13   | 10   |
| 14 | 14   | 10   |
| 15 | 15   | 15   |
| 16 | 16   | 20   |
| 17 | 17   | 21   |
| 18 | 18   | 26   |
| 19 | 19   | 20   |
| 20 | 20   | 10   |
| 21 | 21   | 15   |
| 22 | 22   | 20   |
| 23 | 23   | 25   |
| 24 | 24   | 20   |
| 25 | 25   | 15   |
| 26 | 26   | 10   |

df_max:

|   | pos | val |
|---|-----|-----|
| 0 | 3   | 25  |
| 1 | 10  | 24  |
| 2 | 18  | 26  |
| 3 | 23  | 25  |

Expected result:

df_max:

|   | pos | val | range                |
|---|-----|-----|----------------------|
| 0 | 3   | 25  | [15, 20, 25, 20, 15] |
| 1 | 10  | 24  | [15, 21, 24, 19, 15] |
| 2 | 18  | 26  | [20, 21, 26, 20, 10] |
| 3 | 23  | 25  | [15, 20, 25, 20, 15] |

I have two DataFrames. One consists of rows selected from the other. Now I need to go back to the bigger DataFrame, select data from it, and add the result to the smaller DataFrame.

df_data has all the data. df_max holds the position and value of each entry that exceeds a predefined threshold (here, 23). For each such entry, I need to pick the 2 values before it and the 2 values after it in df_data, together with the value itself, and add the resulting list as a new column entry in the matching row of df_max.

I am not able to wrap my head around it. Kindly help.
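
For reference, the setup described above can be reproduced from df_data alone. This is only a sketch under an assumption: the question does not show how df_max was built, so the thresholding step and the name `threshold` are illustrative, chosen to match the stated cutoff of 23.

```python
import pandas as pd

diff = [10, 15, 20, 25, 20, 15, 10, 10, 15, 21, 24, 19, 15, 10,
        10, 15, 20, 21, 26, 20, 10, 15, 20, 25, 20, 15, 10]
df_data = pd.DataFrame({'slno': range(len(diff)), 'data': diff})

# Assumption: df_max holds every row whose value exceeds the threshold.
threshold = 23
df_max = (
    df_data.loc[df_data['data'] > threshold, ['slno', 'data']]
    .rename(columns={'slno': 'pos', 'data': 'val'})  # match the question's column names
    .reset_index(drop=True)                          # renumber rows 0..n-1
)
print(df_max)
```

With the sample data this yields the same four rows (pos 3, 10, 18, 23) as the df_max shown in the question.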

Answer 1

Score: 0


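You can use apply to select elements from df_data['data'] according to each pos value in df_max. Because .loc slices by label and includes both endpoints, p-2:p+2 returns five values when df_data keeps its default integer index:

    df_max['range'] = df_max['pos'].apply(lambda p: df_data.loc[p-2:p+2, 'data'].to_list())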

Output:

       pos  val                 range
    0    3   25  [15, 20, 25, 20, 15]
    1   10   24  [15, 21, 24, 19, 15]
    2   18   26  [20, 21, 26, 20, 10]
    3   23   25  [15, 20, 25, 20, 15]
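
The .loc approach relies on the index labels of df_data matching the row positions. If that is not guaranteed (for example after filtering or sorting), a position-based variant may be safer. The following is a minimal sketch, not part of the original answer; the helper name `window` and its `width` parameter are made up for illustration, and it assumes pos holds zero-based row positions.

```python
import pandas as pd

diff = [10, 15, 20, 25, 20, 15, 10, 10, 15, 21, 24, 19, 15, 10,
        10, 15, 20, 21, 26, 20, 10, 15, 20, 25, 20, 15, 10]
df_data = pd.DataFrame({'slno': range(len(diff)), 'data': diff})
df_max = pd.DataFrame({'pos': [3, 10, 18, 23], 'val': [25, 24, 26, 25]})


def window(pos, width=2):
    """Hypothetical helper: values from pos - width to pos + width, by row position."""
    # Clamp the start by hand so a negative value is not read as "count from the end".
    # A conditional is used instead of the built-in max(), which the question's code rebinds.
    start = pos - width if pos > width else 0
    # iloc excludes the stop, hence the + 1; a stop past the end is clipped automatically.
    return df_data['data'].iloc[start:pos + width + 1].to_list()


df_max['range'] = df_max['pos'].apply(window)
print(df_max)
```

For the sample data this gives the same range column as above. Near the edges the list is simply shorter, e.g. window(0) returns only three values.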

huangapple
  • Posted on May 7, 2023, 20:37:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76193979.html