Python dataframe: Select range of elements from one column matching data from another column
Question
import pandas as pd

diff = [10, 15, 20, 25, 20, 15, 10, 10, 15, 21, 24, 19, 15, 10, 10, 15, 20, 21, 26, 20, 10, 15, 20, 25, 20, 15, 10]
df_data = pd.DataFrame(diff, columns=['data'])
# Serial number column: 0, 1, ..., len(df_data) - 1
df_data.insert(0, 'slno', list(range(len(df_data))))
# Note: the name `max` shadows Python's built-in max()
max = {
    'pos': [3, 10, 18, 23],
    'val': [25, 24, 26, 25]
}
df_max = pd.DataFrame(max)
The dataframes I have now:
df_data:
| |slno|data |
|-----|----|-----|
|0 | 0 | 10|
|1 | 1 | 15|
|2 | 2 | 20|
|3 | 3 | 25|
|4 | 4 | 20|
|5 | 5 | 15|
...
|26 | 26 | 10|
df_max:
| |pos| val|
|---|---|-----|
|0 | 3| 25|
|1 | 10| 24|
|2 | 18| 26|
|3 | 23| 25|
Expected result:
df_max:
| |pos |val |range |
|---|-----|-----|--------------------|
|0 | 3 | 25 |[15, 20, 25, 20, 15]|
|1 | 10 | 24 |[15, 21, 24, 19, 15]|
|2 | 18 | 26 |[20, 21, 26, 20, 10]|
|3 | 23 | 25 |[15, 20, 25, 20, 15]|
I have two dataframes. One consists of selected rows of the other. Now I need to go back to the bigger dataframe, select data from it, and add the result to the smaller dataframe.
df_data holds all the data. df_max holds the position and value of each entry above a predefined threshold (here, 23). For each such entry, I need to pick the 2 values before it and the 2 values after it in df_data (together with the value itself, as in the expected result) and add the resulting list as a new column value in that row of df_max.
I can't wrap my head around how to do this. Any help would be appreciated.
Answer 1
Score: 0
You can use apply to select elements from df_data['data'] according to the pos value in df_max:
df_max['range'] = df_max['pos'].apply(lambda p: df_data.loc[p-2:p+2, 'data'].to_list())
Output:
pos val range
0 3 25 [15, 20, 25, 20, 15]
1 10 24 [15, 21, 24, 19, 15]
2 18 26 [20, 21, 26, 20, 10]
3 23 25 [15, 20, 25, 20, 15]
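Note that .loc slices by index label and includes both endpoints, so the one-liner above relies on df_data having the default 0..n-1 index, and a pos closer than 2 rows to the start would give a negative label in the slice. A minimal sketch of a positional variant that clips the window at both ends is below. The helper name window and its parameters before and after are hypothetical, not part of pandas, and it assumes pos refers to row positions in df_data.

```python
import pandas as pd

diff = [10, 15, 20, 25, 20, 15, 10, 10, 15, 21, 24, 19, 15, 10,
        10, 15, 20, 21, 26, 20, 10, 15, 20, 25, 20, 15, 10]
df_data = pd.DataFrame({'slno': list(range(len(diff))), 'data': diff})
df_max = pd.DataFrame({'pos': [3, 10, 18, 23], 'val': [25, 24, 26, 25]})


def window(values, pos, before=2, after=2):
    """Hypothetical helper: values around position pos, clipped at the edges."""
    start = max(pos - before, 0)                # never start before the first element
    stop = min(pos + after + 1, len(values))    # never run past the last element
    return values[start:stop]


# Assumption: 'pos' holds row positions in df_data, not arbitrary index labels.
data = df_data['data'].tolist()
df_max['range'] = [window(data, p) for p in df_max['pos']]
print(df_max)
```

For the data in the question this should produce the same range column as the apply version. Near the edges the window is simply shorter, for example window(data, 0) returns only the first three values.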