Find closest element in list for each row in Pandas DataFrame column
Question
I have a Pandas DataFrame and a comparison list like this:
In [21]: df
Out[21]:
   Results
0       90
1       80
2       70
3       60
4       50
5       40
6       30
7       20
8       10

In [23]: comparation_list
Out[23]: [83, 72, 65, 40, 36, 22, 15, 12]
Now I want to create a new column in this df where each row's value is the element of comparation_list closest to the corresponding value in the Results column.
The output should be something like this:
   Results  assigned_value
0       90              83
1       80              83
2       70              72
3       60              65
4       50              40
5       40              40
6       30              36
7       20              22
8       10              12
Doing this through loops or using apply comes straight to my mind, but I would like to know how to do it in a vectorized way.
Answer 1
Score: 0
Use merge_asof:
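import pandas as pd

# merge_asof needs both sides sorted on their keys, so sort first and
# keep the original index to restore the row order afterwards.
out = pd.merge_asof(
    df.reset_index().sort_values(by='Results'),
    pd.Series(sorted(comparation_list), name='assigned_value'),
    left_on='Results', right_on='assigned_value',
    direction='nearest'  # match the closest value in either direction
).set_index('index').sort_index()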
Output:
       Results  assigned_value
index
0           90              83
1           80              83
2           70              72
3           60              65
4           50              40
5           40              40
6           30              36
7           20              22
8           10              12
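
As an alternative that is not part of the answer above, the same result can probably be obtained with plain NumPy broadcasting. This is a minimal sketch that rebuilds df and comparation_list from the question; the names candidates and distances are introduced here only for illustration. It builds a full len(df) x len(comparation_list) distance matrix, so it assumes both are small enough to fit in memory.

```python
import numpy as np
import pandas as pd

# Data reconstructed from the question.
df = pd.DataFrame({"Results": [90, 80, 70, 60, 50, 40, 30, 20, 10]})
comparation_list = [83, 72, 65, 40, 36, 22, 15, 12]

candidates = np.asarray(comparation_list)

# Absolute difference between every row value and every candidate,
# shape (len(df), len(candidates)).
distances = np.abs(df["Results"].to_numpy()[:, None] - candidates[None, :])

# argmin returns the first minimum in each row, so a tie goes to the
# candidate that appears earlier in comparation_list.
df["assigned_value"] = candidates[distances.argmin(axis=1)]

print(df)
```

Unlike merge_asof, this keeps the original index and row order without any sorting, at the cost of the intermediate distance matrix.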