Find closest element in list for each row in Pandas DataFrame column

Question

I have a Pandas DataFrame and a comparison list like this:

In [21]: df
Out[21]: 
   Results
0       90
1       80
2       70
3       60
4       50
5       40
6       30
7       20
8       10

In [23]: comparation_list
Out[23]: [83, 72, 65, 40, 36, 22, 15, 12]

Now I want to create a new column in this df where each row's value is the element of comparation_list that is closest to the corresponding value in the Results column.

The output should be something like this:

   Results   assigned_value
0       90               83
1       80               83
2       70               72
3       60               65
4       50               40
5       40               40
6       30               36
7       20               22
8       10               12

Doing this with a loop or with apply is the first thing that comes to mind, but I would like to know how to do it in a vectorized way.
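
For reference, the non-vectorized apply approach mentioned above might look like the following minimal sketch. The helper name closest and the variable baseline are made up for illustration; df and comparation_list are rebuilt from the data shown earlier so the snippet runs on its own.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({"Results": [90, 80, 70, 60, 50, 40, 30, 20, 10]})
comparation_list = [83, 72, 65, 40, 36, 22, 15, 12]


def closest(value, candidates):
    # Hypothetical helper: return the candidate with the smallest
    # absolute distance to value (the first one wins on a tie).
    return min(candidates, key=lambda c: abs(c - value))


# Row-by-row baseline: one Python-level call per row, not vectorized.
baseline = df["Results"].apply(lambda v: closest(v, comparation_list))
```

This scans the whole list once per row, which is fine for small inputs and is the behavior a vectorized solution should reproduce.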

Answer 1

Score: 0

Use pd.merge_asof with direction='nearest':

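import pandas as pd

# merge_asof needs both sides sorted by their keys, so sort df by 'Results'
# (keeping the original index in a column) and sort the comparison list.
out = pd.merge_asof(
    df.reset_index().sort_values(by='Results'),
    pd.Series(sorted(comparation_list), name='assigned_value'),
    left_on='Results', right_on='assigned_value',
    direction='nearest'  # match the closest value in either direction
).set_index('index').sort_index()  # restore the original row order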

Output:

       Results  assigned_value
index                         
0           90              83
1           80              83
2           70              72
3           60              65
4           50              40
5           40              40
6           30              36
7           20              22
8           10              12
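
As a possible alternative (not part of the answer above), the same result can be sketched with plain NumPy broadcasting. The names candidates, values and distances are illustrative only, and df and comparation_list are rebuilt from the question so the snippet is self-contained.

```python
import numpy as np
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({"Results": [90, 80, 70, 60, 50, 40, 30, 20, 10]})
comparation_list = [83, 72, 65, 40, 36, 22, 15, 12]

candidates = np.asarray(comparation_list)
values = df["Results"].to_numpy()

# Pairwise absolute differences, shape (len(df), len(candidates)).
distances = np.abs(values[:, None] - candidates[None, :])

# For each row, take the candidate at the position of the smallest distance.
# On a tie, argmin keeps the candidate that appears first in the list.
df["assigned_value"] = candidates[distances.argmin(axis=1)]
```

This keeps the original index and row order without any sorting, but it builds a full len(df) × len(comparation_list) matrix, so for large inputs the merge_asof approach is likely to be lighter on memory.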

huangapple
  • Posted on February 8, 2023 at 18:31:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75384462.html