February 8, 2023, 18:31:47

Find closest element in list for each row in Pandas DataFrame column

Question

I have a Pandas DataFrame and a comparison list like this:

In [21]: df
Out[21]: 
   Results
0       90
1       80
2       70
3       60
4       50
5       40
6       30
7       20
8       10
In [23]: comparation_list
Out[23]: [83, 72, 65, 40, 36, 22, 15, 12]

Now I want to create a new column in this DataFrame where each row's value is the element of comparation_list closest to that row's value in the Results column.

The output should be something like this:

   Results   assigned_value
0       90               83
1       80               83
2       70               72
3       60               65
4       50               40
5       40               40
6       30               36
7       20               22
8       10               12

Doing this through loops or using apply comes straight to my mind, but I would like to know how to do it in a vectorized way.
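
For reference, the apply-based approach mentioned above might look like the following minimal sketch. The helper name closest is hypothetical, and the DataFrame is rebuilt here from the values shown earlier so the snippet runs on its own.

```python
import pandas as pd

# Data reconstructed from the question
df = pd.DataFrame({"Results": [90, 80, 70, 60, 50, 40, 30, 20, 10]})
comparation_list = [83, 72, 65, 40, 36, 22, 15, 12]


def closest(value, candidates):
    """Return the candidate with the smallest absolute distance to value."""
    return min(candidates, key=lambda c: abs(c - value))


# Row-by-row baseline: one Python-level call per row, not vectorized
baseline = df["Results"].apply(lambda v: closest(v, comparation_list))
```

This works but scans the whole list once per row in Python, which is what a vectorized solution is meant to avoid.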

Answer 1

Score: 0

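Use pd.merge_asof with direction='nearest'. Both inputs must be sorted on their join keys, so sort first and restore the original row order afterwards: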

out = pd.merge_asof(
    # merge_asof needs sorted keys; reset_index keeps the original row labels
    df.reset_index().sort_values(by='Results'),
    pd.Series(sorted(comparation_list), name='assigned_value'),
    left_on='Results', right_on='assigned_value',
    direction='nearest'  # pick the closest value on either side
).set_index('index').sort_index()  # restore the original row order

Output:

       Results  assigned_value
index                         
0           90              83
1           80              83
2           70              72
3           60              65
4           50              40
5           40              40
6           30              36
7           20              22
8           10              12
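
An alternative that is not part of the answer above is NumPy broadcasting: compute the absolute distance from every row to every candidate and take the per-row minimum. This is a sketch under the assumption that the list is small enough for an n_rows x n_candidates array to fit in memory. The names candidates and distances are illustrative, and the data is rebuilt from the question so the snippet is self-contained.

```python
import numpy as np
import pandas as pd

# Data reconstructed from the question
df = pd.DataFrame({"Results": [90, 80, 70, 60, 50, 40, 30, 20, 10]})
comparation_list = [83, 72, 65, 40, 36, 22, 15, 12]

candidates = np.asarray(comparation_list)

# Distance matrix of shape (n_rows, n_candidates) built by broadcasting
distances = np.abs(df["Results"].to_numpy()[:, None] - candidates[None, :])

# argmin gives the column of the nearest candidate for each row;
# on an exact tie it returns the first candidate in list order
df["assigned_value"] = candidates[distances.argmin(axis=1)]
```

Unlike the merge_asof version, this keeps the original index and needs no sorting, at the cost of memory that grows with the product of the two lengths.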
