2023-02-14 05:51:37

Pandas left join with duplicates

Question

I have a pandas DataFrame like

A = pd.DataFrame({'Name': ['A', 'A', 'A'], 'Value': [1, 2, 3]})

and another DataFrame

B = pd.DataFrame({'Name': ['A', 'C', 'D', 'A', 'E', 'A'], 'Value1': [1, 2, 3, 4, 5, 6]})

When I merge these, I get

In [4]: A.merge(B, how='left', on='Name')
Out[4]: 
  Name  Value  Value1
0    A      1       1
1    A      1       4
2    A      1       6
3    A      2       1
4    A      2       4
5    A      2       6
6    A      3       1
7    A      3       4
8    A      3       6

Is there any way to do this merge so that the first row with 'A' in A matches only the first row with 'A' in B, the second matches the second, and the third matches the third?
The final output should look like

  Name  Value  Value1
0    A      1       1
1    A      2       4
2    A      3       6

Thanks.

I tried a left merge. I wasn't expecting anything different, but I am looking for a better way to do this.

An inner join doesn't help either:

A.merge(B, how='inner', on='Name')
  Name  Value  Value1
0    A      1       1
1    A      1       4
2    A      1       6
3    A      2       1
4    A      2       4
5    A      2       6
6    A      3       1
7    A      3       4
8    A      3       6

Answer 1

Score: 7

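Number the duplicates with groupby.cumcount and pass that counter to merge as a secondary key. cumcount numbers the rows within each Name group 0, 1, 2, ..., so each (Name, counter) pair is unique and the n-th 'A' in A matches only the n-th 'A' in B:

A.merge(B, how='left',
        left_on=['Name', A.groupby('Name').cumcount()],
        right_on=['Name', B.groupby('Name').cumcount()]
       )  # .drop(columns='key_1') would remove the helper key column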

Output:

  Name  key_1  Value  Value1
0    A      0      1       1
1    A      1      2       4
2    A      2      3       6
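
The same idea can be sketched with an explicit helper column, which avoids the auto-generated key_1 column in the result. This is only an illustration built on the question's data; the column name occurrence is an assumption made for this example and does not appear in the original answer.

```python
import pandas as pd

A = pd.DataFrame({'Name': ['A', 'A', 'A'], 'Value': [1, 2, 3]})
B = pd.DataFrame({'Name': ['A', 'C', 'D', 'A', 'E', 'A'],
                  'Value1': [1, 2, 3, 4, 5, 6]})

# 'occurrence' is a hypothetical helper column: the running count of each
# Name within its own frame (0 for the first 'A', 1 for the second, ...).
A_keyed = A.assign(occurrence=A.groupby('Name').cumcount())
B_keyed = B.assign(occurrence=B.groupby('Name').cumcount())

# Joining on (Name, occurrence) pairs the n-th 'A' in A with the n-th 'A' in B.
result = (
    A_keyed.merge(B_keyed, how='left', on=['Name', 'occurrence'])
           .drop(columns='occurrence')
)
print(result)
```

Because the helper column is dropped after the merge, the result keeps only Name, Value and Value1. If A had more rows for a name than B does, the extra rows would still appear (it is a left join) with a missing Value1.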

Answer 2

Score: 0

What you are asking for is not actually a join.

However, you can do something like this, which lines the rows up by position and works here because every row in A has Name "A":

pd.concat([A, B[B.Name == "A"].reset_index().Value1], axis=1)
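
As a rough, self-contained sketch of this positional approach on the question's data (the variable name b_values is introduced here only for illustration):

```python
import pandas as pd

A = pd.DataFrame({'Name': ['A', 'A', 'A'], 'Value': [1, 2, 3]})
B = pd.DataFrame({'Name': ['A', 'C', 'D', 'A', 'E', 'A'],
                  'Value1': [1, 2, 3, 4, 5, 6]})

# Keep only B's 'A' rows and renumber them 0..n-1 so that their index
# lines up with A's default index. drop=True discards the old index.
b_values = B.loc[B['Name'] == 'A', 'Value1'].reset_index(drop=True)

# axis=1 places the Series next to A as a new column, aligned on the index.
result = pd.concat([A, b_values], axis=1)
print(result)
```

This relies on A containing a single name and having a default 0..n-1 index; with several names in A, the cumcount-based merge is the more general option.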
