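Question

I have a data frame like this:

| ID | col1 | col2 |
|:---|:----:|-----:|
| AB | 1 | 3 |
| AB | 1 | 3 |
| CD | 2 | 4 |
| CD | 2 | 3 |

I would like to compare the rows within each ID.
For each column whose values differ within an ID, add a new column holding the mismatched values.

Expected output:

| ID | col1 | col2 | mismatch_extract_col1 | mismatch_extract_col2 |
|:---|:----:|:----:|:----:|-----:|
| AB | 1 | 3 | NA | NA |
| AB | 1 | 3 | NA | NA |
| CD | 2 | 4 | NA | 4:3 |
| CD | 2 | 3 | NA | 4:3 |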


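Answer 1

Score: 2

You can use `n_distinct() == 1` to check, for each column within each ID group, whether all values are identical. If they are, there is no mismatch and the new column gets NA; otherwise `toString()` joins the group's values into one string.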

```r
library(dplyr)
df %>%
  mutate(across(col1:col2, ~ if_else(n_distinct(.x) == 1, NA, toString(.x)),
                .names = "mismatch_extract_{.col}"),
         .by = ID)
# # A tibble: 4 × 5
#   ID     col1  col2 mismatch_extract_col1 mismatch_extract_col2
#   <chr> <int> <int> <lgl>                 <chr>                
# 1 AB        1     3 NA                    NA                   
# 2 AB        1     3 NA                    NA                   
# 3 CD        2     4 NA                    4, 3                 
# 4 CD        2     3 NA                    4, 3
```
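
The question's expected output joins the values with a colon (`4:3`), while `toString()` joins them with a comma and a space. Below is a minimal sketch of one way to get the colon form. It assumes dplyr 1.1.0 or later (needed for `.by`). The sample `df` is rebuilt from the question's table with integer columns assumed, and `collapse_mismatch` is a hypothetical helper name, not part of dplyr.

```r
library(dplyr)

# Sample data rebuilt from the question's table (integer columns are an assumption).
df <- tibble(
  ID   = c("AB", "AB", "CD", "CD"),
  col1 = c(1L, 1L, 2L, 2L),
  col2 = c(3L, 3L, 4L, 3L)
)

# Hypothetical helper: returns NA when all values in the group are identical,
# otherwise the distinct values joined by `sep`, in order of first appearance.
collapse_mismatch <- function(x, sep = ":") {
  if (n_distinct(x) == 1) NA_character_ else paste(unique(x), collapse = sep)
}

result <- df %>%
  mutate(across(col1:col2, collapse_mismatch,
                .names = "mismatch_extract_{.col}"),
         .by = ID)

result$mismatch_extract_col2
# [1] NA    NA    "4:3" "4:3"
```

Returning `NA_character_` keeps both new columns as character regardless of which groups contain mismatches. Using `unique()` is a design choice: a group with values 4, 3, 3 would yield `4:3` rather than `4:3:3`. Replace `unique(x)` with `x` if every row's value should appear.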

