May 17, 2023, 14:53:25

Subtracting values of a shared variable between two data frames with unequal size if their categorical variables are the same

Question

I want to compare the shared column, value, in two data frames for the years 2020 and 2019. The 2020 data has more rows because a new country was added to the table.
I wrote the function below, but it did not produce any result. I would appreciate any help with this.

dat2020 <- tribble(
  ~Country, ~Gender, ~Indicator, ~value,
  "Bangladesh", "Male", "A", 3.7,
  "Bangladesh", "Female", "A", 2.6,
  "Banggladesh", "Male", "B", 6.8,
  "Bangladesh", "Female", "B", 4.1,
  "China", "Male", "A", 7.6,
  "China", "Female", "A", 3.9,
  "China", "Male", "B", 1.5,
  "China", "Female", "B", 2.9,
  "Laos", "Male", "A", 7.6,
  "Laos", "Female", "A", 5.1,
  "Laos", "Male", "B", 3.8,
  "Laos", "Female", "B", 2.8,
)
dat2019 <- tribble(
  ~Country, ~Gender, ~Indicator, ~value,
  "Bangladesh", "Male", "A", 3.6,
  "Bangladesh", "Female", "A", 6.8,
  "Bangladesh", "Male", "B", 9.2,
  "Bangladesh", "Female", "B", 1.5,
  "China", "Male", "A", 8.5,
  "Chiina", "Female", "A", 3.9,
  "China", "Male", "B", 4.6,
  "China", "Female", "B", 5.3,
)
CheckList <- c()
checkValue <- function(data1, data2){
  if(data1$Country == data2$Country & data1$Gender == data2$Gender & data1$Indicator == data2$Indicator){
    CheckList$Diff = data1$value - data2$value
  }
  else{
    CheckList$Diff = NA
  }
}
checkValue(data1 = dat2019, data2 = dat2020)

My desired outcome: if a row in dat2019 has exactly the same categorical variables as a row in dat2020, return the difference between their values in 2020 and 2019.

Note that there are intentional typos in dat2020 (Banggladesh) and dat2019 (Chiina).

Answer 1

Score: 1

Do a join and subtract:

library(dplyr)
left_join(dat2020, dat2019, by = names(dat2020)[1:3]) %>%
  mutate(Diff = value.x - value.y, value = value.x, .keep = "unused")
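
For readers who want something they can run end to end, here is a minimal sketch of the same join-and-subtract idea using the question's data. It assumes the dplyr and tibble packages are installed. It swaps `left_join()` for `inner_join()` so that only rows whose three categorical columns match exactly in both years are kept; that choice is one reading of the desired outcome, not part of the answer above. The names `keys` and `diffs` and the `_2020`/`_2019` suffixes are illustrative.

```R
library(dplyr)
library(tibble)

# Data copied from the question, including the intentional typos
dat2020 <- tribble(
  ~Country, ~Gender, ~Indicator, ~value,
  "Bangladesh", "Male", "A", 3.7,
  "Bangladesh", "Female", "A", 2.6,
  "Banggladesh", "Male", "B", 6.8,
  "Bangladesh", "Female", "B", 4.1,
  "China", "Male", "A", 7.6,
  "China", "Female", "A", 3.9,
  "China", "Male", "B", 1.5,
  "China", "Female", "B", 2.9,
  "Laos", "Male", "A", 7.6,
  "Laos", "Female", "A", 5.1,
  "Laos", "Male", "B", 3.8,
  "Laos", "Female", "B", 2.8
)
dat2019 <- tribble(
  ~Country, ~Gender, ~Indicator, ~value,
  "Bangladesh", "Male", "A", 3.6,
  "Bangladesh", "Female", "A", 6.8,
  "Bangladesh", "Male", "B", 9.2,
  "Bangladesh", "Female", "B", 1.5,
  "China", "Male", "A", 8.5,
  "Chiina", "Female", "A", 3.9,
  "China", "Male", "B", 4.6,
  "China", "Female", "B", 5.3
)

# Columns that must match exactly for two rows to be compared
keys <- c("Country", "Gender", "Indicator")

# Keep only rows present in both years, then subtract 2019 from 2020
diffs <- inner_join(dat2020, dat2019, by = keys, suffix = c("_2020", "_2019")) %>%
  mutate(Diff = value_2020 - value_2019)

print(diffs)
```

Rows with a misspelled country, and the Laos rows that exist only in 2020, drop out of `diffs` because their keys have no exact partner in the other table.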

# Answer 2
**Score**: 0
I would solve this via a `join` operation:
```R
library(dplyr)

dat2019 %>%
  dplyr::rename(value2019 = value) %>%
  dplyr::left_join(dplyr::rename(dat2020, value2020 = value)) %>%
  dplyr::mutate(diff = value2020 - value2019)

#> Joining, by = c("Country", "Gender", "Indicator")
#> # A tibble: 8 x 6
#>   Country    Gender Indicator value2019 value2020  diff
#>   <chr>      <chr>  <chr>         <dbl>     <dbl> <dbl>
#> 1 Bangladesh Male   A               3.6       3.7   0.1
#> 2 Bangladesh Female A               6.8       2.6 -4.20
#> 3 Bangladesh Male   B               9.2        NA    NA
#> 4 Bangladesh Female B               1.5       4.1  2.60
#> 5 China      Male   A               8.5       7.6  -0.9
#> 6 Chiina     Female A               3.9        NA    NA
#> 7 China      Male   B               4.6       1.5 -3.10
#> 8 China      Female B               5.3       2.9  -2.4
```
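
The `NA` rows in a left join come from keys that have no exact match in the other table, which is what the intentional typos produce. As a rough sketch (assuming dplyr and tibble are installed, and using a small hand-picked subset of the question's data rather than the full tables), `anti_join()` can list those unmatched rows in each direction. The names `keys`, `only_2019`, and `only_2020` are illustrative.

```R
library(dplyr)
library(tibble)

# Small subset of the question's data, enough to show unmatched keys on both sides
dat2020 <- tribble(
  ~Country, ~Gender, ~Indicator, ~value,
  "Bangladesh", "Male", "A", 3.7,
  "Banggladesh", "Male", "B", 6.8,
  "China", "Female", "A", 3.9,
  "Laos", "Male", "A", 7.6
)
dat2019 <- tribble(
  ~Country, ~Gender, ~Indicator, ~value,
  "Bangladesh", "Male", "A", 3.6,
  "Bangladesh", "Male", "B", 9.2,
  "Chiina", "Female", "A", 3.9
)

keys <- c("Country", "Gender", "Indicator")

# Rows of dat2019 whose key combination does not appear in dat2020
only_2019 <- anti_join(dat2019, dat2020, by = keys)

# Rows of dat2020 whose key combination does not appear in dat2019
only_2020 <- anti_join(dat2020, dat2019, by = keys)

print(only_2019)
print(only_2020)
```

Inspecting these two tables before subtracting makes it easier to tell a genuinely new country such as Laos apart from a spelling mismatch.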
