January 3, 2020, 21:54:19

How could one branch multiple variables under another variable in R?

Question

I'm trying to group my observations by a set of variables under another set of variables which is, finally, under a last set of variables. Here's what I have for example:

     country      name     ethnicity   party
     Afghanistan  john     Pashtun     X Party
     Afghanistan  oliver   Pashtun     Y Party
     Afghanistan  brad     Tajik       X Party
     Afghanistan  chad     Hazara      X Party
     Bosnia       virgin   Serb        P Party
     Bosnia       mary     Serb        P Party
     Bosnia       jesus    Croat       C Party

What I want is, within each country, the set of all existing ethnicities under each party, with a count of how many people of each ethnicity belong to that party. The result should look something like:

     country      party     ethnicity   count
     Afghanistan  X Party   Pashtun     1
     Afghanistan  X Party   Tajik       1
     Afghanistan  X Party   Hazara      1
     Afghanistan  Y Party   Pashtun     1
     Afghanistan  Y Party   Tajik       0
     Afghanistan  Y Party   Hazara      0
     Bosnia       P Party   Serb        2
     Bosnia       P Party   Croat       0
     Bosnia       C Party   Serb        0
     Bosnia       C Party   Croat       1

So far I've tried the functions group_by and aggregate to no avail.

Answer 1

Score: 1

This is a really simple operation; please read this book: https://r4ds.had.co.nz/

library(data.table)
library(tidyverse)

# Example data (party names written with underscores, plus an extra coolness column)
df_example <- fread("country      name     ethnicity   party coolness
Afghanistan  john     Pashtun     X_Party     cool
Afghanistan  oliver   Pashtun     Y_Party     not_cool
Afghanistan  brad     Tajik       X_Party     cool
Afghanistan  chad     Hazara      X_Party     not_cool
Bosnia       virgin   Serb        P_Party     cool
Bosnia       mary     Serb        P_Party     cool
Bosnia       jesus    Croat       C_Party     not_cool",
                    header = TRUE)

df_example %>%
  group_by(country, ethnicity, party) %>%
  add_tally() %>%      # adds column n: number of rows in each group
  select(-name) %>%    # drop columns you don't want
  distinct()           # keep one row per remaining combination
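
The pipeline above lists only the combinations that actually occur in the data, so rows with a count of 0 (such as Y Party / Tajik) do not appear. As a rough sketch of one way to get those rows without extra packages, the data can be tabulated separately for each country in base R. The data frame `df`, the object names `per_country` and `result_base`, and the column name `count` are assumptions made for illustration; they are not part of the original answer.

```r
# Example data retyped from the question (assumed to be stored as `df`)
df <- data.frame(
  country   = c("Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan",
                "Bosnia", "Bosnia", "Bosnia"),
  name      = c("john", "oliver", "brad", "chad", "virgin", "mary", "jesus"),
  ethnicity = c("Pashtun", "Pashtun", "Tajik", "Hazara", "Serb", "Serb", "Croat"),
  party     = c("X Party", "Y Party", "X Party", "X Party",
                "P Party", "P Party", "C Party"),
  stringsAsFactors = FALSE
)

# Cross-tabulate party by ethnicity within each country, so that only the
# parties and ethnicities present in that country are combined.
per_country <- lapply(split(df, df$country), function(d) {
  counts <- as.data.frame(
    table(party = d$party, ethnicity = d$ethnicity),
    responseName = "count",
    stringsAsFactors = FALSE
  )
  cbind(country = d$country[1], counts)
})

# Stack the per-country tables into one data frame
result_base <- do.call(rbind, per_country)
rownames(result_base) <- NULL
result_base
```

Because `table()` builds its levels from the values it is given, splitting by country first keeps Afghan parties from being crossed with Bosnian ethnicities.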

Answer 2

Score: 1

You can use dplyr and tidyr:

df %>%
  count(!!!select(., -name)) %>%
  group_by(country) %>%
  complete(ethnicity, nesting(party), fill = list(n = 0))

   country     ethnicity party       n
   <chr>       <chr>     <fct>   <dbl>
 1 Afghanistan Hazara    X Party     1
 2 Afghanistan Hazara    Y Party     0
 3 Afghanistan Pashtun   X Party     1
 4 Afghanistan Pashtun   Y Party     1
 5 Afghanistan Tajik     X Party     1
 6 Afghanistan Tajik     Y Party     0
 7 Bosnia      Croat     C Party     1
 8 Bosnia      Croat     P Party     0
 9 Bosnia      Serb      C Party     0
10 Bosnia      Serb      P Party     2
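
The answer above uses a data frame `df` that is never defined in the thread. A self-contained sketch of the same count-then-complete idea might look like the following. It assumes `df` is built from the question's table, names the grouping columns explicitly instead of splicing them with `!!!select(., -name)`, and calls the count column `count` to match the desired output; those choices are illustrative rather than the answerer's own.

```r
library(dplyr)
library(tidyr)

# Example data retyped from the question (assumed to be stored as `df`)
df <- data.frame(
  country   = c("Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan",
                "Bosnia", "Bosnia", "Bosnia"),
  name      = c("john", "oliver", "brad", "chad", "virgin", "mary", "jesus"),
  ethnicity = c("Pashtun", "Pashtun", "Tajik", "Hazara", "Serb", "Serb", "Croat"),
  party     = c("X Party", "Y Party", "X Party", "X Party",
                "P Party", "P Party", "C Party"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # one row per observed country / party / ethnicity combination
  count(country, party, ethnicity, name = "count") %>%
  # within each country, add the party / ethnicity pairs that never occur
  group_by(country) %>%
  complete(party, ethnicity, fill = list(count = 0L)) %>%
  ungroup()

result
```

Grouping by `country` before `complete()` is what keeps the expansion inside each country; without it, every party would be paired with every ethnicity across both countries.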
