How could one branch multiple variables under another variable in R?

Question

I'm trying to group my observations by a set of variables under another set of variables which is, finally, under a last set of variables. Here's what I have for example:

     country      name     ethnicity   party

     Afghanistan  john     Pashtun     X Party
     Afghanistan  oliver   Pashtun     Y Party
     Afghanistan  brad     Tajik       X Party
     Afghanistan  chad     Hazara      X Party
     Bosnia       virgin   Serb        P Party
     Bosnia       mary     Serb        P Party
     Bosnia       jesus    Croat       C Party

What I'm going for should create the set of all existing ethnicities under each party and count how many persons are under each ethnicity in a party, within a country and look something like:

     country      party     ethnicity   count

     Afghanistan  X Party   Pashtun     1
     Afghanistan  X Party   Tajik       1
     Afghanistan  X Party   Hazara      1
     Afghanistan  Y Party   Pashtun     1
     Afghanistan  Y Party   Tajik       0
     Afghanistan  Y Party   Hazara      0
     Bosnia       P Party   Serb        2
     Bosnia       P Party   Croat       0
     Bosnia       C Party   Serb        0
     Bosnia       C Party   Croat       1

So far I've tried the functions group_by and aggregate to no avail.

Answer 1

Score: 1

This is a really simple operation; please read this book: https://r4ds.had.co.nz/

library(data.table)  # fread() reads the inline table below
library(tidyverse)   # dplyr verbs and the %>% pipe

# Party names use underscores so that fread() can split columns on whitespace
df_example <- fread("country      name     ethnicity   party coolness
Afghanistan  john     Pashtun     X_Party     cool
Afghanistan  oliver   Pashtun     Y_Party     not_cool
Afghanistan  brad     Tajik       X_Party     cool
Afghanistan  chad     Hazara      X_Party     not_cool
Bosnia       virgin   Serb        P_Party     cool
Bosnia       mary     Serb        P_Party     cool
Bosnia       jesus    Croat       C_Party     not_cool",
                    header = TRUE)

df_example %>%
  group_by(country, ethnicity, party) %>%
  add_tally() %>%      # adds column n: number of rows in each group
  select(-name) %>%    # Some stuff that you don't want
  distinct()           # keep one row per remaining combination
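
For comparison, the group_by / add_tally / select / distinct pipeline can probably be condensed with dplyr's count(). The following is only a sketch: it assumes the question's original data (without the extra coolness column, and with spaces in the party names), and the names df and observed_counts are illustrative. Note that count() only returns combinations that actually occur in the data, so rows with a count of 0 are not produced at this stage.

```r
library(dplyr)

# Example data copied from the question (df is an illustrative name).
df <- data.frame(
  country   = c(rep("Afghanistan", 4), rep("Bosnia", 3)),
  name      = c("john", "oliver", "brad", "chad", "virgin", "mary", "jesus"),
  ethnicity = c("Pashtun", "Pashtun", "Tajik", "Hazara", "Serb", "Serb", "Croat"),
  party     = c("X Party", "Y Party", "X Party", "X Party",
                "P Party", "P Party", "C Party"),
  stringsAsFactors = FALSE
)

# count() groups by the listed columns and returns one row per
# observed combination, with the group size stored in column n.
observed_counts <- df %>%
  count(country, party, ethnicity)

print(observed_counts)
```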

Answer 2

Score: 1

You can use dplyr and tidyr:

df %>%
  count(!!!select(., -name)) %>%
  group_by(country) %>%
  complete(ethnicity, nesting(party), fill = list(n = 0))

   country     ethnicity party       n
   <chr>       <chr>     <fct>   <dbl>
 1 Afghanistan Hazara    X Party     1
 2 Afghanistan Hazara    Y Party     0
 3 Afghanistan Pashtun   X Party     1
 4 Afghanistan Pashtun   Y Party     1
 5 Afghanistan Tajik     X Party     1
 6 Afghanistan Tajik     Y Party     0
 7 Bosnia      Croat     C Party     1
 8 Bosnia      Croat     P Party     0
 9 Bosnia      Serb      C Party     0
10 Bosnia      Serb      P Party     2
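
The snippet above does not define df, so here is a self-contained sketch of the same idea using the question's data. It is a variation, not the answerer's exact code: it names the counting columns explicitly instead of splicing them with !!!, and the names df and full_counts are illustrative. Because the data are grouped by country before complete() is called, tidyr fills in missing party/ethnicity combinations within each country only, which is what the desired output asks for.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question (df is an illustrative name).
df <- data.frame(
  country   = c(rep("Afghanistan", 4), rep("Bosnia", 3)),
  name      = c("john", "oliver", "brad", "chad", "virgin", "mary", "jesus"),
  ethnicity = c("Pashtun", "Pashtun", "Tajik", "Hazara", "Serb", "Serb", "Croat"),
  party     = c("X Party", "Y Party", "X Party", "X Party",
                "P Party", "P Party", "C Party"),
  stringsAsFactors = FALSE
)

full_counts <- df %>%
  # One row per observed country/party/ethnicity combination, size in n.
  count(country, party, ethnicity) %>%
  # Grouping first makes complete() work within each country.
  group_by(country) %>%
  # Add the party/ethnicity pairs that never occur, with n set to 0.
  complete(party, ethnicity, fill = list(n = 0L)) %>%
  ungroup()

print(full_counts)
```

With this data the result should have one row for every party/ethnicity pair inside each country, including the pairs nobody belongs to.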

huangapple
  • Posted on January 3, 2020, 21:54:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59579796.html