How could one branch multiple variables under another variable in R?
Question
I'm trying to group my observations by one variable nested under a second variable, which is in turn nested under a third. Here's my example data:
country      name    ethnicity  party
Afghanistan  john    Pashtun    X Party
Afghanistan  oliver  Pashtun    Y Party
Afghanistan  brad    Tajik      X Party
Afghanistan  chad    Hazara     X Party
Bosnia       virgin  Serb       P Party
Bosnia       mary    Serb       P Party
Bosnia       jesus   Croat      C Party
For each party, I want the set of all ethnicities that exist in that party's country, with a count of how many people of each ethnicity belong to the party (zero where there are none). The result should look something like:
country      party    ethnicity  count
Afghanistan  X Party  Pashtun    1
Afghanistan  X Party  Tajik      1
Afghanistan  X Party  Hazara     1
Afghanistan  Y Party  Pashtun    1
Afghanistan  Y Party  Tajik      0
Afghanistan  Y Party  Hazara     0
Bosnia       P Party  Serb       2
Bosnia       P Party  Croat      0
Bosnia       C Party  Serb       0
Bosnia       C Party  Croat      1
So far I've tried the functions group_by and aggregate, to no avail.
Answer 1
Score: 1
This is a really simple operation; please read this book: https://r4ds.had.co.nz/
library(data.table)
library(tidyverse)

# Example data; underscores keep multi-word values in a single column
df_example <- fread("country name ethnicity party coolness
Afghanistan john Pashtun X_Party cool
Afghanistan oliver Pashtun Y_Party not_cool
Afghanistan brad Tajik X_Party cool
Afghanistan chad Hazara X_Party not_cool
Bosnia virgin Serb P_Party cool
Bosnia mary Serb P_Party cool
Bosnia jesus Croat C_Party not_cool",
                    header = TRUE)

# Tally each observed country/ethnicity/party combination.
# Combinations that never occur in the data do not get a zero row.
df_example %>%
  group_by(country, ethnicity, party) %>%
  add_tally() %>%
  select(-name) %>%  # drop columns you don't want
  distinct()
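The pipeline above tallies only the combinations that actually occur in the data. On the question's original four columns it is roughly equivalent to a single count() call. Here is a minimal sketch, assuming the sample data is typed in by hand as df (a name this answer does not use) and that the output column is called count:

```r
library(dplyr)

# Sample data from the question, typed in by hand (the name `df` is an assumption)
df <- data.frame(
  country   = c("Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan",
                "Bosnia", "Bosnia", "Bosnia"),
  name      = c("john", "oliver", "brad", "chad", "virgin", "mary", "jesus"),
  ethnicity = c("Pashtun", "Pashtun", "Tajik", "Hazara", "Serb", "Serb", "Croat"),
  party     = c("X Party", "Y Party", "X Party", "X Party",
                "P Party", "P Party", "C Party")
)

# One row per observed country/party/ethnicity combination, with its size
observed <- df %>%
  count(country, party, ethnicity, name = "count")

print(observed)
```

This gives six rows rather than the ten the question asks for: pairs such as Y Party and Tajik never occur, so they are not counted, and getting the zero rows needs a separate completion step.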
Answer 2
Score: 1
You can use dplyr and tidyr:
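library(dplyr)
library(tidyr)

df %>%
  count(!!!select(., -name)) %>%  # count rows per country/ethnicity/party
  group_by(country) %>%           # complete within each country separately
  complete(ethnicity, nesting(party), fill = list(n = 0))  # missing pairs get n = 0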
   country     ethnicity party       n
   <chr>       <chr>     <fct>   <dbl>
 1 Afghanistan Hazara    X Party     1
 2 Afghanistan Hazara    Y Party     0
 3 Afghanistan Pashtun   X Party     1
 4 Afghanistan Pashtun   Y Party     1
 5 Afghanistan Tajik     X Party     1
 6 Afghanistan Tajik     Y Party     0
 7 Bosnia      Croat     C Party     1
 8 Bosnia      Croat     P Party     0
 9 Bosnia      Serb      C Party     0
10 Bosnia      Serb      P Party     2
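If you'd rather avoid extra packages, the same result can probably be reached in base R by cross-tabulating party against ethnicity within each country. This is only a sketch under assumptions: the data frame df is typed in by hand from the question, and the helper count_within_country and the output column name count are hypothetical names chosen for illustration.

```r
# Sample data from the question, typed in by hand (the name `df` is an assumption)
df <- data.frame(
  country   = c("Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan",
                "Bosnia", "Bosnia", "Bosnia"),
  name      = c("john", "oliver", "brad", "chad", "virgin", "mary", "jesus"),
  ethnicity = c("Pashtun", "Pashtun", "Tajik", "Hazara", "Serb", "Serb", "Croat"),
  party     = c("X Party", "Y Party", "X Party", "X Party",
                "P Party", "P Party", "C Party"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: cross-tabulate party by ethnicity for one country's rows.
# table() includes every party/ethnicity pair seen in that country, so pairs
# that never occur together come out with a count of 0.
count_within_country <- function(d) {
  tab <- table(party = d$party, ethnicity = d$ethnicity)
  out <- as.data.frame(tab, responseName = "count", stringsAsFactors = FALSE)
  data.frame(country = d$country[1], out, stringsAsFactors = FALSE)
}

# Apply the helper per country and stack the pieces
res <- do.call(rbind, lapply(split(df, df$country), count_within_country))
rownames(res) <- NULL

print(res)
```

Splitting by country first is what keeps Afghan ethnicities from being crossed with Bosnian parties; it plays the same role as group_by(country) in the pipeline above.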