英文:
Conditional fractioning
问题
ID | Country | Sales | fraction |
---|---|---|---|
1 | 奥地利 | 6 | 0.666 |
1 | 奥地利 | 6 | 0.666 |
1 | 比利时 | 6 | 0.333 |
2 | 比利时 | 10 | 0.5 |
2 | 捷克 | 10 | 0.5 |
3 | 丹麦 | 3 | 1 |
3 | 德国 | 3 | 1 |
英文:
Suppose I have the dataset
ID | Country | Sales |
---|---|---|
1 | Austria | 6 |
1 | Austria | 6 |
1 | Belgium | 6 |
2 | Belgium | 10 |
2 | Czech | 10 |
3 | Denmark | 3 |
3 | Germany | 3 |
I want to another variable of sales which depends on countries and their ID ie in fractions.
ID | Country | Sales | fraction |
---|---|---|---|
1 | Austria | 6 | 0.666 |
1 | Austria | 6 | 0.666 |
1 | Belgium | 6 | 0.333 |
2 | Belgium | 10 | 0.5 |
2 | Czech | 10 | 0.5 |
3 | Denmark | 3 | 1 |
3 | Denmark | 3 | 1 |
Any help would be appreciated!
答案1
得分: 1
library(dplyr)
your_data |>
mutate(country_total = sum(Sales), .by = c(ID, Country)) |>
mutate(fraction = country_total / sum(Sales), .by = ID)
# ID Country Sales country_total fraction
# 1 1 Austria 6 12 0.6666667
# 2 1 Austria 6 12 0.6666667
# 3 1 Belgium 6 6 0.3333333
# 4 2 Belgium 10 10 0.5000000
# 5 2 Czech 10 10 0.5000000
# 6 3 Denmark 3 3 0.5000000
# 7 3 Germany 3 3 0.5000000
使用此示例数据:
your_data = read.table(text = 'ID Country Sales
1 Austria 6
1 Austria 6
1 Belgium 6
2 Belgium 10
2 Czech 10
3 Denmark 3
3 Germany 3', header = T)
英文:
library(dplyr)
your_data |>
mutate(country_total = sum(Sales), .by = c(ID, Country)) |>
mutate(fraction = country_total / sum(Sales), .by = ID)
# ID Country Sales country_total fraction
# 1 1 Austria 6 12 0.6666667
# 2 1 Austria 6 12 0.6666667
# 3 1 Belgium 6 6 0.3333333
# 4 2 Belgium 10 10 0.5000000
# 5 2 Czech 10 10 0.5000000
# 6 3 Denmark 3 3 0.5000000
# 7 3 Germany 3 3 0.5000000
Using this sample data:
your_data = read.table(text = 'ID Country Sales
1 Austria 6
1 Austria 6
1 Belgium 6
2 Belgium 10
2 Czech 10
3 Denmark 3
3 Germany 3', header = T)
答案2
得分: 1
Here is the translated content:
"或者,我们可以使用 add_count
来获得相同的结果
library(dplyr)
df %>% add_count(ID,name = 'n') %>% add_count(ID,Country, name = 'gn') %>%
mutate(new=gn/n) %>% select(-c(n,gn))
# 输出
# A tibble: 7 × 4
ID Country Sales new
<dbl> <chr> <dbl> <dbl>
1 1 Austria 6 0.667
2 1 Austria 6 0.667
3 1 Belgium 6 0.333
4 2 Belgium 10 0.5
5 2 Czech 10 0.5
6 3 Denmark 3 0.5
7 3 Germany 3 0.5
```"
<details>
<summary>英文:</summary>
Alternatively we could use `add_count` to get the same result
````r
library(dplyr)
df %>% add_count(ID,name = 'n') %>% add_count(ID,Country, name = 'gn') %>%
mutate(new=gn/n) %>% select(-c(n,gn))
# output
# A tibble: 7 × 4
ID Country Sales new
<dbl> <chr> <dbl> <dbl>
1 1 Austria 6 0.667
2 1 Austria 6 0.667
3 1 Belgium 6 0.333
4 2 Belgium 10 0.5
5 2 Czech 10 0.5
6 3 Denmark 3 0.5
7 3 Germany 3 0.5
答案3
得分: 0
ID Country Sales fraction
1 1 Austria 6 0.6666667
2 1 Austria 6 0.6666667
3 1 Belgium 6 0.3333333
4 2 Belgium 10 0.5000000
5 2 Czech 10 0.5000000
6 3 Denmark 3 0.5000000
7 3 Germany 3 0.5000000
英文:
Base
> df$fraction=ave(df$Sales,list(df$ID,df$Country),FUN=sum)/ave(df$Sales,df$ID,FUN=sum)
ID Country Sales fraction
1 1 Austria 6 0.6666667
2 1 Austria 6 0.6666667
3 1 Belgium 6 0.3333333
4 2 Belgium 10 0.5000000
5 2 Czech 10 0.5000000
6 3 Denmark 3 0.5000000
7 3 Germany 3 0.5000000
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论