Why am I getting zeros when computing growth rates by country-year-sector in my data using tidyverse?
Question
I want to compute a growth rate for each country-year-sector in the following dataset:
> sapply(sa1, class)
country year sector sector_share
"factor" "numeric" "factor" "numeric"
> print(sa1)
country year sector sector_share
1 Sub-Saharan Africa 1981 agriculture 15.724457
2 Sub-Saharan Africa 1982 agriculture 16.165780
3 Sub-Saharan Africa 1983 agriculture 15.908671
4 Sub-Saharan Africa 1984 agriculture 17.593971
5 Sub-Saharan Africa 1985 agriculture 19.428871
6 Sub-Saharan Africa 1986 agriculture 19.593291
7 Sub-Saharan Africa 1987 agriculture 19.789807
8 Sub-Saharan Africa 1988 agriculture 20.597277
9 Sub-Saharan Africa 1989 agriculture 19.933259
10 Sub-Saharan Africa 1990 agriculture 19.790467
42 Sub-Saharan Africa 1981 industry 35.516119
43 Sub-Saharan Africa 1982 industry 32.407578
44 Sub-Saharan Africa 1983 industry 32.303477
45 Sub-Saharan Africa 1984 industry 30.437994
46 Sub-Saharan Africa 1985 industry 30.544564
47 Sub-Saharan Africa 1986 industry 29.458658
48 Sub-Saharan Africa 1987 industry 29.490104
49 Sub-Saharan Africa 1988 industry 29.009534
50 Sub-Saharan Africa 1989 industry 29.340000
51 Sub-Saharan Africa 1990 industry 29.698078
52 Sub-Saharan Africa 1991 industry 28.727260
83 Sub-Saharan Africa 1981 manufacturing 18.419694
84 Sub-Saharan Africa 1982 manufacturing 17.895412
85 Sub-Saharan Africa 1983 manufacturing 18.037958
86 Sub-Saharan Africa 1984 manufacturing 16.316419
87 Sub-Saharan Africa 1985 manufacturing 16.256940
88 Sub-Saharan Africa 1986 manufacturing 15.728073
89 Sub-Saharan Africa 1987 manufacturing 15.825253
90 Sub-Saharan Africa 1988 manufacturing 16.320170
91 Sub-Saharan Africa 1989 manufacturing 16.062034
92 Sub-Saharan Africa 1990 manufacturing 16.134401
93 Sub-Saharan Africa 1991 manufacturing 15.826331
124 Sub-Saharan Africa 1981 services 44.946512
125 Sub-Saharan Africa 1982 services 46.323757
126 Sub-Saharan Africa 1983 services 46.071141
127 Sub-Saharan Africa 1984 services 45.820815
128 Sub-Saharan Africa 1985 services 43.226268
129 Sub-Saharan Africa 1986 services 43.409858
130 Sub-Saharan Africa 1987 services 44.298582
131 Sub-Saharan Africa 1988 services 43.191570
132 Sub-Saharan Africa 1989 services 43.023115
133 Sub-Saharan Africa 1990 services 44.043939
134 Sub-Saharan Africa 1991 services 44.995853
I use the following code:
sa1 <- sa1 %>%
  group_by(country, year, sector) %>%
  arrange(year) %>%
  mutate(growth_rate = ifelse(!is.na(lag(sector_share)), (sector_share / lag(sector_share) - 1) * 100, 0))
But I obtain zeros, which should not happen since there are no NAs in the sector_share column.
> print(sa1)
# A tibble: 164 × 5
# Groups: country, year, sector [164]
country year sector sector_share growth_rate
<fct> <dbl> <fct> <dbl> <dbl>
1 Sub-Saharan Africa 1981 agriculture 15.7 0
2 Sub-Saharan Africa 1981 industry 35.5 0
3 Sub-Saharan Africa 1981 manufacturing 18.4 0
4 Sub-Saharan Africa 1981 services 44.9 0
5 Sub-Saharan Africa 1982 agriculture 16.2 0
6 Sub-Saharan Africa 1982 industry 32.4 0
7 Sub-Saharan Africa 1982 manufacturing 17.9 0
8 Sub-Saharan Africa 1982 services 46.3 0
9 Sub-Saharan Africa 1983 agriculture 15.9 0
10 Sub-Saharan Africa 1983 industry 32.3 0
# ℹ 154 more rows
# ℹ Use `print(n = ...)` to see more rows
I tried to compute the growth rate, but I obtain only zeros. This does not make sense to me: my data has no NAs in the sector_share column, and the code checks for them anyway, just in case.
Can someone help me? Thank you!
Answer 1
Score: 0
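Since you’re grouping by year, your computation only “sees” one year at a time, making it impossible to compute growth across multiple years. Each group then holds a single row, so lag(sector_share) is always NA and the ifelse() falls back to 0. So don’t group by year: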
library(dplyr)
sa1 %>%
  group_by(country, sector) %>%
  arrange(year) %>%
  mutate(growth_rate = ifelse(!is.na(lag(sector_share)), (sector_share / lag(sector_share) - 1) * 100, 0))
# A tibble: 43 × 5
# Groups: country, sector [4]
country year sector sector_share growth_rate
<chr> <int> <chr> <dbl> <dbl>
1 Africa 1981 agriculture 15.7 0
2 Africa 1981 industry 35.5 0
3 Africa 1981 manufacturing 18.4 0
4 Africa 1981 services 44.9 0
5 Africa 1982 agriculture 16.2 2.81
6 Africa 1982 industry 32.4 -8.75
7 Africa 1982 manufacturing 17.9 -2.85
8 Africa 1982 services 46.3 3.06
9 Africa 1983 agriculture 15.9 -1.59
10 Africa 1983 industry 32.3 -0.321
# ℹ 33 more rows
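To see why the extra grouping variable matters, here is a minimal sketch on a small made-up data frame (the `toy` data and its values are hypothetical, not the asker's dataset). It first shows that grouping by country, year and sector leaves one row per group, so `lag()` has nothing to look back at, and then computes the growth rate grouped by country and sector only. As a variation on the code above, it uses the `order_by` argument of `dplyr::lag()` instead of `arrange()`, and it leaves the first year of each sector as NA rather than 0, on the assumption that "no previous year" should not read as "no change"; whether that suits your analysis is a judgment call.

```r
library(dplyr)

# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  country = "A",
  year = rep(1981:1983, times = 2),
  sector = rep(c("agriculture", "industry"), each = 3),
  sector_share = c(10, 11, 12.1, 40, 36, 36)
)

# Grouping by year as well: every group has exactly one row,
# so lag() returns NA for every row
by_year <- toy %>%
  group_by(country, year, sector) %>%
  mutate(group_size = n(), previous = lag(sector_share)) %>%
  ungroup()

# Grouping by country and sector only: lag() now looks at the previous
# year within each sector. order_by = year makes the lag follow the year
# even if the rows are not sorted; the first year stays NA.
fixed <- toy %>%
  group_by(country, sector) %>%
  mutate(growth_rate = (sector_share / lag(sector_share, order_by = year) - 1) * 100) %>%
  ungroup()

print(by_year)
print(fixed)
```

If you prefer the original behaviour of reporting 0 for the first year, wrapping the expression in `coalesce(..., 0)` or keeping the `ifelse()` from the answer would do that.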