Why does my mutate function only work for single-digit values?
Question
I want to use the age variable in my data to create an 'age category' variable with different age brackets. This is the code I ran:
New_Data <- New_Data %>%
  mutate(Age_category = case_when(New_Data$Age < 1 ~ "<1",
                                  New_Data$Age >= 1 & New_Data$Age <= 4 ~ "1-4",
                                  New_Data$Age > 4 & New_Data$Age <= 9 ~ "5-9",
                                  New_Data$Age > 9 & New_Data$Age <= 14 ~ "10-14",
                                  New_Data$Age > 14 & New_Data$Age <= 19 ~ "15-19",
                                  New_Data$Age > 19 & New_Data$Age <= 24 ~ "20-24",
                                  New_Data$Age > 24 & New_Data$Age <= 29 ~ "25-29",
                                  New_Data$Age > 29 & New_Data$Age <= 34 ~ "30-34",
                                  New_Data$Age > 34 & New_Data$Age <= 39 ~ "35-39",
                                  New_Data$Age > 39 & New_Data$Age <= 44 ~ "40-44",
                                  New_Data$Age > 44 & New_Data$Age <= 49 ~ "45-49",
                                  New_Data$Age > 49 & New_Data$Age <= 54 ~ "50-54",
                                  New_Data$Age > 54 & New_Data$Age <= 59 ~ "55-59",
                                  New_Data$Age > 59 & New_Data$Age <= 64 ~ "60-64",
                                  New_Data$Age > 64 ~ "65+",
                                  New_Data$Age == "NULL" ~ "NULL"))
I expect the following in the age category column:
"<1", "1-4", "5-9", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49", "50-54", "55-59", "60-64", "65+", "NULL"
Although the age category variable is created successfully, it only contains four distinct age brackets ("5-9", "<1", "1-4", "65+"). I don't understand why the others haven't been created, even though those ages exist in the data.
Answer 1
Score: 1
If you ever find yourself using this many clauses in a case_when(), you should ask yourself if something simpler exists. In your case, it would save a lot of coding and debugging if you use cut() instead:
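library(dplyr)

# Breaks sit halfway between whole-number ages, so this assumes Age is numeric.
# include.lowest = TRUE keeps an age of exactly 0 in "<1" instead of returning NA.
New_Data %>%
  mutate(Age_category = cut(Age,
                            breaks = c(0, 0.5, seq(4.5, 64.5, 5), 100),
                            labels = c("<1", "1-4", "5-9", "10-14", "15-19",
                                       "20-24", "25-29", "30-34", "35-39",
                                       "40-44", "45-49", "50-54", "55-59",
                                       "60-64", "65+"),
                            include.lowest = TRUE))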
#> Age Age_category
#> 1 68 65+
#> 2 39 35-39
#> 3 1 1-4
#> 4 34 30-34
#> 5 87 65+
#> 6 43 40-44
#> 7 14 10-14
#> 8 82 65+
#> 9 59 55-59
#> 10 51 50-54
Created on 2023-02-17 with reprex v2.0.2
Data used
set.seed(1)
New_Data <- data.frame(Age = sample(100, 10))
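A possible explanation for the four brackets in the question, offered as an assumption because the column type is not shown: the `Age == "NULL"` clause suggests Age is stored as character. R then compares strings rather than numbers, so "12" lands in "1-4" and "68" lands in "5-9". The sketch below uses a made-up vector (`ages_chr` is hypothetical, not the asker's data) and base R only. It converts the values to numeric first, then bins them with cut() using `right = FALSE` so the breaks can be the bracket starts themselves.

```r
# Hypothetical input: ages stored as text, with "NULL" marking missing values.
ages_chr <- c("0", "3", "7", "12", "25", "68", "95", "NULL")

# With a character vector, the numbers 1 and 4 are coerced to "1" and "4",
# so this is a string comparison and evaluates to TRUE.
"12" >= 1 & "12" <= 4

# Convert to numeric; "NULL" becomes NA (the coercion warning is silenced).
ages_num <- suppressWarnings(as.numeric(ages_chr))

age_labels <- c("<1", "1-4", "5-9", "10-14", "15-19", "20-24", "25-29",
                "30-34", "35-39", "40-44", "45-49", "50-54", "55-59",
                "60-64", "65+")

# right = FALSE makes each interval [lower, upper); Inf keeps any age
# of 65 or more in "65+".
age_category <- cut(ages_num,
                    breaks = c(0, 1, seq(5, 65, 5), Inf),
                    labels = age_labels,
                    right = FALSE)

data.frame(Age = ages_chr, Age_category = age_category)
```

The same cut() call can go inside mutate() once Age has been converted. Missing ages come out as NA rather than the string "NULL", which is usually easier to handle downstream.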