Why does my mutate function only work for single-digit values?

Question

I want to use the age variable in my data to create an 'age category' variable with different age brackets. This is the code I ran:

    New_Data <- New_Data %>%
      mutate(Age_category = case_when(New_Data$Age < 1 ~ "<1",
                                      New_Data$Age >= 1 & New_Data$Age <= 4 ~ "1-4",
                                      New_Data$Age > 4  & New_Data$Age <= 9 ~ "5-9",
                                      New_Data$Age > 9  & New_Data$Age <= 14 ~ "10-14",
                                      New_Data$Age > 14 & New_Data$Age <= 19 ~ "15-19",
                                      New_Data$Age > 19 & New_Data$Age <= 24 ~ "20-24",
                                      New_Data$Age > 24 & New_Data$Age <= 29 ~ "25-29",
                                      New_Data$Age > 29 & New_Data$Age <= 34 ~ "30-34",
                                      New_Data$Age > 34 & New_Data$Age <= 39 ~ "35-39",
                                      New_Data$Age > 39 & New_Data$Age <= 44 ~ "40-44",
                                      New_Data$Age > 44 & New_Data$Age <= 49 ~ "45-49",
                                      New_Data$Age > 49 & New_Data$Age <= 54 ~ "50-54",
                                      New_Data$Age > 54 & New_Data$Age <= 59 ~ "55-59",
                                      New_Data$Age > 59 & New_Data$Age <= 64 ~ "60-64",
                                      New_Data$Age > 64 ~ "65+",
                                      New_Data$Age == "NULL" ~ "NULL"))

I expect the following in the age category column:

    "<1", "1-4", "5-9", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49", "50-54", "55-59", "60-64", "65+", "NULL"

Although the age category variable is created successfully, it only contains four distinct age brackets ("5-9", "<1", "1-4", "65+"). I don't understand why the others haven't been created, even though those ages exist in the data.

Answer 1

Score: 1

If you ever find yourself using this many clauses in a case_when, you should ask yourself if something simpler exists. In your case, it would save a lot of coding and debugging if you use cut instead:

New_Data %>%
  mutate(Age_category = cut(Age, 
                            breaks = c(0, 0.5, seq(4.5, 64.5, 5), 100), 
                            labels = c("<1", "1-4", "5-9", "10-14", "15-19",
                                       "20-24", "25-29", "30-34", "35-39",
                                       "40-44", "45-49",  "50-54", "55-59",
                                       "60-64", "65+")))
#>    Age Age_category
#> 1   68          65+
#> 2   39        35-39
#> 3    1          1-4
#> 4   34        30-34
#> 5   87          65+
#> 6   43        40-44
#> 7   14        10-14
#> 8   82          65+
#> 9   59        55-59
#> 10  51        50-54

<sup>Created on 2023-02-17 with reprex v2.0.2</sup>
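One caveat with the breaks above: cut uses right-closed intervals by default and does not include the lowest break, so an age of exactly 0 or anything above 100 would come back as NA. A minimal sketch of a variant with left-closed intervals and an open upper end, using base R only; the `ages` vector is made-up test data, not from the original post:

```r
# Hypothetical test values, chosen to sit on the bracket boundaries.
ages <- c(0, 0.5, 1, 4, 5, 9, 10, 64, 65, 101, NA)

labels <- c("<1", "1-4", "5-9", "10-14", "15-19", "20-24", "25-29",
            "30-34", "35-39", "40-44", "45-49", "50-54", "55-59",
            "60-64", "65+")

# 16 breaks give 15 intervals, one per label:
# [0, 1), [1, 5), [5, 10), ..., [60, 65), [65, Inf)
breaks <- c(0, 1, seq(5, 65, by = 5), Inf)

# right = FALSE makes each interval closed on the left and open on the right.
age_category <- cut(ages, breaks = breaks, labels = labels, right = FALSE)

print(data.frame(Age = ages, Age_category = age_category))
```

Missing ages stay NA here. Whether that or a separate "NULL" label is wanted depends on the data.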


Data used

set.seed(1)
New_Data <- data.frame(Age = sample(100, 10))
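As for why the original case_when only produced "<1", "1-4", "5-9" and "65+": one plausible cause, assuming Age is stored as character (the comparison with "NULL" in the question hints at this, but the post does not confirm it), is that R compares text with a number by turning the number into a string. The comparison is then done character by character, so "68" counts as greater than "4" and less than "9". A small sketch with made-up values (`age_chr` is hypothetical):

```r
# Hypothetical data: ages stored as text, with "NULL" marking a missing value.
age_chr <- c("0", "3", "7", "12", "39", "68", "95", "NULL")

# Character vs. number: 4 and 9 are coerced to "4" and "9", so this is a
# string comparison. It is TRUE for "7", but also for "68".
as_text <- age_chr > 4 & age_chr <= 9
print(as_text)

# Converting to numeric first gives the intended comparison. "NULL" becomes
# NA; the coercion warning is silenced here on purpose.
age_num <- suppressWarnings(as.numeric(age_chr))
as_number <- age_num > 4 & age_num <= 9
print(as_number)
```

If that is the situation, converting the column once with as.numeric before calling case_when or cut should make every bracket appear.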

huangapple
  • Posted on 2023-02-18 01:52:20
  • Source: https://go.coder-hub.com/75487695.html