Why does my mutate function only work for single digit values

Question

I want to use the age variable in my data to create an 'age category' variable with different age brackets. This is the code I ran:

    New_Data <- New_Data %>%
      mutate(Age_category = case_when(New_Data$Age < 1 ~ "<1",
                                      New_Data$Age >= 1 & New_Data$Age <= 4 ~ "1-4",
                                      New_Data$Age > 4 & New_Data$Age <= 9 ~ "5-9",
                                      New_Data$Age > 9 & New_Data$Age <= 14 ~ "10-14",
                                      New_Data$Age > 14 & New_Data$Age <= 19 ~ "15-19",
                                      New_Data$Age > 19 & New_Data$Age <= 24 ~ "20-24",
                                      New_Data$Age > 24 & New_Data$Age <= 29 ~ "25-29",
                                      New_Data$Age > 29 & New_Data$Age <= 34 ~ "30-34",
                                      New_Data$Age > 34 & New_Data$Age <= 39 ~ "35-39",
                                      New_Data$Age > 39 & New_Data$Age <= 44 ~ "40-44",
                                      New_Data$Age > 44 & New_Data$Age <= 49 ~ "45-49",
                                      New_Data$Age > 49 & New_Data$Age <= 54 ~ "50-54",
                                      New_Data$Age > 54 & New_Data$Age <= 59 ~ "55-59",
                                      New_Data$Age > 59 & New_Data$Age <= 64 ~ "60-64",
                                      New_Data$Age > 64 ~ "65+",
                                      New_Data$Age == "NULL" ~ "NULL"))

I expected the following values in the age category column:

    "<1", "1-4", "5-9", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49", "50-54", "55-59",
    "60-64", "65+", "NULL"

Although the age category variable is created successfully, it contains only four distinct age brackets ("5-9", "<1", "1-4", "65+"). I don't understand why the others haven't been created, even though those ages exist in the data.

Answer 1

Score: 1

If you ever find yourself using this many clauses in a case_when, you should ask yourself whether something simpler exists. In your case, using cut instead would save a lot of coding and debugging:

    library(dplyr)

    New_Data %>%
      mutate(Age_category = cut(Age,
                                breaks = c(0, 0.5, seq(4.5, 64.5, 5), 100),
                                labels = c("<1", "1-4", "5-9", "10-14", "15-19",
                                           "20-24", "25-29", "30-34", "35-39",
                                           "40-44", "45-49", "50-54", "55-59",
                                           "60-64", "65+")))
    #>    Age Age_category
    #> 1   68          65+
    #> 2   39        35-39
    #> 3    1          1-4
    #> 4   34        30-34
    #> 5   87          65+
    #> 6   43        40-44
    #> 7   14        10-14
    #> 8   82          65+
    #> 9   59        55-59
    #> 10  51        50-54

Created on 2023-02-17 with reprex v2.0.2
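One thing to watch with the breaks above: cut uses right-closed intervals by default, so an age of exactly 0 falls outside (0, 0.5] and ages above 100 fall outside the last interval, and both come back as NA. A possible variant, sketched here in base R with made-up sample ages (the `ages` vector and the variable names are illustrative, not from the question), uses whole-number breaks with right = FALSE and an open upper end:

```r
# Hypothetical sample ages, chosen to hit the bracket edges
ages <- c(0, 0.5, 1, 4, 5, 64, 65, 101)

age_labels <- c("<1", "1-4", "5-9", "10-14", "15-19",
                "20-24", "25-29", "30-34", "35-39",
                "40-44", "45-49", "50-54", "55-59",
                "60-64", "65+")

# right = FALSE closes each interval on the left:
# [0, 1), [1, 5), [5, 10), ..., [60, 65), [65, Inf)
age_category <- cut(ages,
                    breaks = c(0, 1, seq(5, 65, 5), Inf),
                    labels = age_labels,
                    right  = FALSE)

print(data.frame(Age = ages, Age_category = age_category))
```

The same cut call can be dropped into mutate in place of the one above. For whole-number ages it should give the same brackets as the original breaks; the two only differ at the edge cases mentioned (0, values over 100) and for fractional ages between brackets.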


Data used

    set.seed(1)
    New_Data <- data.frame(Age = sample(100, 10))
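As for why the original case_when might only fill a few brackets: one plausible cause, and this is an assumption because the question does not show the column type, is that Age is stored as character (the comparison against "NULL" hints at that). When a character vector is compared with a number, R converts the number to a string and compares text, so "10" and "25" both sort between "1" and "4". The sketch below uses made-up values to show the effect and the numeric conversion that avoids it:

```r
# Hypothetical ages stored as text, as can happen when the column contains "NULL"
age_chr <- c("0", "3", "7", "10", "25", "68")

# The numbers 1 and 4 are coerced to "1" and "4", then compared as strings
in_1_to_4_chr <- age_chr >= 1 & age_chr <= 4
print(in_1_to_4_chr)
# FALSE TRUE FALSE TRUE TRUE FALSE  ("10" and "25" land in the "1-4" bracket)

# Converting to numeric first gives the intended comparison;
# "NULL" cannot be parsed and becomes NA (as.numeric warns about this)
age_num <- suppressWarnings(as.numeric(c(age_chr, "NULL")))
in_1_to_4_num <- age_num >= 1 & age_num <= 4
print(in_1_to_4_num)
# FALSE TRUE FALSE FALSE FALSE FALSE NA
```

If this is what is happening, running str(New_Data) would show Age as chr, and converting the column with as.numeric before calling cut or case_when should let the remaining brackets appear.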

huangapple
  • Posted on 2023-02-18 01:52:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75487695.html