如何将变量值更新为当前值(当前值 * 100 / colsums())

huangapple go评论107阅读模式
英文:

How to update variable value to current value (current value * 100/ colsums())

问题

我有一个如下的数据框(df):

如何将变量值更新为当前值(当前值 * 100 / colsums())

我想要执行以下两个步骤:

  1. 如果列名的第一个字母与Grade中的值匹配,并且当前值为NA,则将其替换为0。
  2. 将A1:B2范围的值更新为当前值(当前值 * 100 / colsums())。

如何实现这个目标?

  1. df <- structure(list(Grade = c("A", "A", "B", "B"), Pass = c("Y", "N",
  2. "Y", "N"), A1 = c(7, 8, NA, NA), A2 = c(4, 5, NA, NA), A3 = c(9,
  3. NA, NA, NA), B1 = c(NA, NA, 8, NA), B2 = c(NA, NA, 3, 4)), row.names = c(NA,
  4. -4L), class = c("tbl_df", "tbl", "data.frame"))
英文:

I have a df like below.:

如何将变量值更新为当前值(当前值 * 100 / colsums())

I would like to do following two steps:

  1. if first letter of col name matched the value in Grade and current value is NA, then replace it as 0.
  2. update the value from A1:B2 as current value, (current value * 100/ colsums()).

How can I make this happen?

  1. df &lt;- structure(list(Grade = c(&quot;A&quot;, &quot;A&quot;, &quot;B&quot;, &quot;B&quot;), Pass = c(&quot;Y&quot;, &quot;N&quot;,
  2. &quot;Y&quot;, &quot;N&quot;), A1 = c(7, 8, NA, NA), A2 = c(4, 5, NA, NA), A3 = c(9,
  3. NA, NA, NA), B1 = c(NA, NA, 8, NA), B2 = c(NA, NA, 3, 4)), row.names = c(NA,
  4. -4L), class = c(&quot;tbl_df&quot;, &quot;tbl&quot;, &quot;data.frame&quot;))

答案1

得分: 3

使用dplyr包,我们可以在case_when中使用across(如果您想要使用两个单独的ifelse,请查看我的旧编辑)。

  1. library(dplyr)
  2. df %>%
  3. mutate(across(A1:B2, ~case_when(str_extract(cur_column(), "^.") == Grade & is.na(.x) ~ "0 (0%)",
  4. is.na(.x) ~ NA,
  5. .default = paste0(.x, " (", round(.x * 100/(sum(.x, na.rm = T)), digits = 2), "%)"))))

A tibble: 4 × 7

Grade Pass A1 A2 A3 B1 B2

1 A Y 7 (46.67%) 4 (44.44%) 9 (100%) NA NA
2 A N 8 (53.33%) 5 (55.56%) 0 (0%) NA NA
3 B Y NA NA NA 8 (100%) 3 (42.86%)
4 B N NA NA NA 0 (0%) 4 (57.14%)

  1. <hr>
  2. ### 更新
  3. 如果要使用`colnames`,您需要首先`pick`列。
  4. ```r
  5. df %>%
  6. mutate(across(A1:B2, ~ifelse(str_extract(colnames(pick(.x)), "^.") == Grade & is.na(.x), 0, .x)),
  7. across(A1:B2, ~ifelse(is.na(.x), NA, paste0(.x, " (", round(.x * 100/(sum(.x, na.rm = T)), digits = 2), "%)"))))
英文:

With the dplyr package, we can use across with case_when (check my old edits if you want to use two separate ifelse).

  1. library(dplyr)
  2. df |&gt;
  3. mutate(across(A1:B2, ~case_when(str_extract(cur_column(), &quot;^.&quot;) == Grade &amp; is.na(.x) ~ &quot;0 (0%)&quot;,
  4. is.na(.x) ~ NA,
  5. .default = paste0(.x, &quot; (&quot;, round(.x * 100/(sum(.x, na.rm = T)), digits = 2), &quot;%)&quot;))))
  6. # A tibble: 4 &#215; 7
  7. Grade Pass A1 A2 A3 B1 B2
  8. &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt;
  9. 1 A Y 7 (46.67%) 4 (44.44%) 9 (100%) NA NA
  10. 2 A N 8 (53.33%) 5 (55.56%) 0 (0%) NA NA
  11. 3 B Y NA NA NA 8 (100%) 3 (42.86%)
  12. 4 B N NA NA NA 0 (0%) 4 (57.14%)

<hr>

Update

You need to first pick the columns if you want to use colnames.

  1. df %&gt;%
  2. mutate(across(A1:B2, ~ifelse(str_extract(colnames(pick(.x)), &quot;^.&quot;) == Grade &amp; is.na(.x), 0, .x)),
  3. across(A1:B2, ~ifelse(is.na(.x), NA, paste0(.x, &quot; (&quot;, round(.x * 100/(sum(.x, na.rm = T)), digits = 2), &quot;%)&quot;))))

答案2

得分: 3

以下是代码的中文翻译:

  1. # 使用两个枢轴的替代方法:
  2. library(dplyr)
  3. library(tidyr) # pivot_*
  4. # 将数据框进行长格式化,排除 Grade 和 Pass 列
  5. df %>%
  6. pivot_longer(-c(Grade, Pass)) %>%
  7. # 如果 Grade 与 name 的首字母相匹配,则将值设为 0
  8. mutate(value = if_else(Grade == substring(name, 1, 1), coalesce(value, 0), value)) %>%
  9. # 按 name 列分组
  10. group_by(name) %>%
  11. # 如果值为 NA,则将其替换为空字符串,否则将其格式化为带百分比的字符串
  12. mutate(value = if_else(is.na(value), "", sprintf("%i (%0.02f%%)", value, 100 * value / sum(value, na.rm = TRUE)))) %>%
  13. # 将数据框重新转换为宽格式
  14. pivot_wider()
  15. # # 一个数据框: 4 × 7
  16. # Grade Pass A1 A2 A3 B1 B2
  17. # <chr> <chr> <chr> <chr> <chr> <chr> <chr>
  18. # 1 A Y "7 (46.67%)" "4 (44.44%)" "9 (100.00%)" "" ""
  19. # 2 A N "8 (53.33%)" "5 (55.56%)" "0 (0.00%)" "" ""
  20. # 3 B Y "" "" "" "8 (100.00%)" "3 (42.86%)"
  21. # 4 B N "" "" "" "0 (0.00%)" "4 (57.14%)"

这是你要求的代码部分的中文翻译。如果还有其他需要,请随时告诉我。

英文:

An alternative with two pivots:

  1. library(dplyr)
  2. library(tidyr) # pivot_*
  3. df %&gt;%
  4. pivot_longer(-c(Grade, Pass)) %&gt;%
  5. mutate(value = if_else(Grade == substring(name, 1, 1), coalesce(value, 0), value)) %&gt;%
  6. group_by(name) %&gt;%
  7. mutate(value = if_else(is.na(value), &quot;&quot;, sprintf(&quot;%i (%0.02f%%)&quot;, value, 100 * value / sum(value, na.rm = TRUE)))) %&gt;%
  8. pivot_wider()
  9. # # A tibble: 4 &#215; 7
  10. # Grade Pass A1 A2 A3 B1 B2
  11. # &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt;
  12. # 1 A Y &quot;7 (46.67%)&quot; &quot;4 (44.44%)&quot; &quot;9 (100.00%)&quot; &quot;&quot; &quot;&quot;
  13. # 2 A N &quot;8 (53.33%)&quot; &quot;5 (55.56%)&quot; &quot;0 (0.00%)&quot; &quot;&quot; &quot;&quot;
  14. # 3 B Y &quot;&quot; &quot;&quot; &quot;&quot; &quot;8 (100.00%)&quot; &quot;3 (42.86%)&quot;
  15. # 4 B N &quot;&quot; &quot;&quot; &quot;&quot; &quot;0 (0.00%)&quot; &quot;4 (57.14%)&quot;

huangapple
  • 本文由 发表于 2023年6月5日 09:19:02
  • 转载请务必保留本文链接:https://go.coder-hub.com/76403037.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定