英文:
How to update variable value to current value (current value * 100/ colsums())
问题
我有一个如下的数据框(df):
我想要执行以下两个步骤:
- 如果列名的第一个字母与
Grade
中的值匹配,并且当前值为NA,则将其替换为0。 - 将A1:B2范围的值更新为当前值(
当前值 * 100 / colsums()
)。
如何实现这个目标?
df <- structure(list(Grade = c("A", "A", "B", "B"), Pass = c("Y", "N",
"Y", "N"), A1 = c(7, 8, NA, NA), A2 = c(4, 5, NA, NA), A3 = c(9,
NA, NA, NA), B1 = c(NA, NA, 8, NA), B2 = c(NA, NA, 3, 4)), row.names = c(NA,
-4L), class = c("tbl_df", "tbl", "data.frame"))
英文:
I have a df like below.:
I would like to do following two steps:
- if first letter of col name matched the value in
Grade
and current value is NA, then replace it as 0. - update the value from A1:B2 as current value, (
current value * 100/ colsums()
).
How can I make this happen?
df <- structure(list(Grade = c("A", "A", "B", "B"), Pass = c("Y", "N",
"Y", "N"), A1 = c(7, 8, NA, NA), A2 = c(4, 5, NA, NA), A3 = c(9,
NA, NA, NA), B1 = c(NA, NA, 8, NA), B2 = c(NA, NA, 3, 4)), row.names = c(NA,
-4L), class = c("tbl_df", "tbl", "data.frame"))
答案1
得分: 3
使用dplyr
包,我们可以在case_when
中使用across
(如果您想要使用两个单独的ifelse
,请查看我的旧编辑)。
library(dplyr)
df %>%
mutate(across(A1:B2, ~case_when(str_extract(cur_column(), "^.") == Grade & is.na(.x) ~ "0 (0%)",
is.na(.x) ~ NA,
.default = paste0(.x, " (", round(.x * 100/(sum(.x, na.rm = T)), digits = 2), "%)"))))
A tibble: 4 × 7
Grade Pass A1 A2 A3 B1 B2
1 A Y 7 (46.67%) 4 (44.44%) 9 (100%) NA NA
2 A N 8 (53.33%) 5 (55.56%) 0 (0%) NA NA
3 B Y NA NA NA 8 (100%) 3 (42.86%)
4 B N NA NA NA 0 (0%) 4 (57.14%)
<hr>
### 更新
如果要使用`colnames`,您需要首先`pick`列。
```r
df %>%
mutate(across(A1:B2, ~ifelse(str_extract(colnames(pick(.x)), "^.") == Grade & is.na(.x), 0, .x)),
across(A1:B2, ~ifelse(is.na(.x), NA, paste0(.x, " (", round(.x * 100/(sum(.x, na.rm = T)), digits = 2), "%)"))))
英文:
With the dplyr
package, we can use across
with case_when
(check my old edits if you want to use two separate ifelse
).
library(dplyr)
df |>
mutate(across(A1:B2, ~case_when(str_extract(cur_column(), "^.") == Grade & is.na(.x) ~ "0 (0%)",
is.na(.x) ~ NA,
.default = paste0(.x, " (", round(.x * 100/(sum(.x, na.rm = T)), digits = 2), "%)"))))
# A tibble: 4 × 7
Grade Pass A1 A2 A3 B1 B2
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 A Y 7 (46.67%) 4 (44.44%) 9 (100%) NA NA
2 A N 8 (53.33%) 5 (55.56%) 0 (0%) NA NA
3 B Y NA NA NA 8 (100%) 3 (42.86%)
4 B N NA NA NA 0 (0%) 4 (57.14%)
<hr>
Update
You need to first pick
the columns if you want to use colnames
.
df %>%
mutate(across(A1:B2, ~ifelse(str_extract(colnames(pick(.x)), "^.") == Grade & is.na(.x), 0, .x)),
across(A1:B2, ~ifelse(is.na(.x), NA, paste0(.x, " (", round(.x * 100/(sum(.x, na.rm = T)), digits = 2), "%)"))))
答案2
得分: 3
以下是代码的中文翻译:
# 使用两个枢轴的替代方法:
library(dplyr)
library(tidyr) # pivot_*
# 将数据框进行长格式化,排除 Grade 和 Pass 列
df %>%
pivot_longer(-c(Grade, Pass)) %>%
# 如果 Grade 与 name 的首字母相匹配,则将值设为 0
mutate(value = if_else(Grade == substring(name, 1, 1), coalesce(value, 0), value)) %>%
# 按 name 列分组
group_by(name) %>%
# 如果值为 NA,则将其替换为空字符串,否则将其格式化为带百分比的字符串
mutate(value = if_else(is.na(value), "", sprintf("%i (%0.02f%%)", value, 100 * value / sum(value, na.rm = TRUE)))) %>%
# 将数据框重新转换为宽格式
pivot_wider()
# # 一个数据框: 4 × 7
# Grade Pass A1 A2 A3 B1 B2
# <chr> <chr> <chr> <chr> <chr> <chr> <chr>
# 1 A Y "7 (46.67%)" "4 (44.44%)" "9 (100.00%)" "" ""
# 2 A N "8 (53.33%)" "5 (55.56%)" "0 (0.00%)" "" ""
# 3 B Y "" "" "" "8 (100.00%)" "3 (42.86%)"
# 4 B N "" "" "" "0 (0.00%)" "4 (57.14%)"
这是你要求的代码部分的中文翻译。如果还有其他需要,请随时告诉我。
英文:
An alternative with two pivots:
library(dplyr)
library(tidyr) # pivot_*
df %>%
pivot_longer(-c(Grade, Pass)) %>%
mutate(value = if_else(Grade == substring(name, 1, 1), coalesce(value, 0), value)) %>%
group_by(name) %>%
mutate(value = if_else(is.na(value), "", sprintf("%i (%0.02f%%)", value, 100 * value / sum(value, na.rm = TRUE)))) %>%
pivot_wider()
# # A tibble: 4 × 7
# Grade Pass A1 A2 A3 B1 B2
# <chr> <chr> <chr> <chr> <chr> <chr> <chr>
# 1 A Y "7 (46.67%)" "4 (44.44%)" "9 (100.00%)" "" ""
# 2 A N "8 (53.33%)" "5 (55.56%)" "0 (0.00%)" "" ""
# 3 B Y "" "" "" "8 (100.00%)" "3 (42.86%)"
# 4 B N "" "" "" "0 (0.00%)" "4 (57.14%)"
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论