2023年6月5日 09:19:02go评论107阅读模式

英文:

How to update variable value to current value (current value * 100/ colsums())

问题

我有一个如下的数据框（df）：

我想要执行以下两个步骤：

如果列名的第一个字母与Grade中的值匹配，并且当前值为NA，则将其替换为0。
将A1:B2范围的值更新为当前值（当前值 * 100 / colsums()）。

如何实现这个目标？

df <- structure(list(Grade = c("A", "A", "B", "B"), Pass = c("Y", "N", 
"Y", "N"), A1 = c(7, 8, NA, NA), A2 = c(4, 5, NA, NA), A3 = c(9, 
NA, NA, NA), B1 = c(NA, NA, 8, NA), B2 = c(NA, NA, 3, 4)), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame"))

英文:

I have a df like below.:

I would like to do following two steps:

if first letter of col name matched the value in Grade and current value is NA, then replace it as 0.
update the value from A1:B2 as current value, (current value * 100/ colsums()).

How can I make this happen?

df &lt;- structure(list(Grade = c(&quot;A&quot;, &quot;A&quot;, &quot;B&quot;, &quot;B&quot;), Pass = c(&quot;Y&quot;, &quot;N&quot;, 
&quot;Y&quot;, &quot;N&quot;), A1 = c(7, 8, NA, NA), A2 = c(4, 5, NA, NA), A3 = c(9, 
NA, NA, NA), B1 = c(NA, NA, 8, NA), B2 = c(NA, NA, 3, 4)), row.names = c(NA, 
-4L), class = c(&quot;tbl_df&quot;, &quot;tbl&quot;, &quot;data.frame&quot;))

答案1

得分: 3

使用dplyr包，我们可以在case_when中使用across（如果您想要使用两个单独的ifelse，请查看我的旧编辑）。

library(dplyr)
df %>%
  mutate(across(A1:B2, ~case_when(str_extract(cur_column(), "^.") == Grade & is.na(.x) ~ "0 (0%)",
                                  is.na(.x) ~ NA,
                                  .default = paste0(.x, " (", round(.x * 100/(sum(.x, na.rm = T)), digits = 2), "%)"))))

A tibble: 4 × 7

Grade Pass A1 A2 A3 B1 B2

1 A Y 7 (46.67%) 4 (44.44%) 9 (100%) NA NA
2 A N 8 (53.33%) 5 (55.56%) 0 (0%) NA NA
3 B Y NA NA NA 8 (100%) 3 (42.86%)
4 B N NA NA NA 0 (0%) 4 (57.14%)


<hr>
### 更新
如果要使用`colnames`，您需要首先`pick`列。
```r
df %>%
  mutate(across(A1:B2, ~ifelse(str_extract(colnames(pick(.x)), "^.") == Grade & is.na(.x), 0, .x)), 
         across(A1:B2, ~ifelse(is.na(.x), NA, paste0(.x, " (", round(.x * 100/(sum(.x, na.rm = T)), digits = 2), "%)"))))

英文:

With the dplyr package, we can use across with case_when (check my old edits if you want to use two separate ifelse).

library(dplyr)
df |&gt; 
  mutate(across(A1:B2, ~case_when(str_extract(cur_column(), &quot;^.&quot;) == Grade &amp; is.na(.x) ~ &quot;0 (0%)&quot;,
                                  is.na(.x) ~ NA,
                                  .default = paste0(.x, &quot; (&quot;, round(.x * 100/(sum(.x, na.rm = T)), digits = 2), &quot;%)&quot;))))
# A tibble: 4 &#215; 7
  Grade Pass  A1         A2         A3       B1       B2        
  &lt;chr&gt; &lt;chr&gt; &lt;chr&gt;      &lt;chr&gt;      &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;     
1 A     Y     7 (46.67%) 4 (44.44%) 9 (100%) NA       NA        
2 A     N     8 (53.33%) 5 (55.56%) 0 (0%)   NA       NA        
3 B     Y     NA         NA         NA       8 (100%) 3 (42.86%)
4 B     N     NA         NA         NA       0 (0%)   4 (57.14%)

<hr>

Update

You need to first pick the columns if you want to use colnames.

df %&gt;% 
  mutate(across(A1:B2, ~ifelse(str_extract(colnames(pick(.x)), &quot;^.&quot;) == Grade &amp; is.na(.x), 0, .x)), 
         across(A1:B2, ~ifelse(is.na(.x), NA, paste0(.x, &quot; (&quot;, round(.x * 100/(sum(.x, na.rm = T)), digits = 2), &quot;%)&quot;))))

答案2

得分: 3

以下是代码的中文翻译：

# 使用两个枢轴的替代方法：
library(dplyr)
library(tidyr) # pivot_*
# 将数据框进行长格式化，排除 Grade 和 Pass 列
df %>%
  pivot_longer(-c(Grade, Pass)) %>%
  
  # 如果 Grade 与 name 的首字母相匹配，则将值设为 0
  mutate(value = if_else(Grade == substring(name, 1, 1), coalesce(value, 0), value)) %>%
  
  # 按 name 列分组
  group_by(name) %>%
  
  # 如果值为 NA，则将其替换为空字符串，否则将其格式化为带百分比的字符串
  mutate(value = if_else(is.na(value), "", sprintf("%i (%0.02f%%)", value, 100 * value / sum(value, na.rm = TRUE)))) %>%
  
  # 将数据框重新转换为宽格式
  pivot_wider()
# # 一个数据框: 4 × 7
#   Grade Pass  A1           A2           A3            B1            B2          
#   <chr> <chr> <chr>        <chr>        <chr>         <chr>         <chr>       
# 1 A     Y     "7 (46.67%)" "4 (44.44%)" "9 (100.00%)" ""            ""          
# 2 A     N     "8 (53.33%)" "5 (55.56%)" "0 (0.00%)"   ""            ""          
# 3 B     Y     ""           ""           ""            "8 (100.00%)" "3 (42.86%)"
# 4 B     N     ""           ""           ""            "0 (0.00%)"   "4 (57.14%)"

这是你要求的代码部分的中文翻译。如果还有其他需要，请随时告诉我。

英文:

An alternative with two pivots:

library(dplyr)
library(tidyr) # pivot_*
df %&gt;%
  pivot_longer(-c(Grade, Pass)) %&gt;%
  mutate(value = if_else(Grade == substring(name, 1, 1), coalesce(value, 0), value)) %&gt;%
  group_by(name) %&gt;%
  mutate(value = if_else(is.na(value), &quot;&quot;, sprintf(&quot;%i (%0.02f%%)&quot;, value, 100 * value / sum(value, na.rm = TRUE)))) %&gt;%
  pivot_wider()
# # A tibble: 4 &#215; 7
#   Grade Pass  A1           A2           A3            B1            B2          
#   &lt;chr&gt; &lt;chr&gt; &lt;chr&gt;        &lt;chr&gt;        &lt;chr&gt;         &lt;chr&gt;         &lt;chr&gt;       
# 1 A     Y     &quot;7 (46.67%)&quot; &quot;4 (44.44%)&quot; &quot;9 (100.00%)&quot; &quot;&quot;            &quot;&quot;          
# 2 A     N     &quot;8 (53.33%)&quot; &quot;5 (55.56%)&quot; &quot;0 (0.00%)&quot;   &quot;&quot;            &quot;&quot;          
# 3 B     Y     &quot;&quot;           &quot;&quot;           &quot;&quot;            &quot;8 (100.00%)&quot; &quot;3 (42.86%)&quot;
# 4 B     N     &quot;&quot;           &quot;&quot;           &quot;&quot;            &quot;0 (0.00%)&quot;   &quot;4 (57.14%)&quot;

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

如何将变量值更新为当前值（当前值 * 100 / colsums()）

问题

答案1

A tibble: 4 × 7

Update

答案2

如何在dplyr中使用`bind_rows`处理大小不兼容的列表？

如何显示具有不同列’b’值的列’a’的重复项

执行一个包含字符字符串的变异函数。

我想知道如何使用mixstock包。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。