March 8, 2023, 16:05:13

Applying conditional functions on multiple columns of a dataframe, based on values of a variable

Question

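I have a dataframe with a factor variable identifying my groups (here `y`) and multiple numerical variables (to simplify, I show only two here, `x` and `z`):

```R
library(tibble)

df = tribble(
  ~x,     ~y,     ~z,
  1,     "a",     5,
  2,     "b",     6,
  3,     "a",     7,
  4,     "b",     8,
  5,     "a",     9,
  6,     "b",     10
)
```

I want to add new columns to my dataframe in which I apply different mathematical functions to those numerical variables (`x` and `z`), based on the values of the factor variable (`y`). For the example dataframe above: 1 is added to all observations with `y == "a"`, and 2 is added to those with `y == "b"`.

This is my code:

```R
library(dplyr)

df %>% mutate(x_new = case_when(grepl("a", y) ~ x + 1,
                                grepl("b", y) ~ x + 2))
```

Output:

```
# A tibble: 6 × 4
      x y         z x_new
  <dbl> <chr> <dbl> <dbl>
1     1 a         5     2
2     2 b         6     4
3     3 a         7     4
4     4 b         8     6
5     5 a         9     6
6     6 b        10     8
```

My code works fine for adding one variable, but I want to apply the same functions to all the numerical variables, so in the example I want to apply the functions to the `z` variable as well and store the values in another new column. Since I have many numerical columns, I don't want to mutate them manually one by one with the approach above. Any advice on how to do this? (Tidyverse solutions especially, but any help is much appreciated.)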
# Answer 1
**Score**: 2

Make a lookup to match `a`, `b` with 1, 2. Then add it to `df`, excluding the `y` column. Finally, suffix the new columns with "_new" and column-bind them back to the original dataframe:

```R
# Lookup vector mapping each group in y to the value to add
ll <- setNames(c(1, 2), c("a", "b"))
# Add the matched value to every column except y (column 2)
x <- df[, -2] + ll[df$y]
colnames(x) <- paste0(colnames(x), "_new")
# Bind the new columns to the original dataframe
cbind(df, x)
#   x y  z x_new z_new
# 1 1 a  5     2     6
# 2 2 b  6     4     8
# 3 3 a  7     4     8
# 4 4 b  8     6    10
# 5 5 a  9     6    10
# 6 6 b 10     8    12
```
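Since the question asks for a tidyverse solution, here is a minimal sketch of how the same idea might look with `dplyr::across()`. It is not part of the original answer. It assumes dplyr 1.0.0 or later, that every numeric column should be transformed, and that `y` should be matched exactly rather than with `grepl()`. The names `res_lookup` and `res_case` are chosen here for illustration only.

```r
library(dplyr)
library(tibble)

df <- tribble(
  ~x, ~y,  ~z,
  1,  "a", 5,
  2,  "b", 6,
  3,  "a", 7,
  4,  "b", 8,
  5,  "a", 9,
  6,  "b", 10
)

# Lookup vector: the amount to add for each value of y
ll <- c(a = 1, b = 2)

# Variant 1: reuse the lookup inside across().
# as.character() guards against y being a factor, which would
# otherwise index ll by integer code instead of by name.
res_lookup <- df %>%
  mutate(across(where(is.numeric),
                ~ .x + unname(ll[as.character(y)]),
                .names = "{.col}_new"))

# Variant 2: keep the case_when() logic from the question,
# applied to every numeric column (assumes exact matches on y).
res_case <- df %>%
  mutate(across(where(is.numeric),
                ~ case_when(y == "a" ~ .x + 1,
                            y == "b" ~ .x + 2),
                .names = "{.col}_new"))

res_lookup
```

For the example data, both variants add `x_new` (2, 4, 4, 6, 6, 8) and `z_new` (6, 8, 8, 10, 10, 12). The lookup variant scales more easily when there are many groups, while the `case_when()` variant is more flexible when each group needs a different kind of function rather than a different constant.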
