June 15, 2023 20:32:48

Pass a variable in mutate

Question

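The goal is to fill the NA values in a data frame by looping over its columns and calling mutate() on each one, but the call fails with a parse error. The problem is this line:

df <- df %>% mutate(!!name = ifelse(is.na(!!name, Colonne4 * means_[name], !!name)))

It has three issues: a name injected with !! on the left-hand side needs := rather than =, the closing parenthesis of is.na() is misplaced, and !!name injects the string itself rather than the column it names. It can be changed to:

df <- df %>% mutate(!!name := ifelse(is.na(.data[[name]]), Colonne4 * means_[[name]], .data[[name]]))

This overwrites the NA values in the column whose name is stored in name.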

I have a data frame containing NA values in each column. I would like to override these values.
I set up a loop that goes through each column and applies the mutate function.
The column name is stored in a name variable. How can I use it in the mutate function?

df <- data.frame("Colonne1" = c(1, 2, NA, 4, 5, NA), "Colonne2" = c(6, NA, 8, 9, NA, 11), "Colonne3" = c(NA, 13, 14, NA, 16, 17), "Colonne4" = runif(6))
means_ <- colMeans(df, na.rm = TRUE)[-c(ncol(df))]
for(name in names(means_)){
  df <- df %>% mutate(!!name = ifelse(is.na(!!name, Colonne4 * means_[name], !!name)))
}

Error: unexpected '=' in:
"  df <- df %>%
    mutate(!!name ="

Answer 1

Score: 3

It might be easier to do this with across() inside mutate(); there is no need to calculate the means ahead of time:

library(dplyr)
  
set.seed(1234)
df <- data.frame("Colonne1" = c(1, 2, NA, 4, 5, NA), "Colonne2" = c(6, NA, 8, 9, NA, 11), "Colonne3" = c(NA, 13, 14, NA, 16, 17), "Colonne4" = runif(6))
df <- df %>%
  mutate(across(Colonne1:Colonne3, 
                ~ifelse(is.na(.x), 
                        Colonne4*mean(.x, na.rm=TRUE), 
                        .x)))
df
#>   Colonne1  Colonne2  Colonne3  Colonne4
#> 1 1.000000  6.000000  1.705551 0.1137034
#> 2 2.000000  5.289545 13.000000 0.6222994
#> 3 1.827824  8.000000 14.000000 0.6092747
#> 4 4.000000  9.000000  9.350692 0.6233794
#> 5 5.000000  7.317781 16.000000 0.8609154
#> 6 1.920932 11.000000 17.000000 0.6403106
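
When the columns to fill are held in a character vector rather than written out as a range, the same across() call can select them with all_of(). The following is a minimal sketch on the same example data; the names cols and filled are hypothetical and were introduced only for this illustration.

```r
library(dplyr)

set.seed(1234)
df <- data.frame(Colonne1 = c(1, 2, NA, 4, 5, NA),
                 Colonne2 = c(6, NA, 8, 9, NA, 11),
                 Colonne3 = c(NA, 13, 14, NA, 16, 17),
                 Colonne4 = runif(6))

# Hypothetical vector holding the names of the columns to fill
cols <- c("Colonne1", "Colonne2", "Colonne3")

# all_of() selects exactly the columns named in `cols`;
# each NA is replaced by Colonne4 times the mean of that column's non-missing values
filled <- df %>%
  mutate(across(all_of(cols),
                ~ ifelse(is.na(.x), Colonne4 * mean(.x, na.rm = TRUE), .x)))
filled
```

Because all_of() raises an error when a listed column is missing, a typo in cols fails loudly instead of silently skipping a column.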

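If you want to know how it would work in the loop, you could do it as shown below. The first thing to note is that when a string supplies the name of the column being assigned in mutate(), = must be changed to :=. The names can also be evaluated with !!sym(name), which makes mutate() treat the value as a column rather than as a string.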

set.seed(1234)
df <- data.frame("Colonne1" = c(1, 2, NA, 4, 5, NA), "Colonne2" = c(6, NA, 8, 9, NA, 11), "Colonne3" = c(NA, 13, 14, NA, 16, 17), "Colonne4" = runif(6))
means_ <- colMeans(df, na.rm = TRUE)[-c(ncol(df))]
for(name in names(means_)){
  df <- df %>% mutate({{name}} := ifelse(is.na(!!sym(name)), Colonne4 * means_[!!name], !!sym(name)))
}
df
#>   Colonne1  Colonne2  Colonne3  Colonne4
#> 1 1.000000  6.000000  1.705551 0.1137034
#> 2 2.000000  5.289545 13.000000 0.6222994
#> 3 1.827824  8.000000 14.000000 0.6092747
#> 4 4.000000  9.000000  9.350692 0.6233794
#> 5 5.000000  7.317781 16.000000 0.8609154
#> 6 1.920932 11.000000 17.000000 0.6403106

Created on 2023-06-15 with reprex v2.0.2
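
The same loop can also be written with the .data pronoun and a glue-style name on the left-hand side of :=, which avoids converting the string to a symbol by hand. This is offered as an alternative sketch on the same example data, not as the answer author's method.

```r
library(dplyr)

set.seed(1234)
df <- data.frame(Colonne1 = c(1, 2, NA, 4, 5, NA),
                 Colonne2 = c(6, NA, 8, 9, NA, 11),
                 Colonne3 = c(NA, 13, 14, NA, 16, 17),
                 Colonne4 = runif(6))

# Column means computed once, excluding the last column (Colonne4)
means_ <- colMeans(df, na.rm = TRUE)[-ncol(df)]

for (name in names(means_)) {
  df <- df %>%
    # "{name}" := assigns to the column whose name is stored in `name`;
    # .data[[name]] reads that column from the data frame
    mutate("{name}" := ifelse(is.na(.data[[name]]),
                              Colonne4 * means_[[name]],
                              .data[[name]]))
}
df
```

Using means_[[name]] rather than means_[name] extracts a bare number, so the element name is not carried into the computation.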

