Creating multiple NEW columns using across() in R

Question

The difference between my question and existing questions is that I want to create new columns with mutate that do not depend on existing columns.

Some dummy data:

library(dplyr)
dat <- tibble(
    a = 1:5,
    b = LETTERS[1:5]
)

I know I can create new columns one by one, like so:

dat <- dat %>%
    mutate(foo = NA, bar = NA, bar2 = NA)

And I can modify columns more conveniently using across(), e.g.:

new_vars <- c("foo", "bar", "bar2")
dat <- dat %>%
    mutate(across(all_of(new_vars), ~ replace(., is.na(.), 0)))

But how do I create new columns without referencing existing columns in a similar manner? E.g. adding new columns filled with NA:

tibble(
    a = 1:5,
    b = LETTERS[1:5]
) %>%
    # mutate(across(all_of(new_vars), ~ function(.x) NA))  # Error
    mutate(across(all_of(new_vars), NA))                   # Error

Open to any tidyverse alternatives.
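
For context on why both attempts fail: across() only operates on columns that already exist, and all_of() raises an error when any requested name is missing from the data. The sketch below illustrates this, along with any_of(), which silently selects nothing instead. The names failed and unchanged are illustrative and not part of the question.

```r
library(dplyr)

dat <- tibble(a = 1:5, b = LETTERS[1:5])
new_vars <- c("foo", "bar", "bar2")

# all_of() is strict: asking for names that are not in the data is an error,
# so across() never gets the chance to create anything.
failed <- tryCatch(
  {
    mutate(dat, across(all_of(new_vars), ~ NA))
    FALSE
  },
  error = function(e) TRUE
)

# any_of() is lenient: it selects zero columns, so no column is created either.
unchanged <- mutate(dat, across(any_of(new_vars), ~ NA))

failed            # TRUE
names(unchanged)  # "a" "b"
```

Either way, across() is a tool for transforming existing columns, which is why the answers reach for other mechanisms.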

Answer 1

Score: 4

Similar to this answer buried in the popular question here, you can use:

new_vars <- c("foo", "bar", "bar2")

tibble(
  a = 1:5,
  b = LETTERS[1:5]
) %>% 
  mutate(!!!setNames(rep(NA, length(new_vars)), new_vars))
# or (thanks @joran)
# tibble::add_column(!!!setNames(rep(NA, length(new_vars)), new_vars))

Output:

      a b     foo   bar   bar2 
  <int> <chr> <lgl> <lgl> <lgl>
1     1 A     NA    NA    NA   
2     2 B     NA    NA    NA   
3     3 C     NA    NA    NA   
4     4 D     NA    NA    NA   
5     5 E     NA    NA    NA   
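
To unpack the splice a little: setNames() builds a named vector with one NA per new column, and !!! hands its elements to mutate() as if foo = NA, bar = NA, bar2 = NA had been typed out. The sketch below separates the two steps and, as an assumption beyond the answer, shows that a named list can be spliced the same way to give each column its own type or value. The names fill and typed and the values in typed are illustrative.

```r
library(dplyr)

dat <- tibble(a = 1:5, b = LETTERS[1:5])
new_vars <- c("foo", "bar", "bar2")

# Step 1: a named vector, one NA per new column name.
fill <- setNames(rep(NA, length(new_vars)), new_vars)

# Step 2: splice it in; each length-1 value is recycled to the number of rows.
out <- mutate(dat, !!!fill)

# A named list works too and allows a different type per column
# (illustrative values, not from the answer).
typed <- list(foo = NA_real_, bar = NA_character_, bar2 = 0L)
out_typed <- mutate(dat, !!!typed)
```

Using typed NA values such as NA_real_ or NA_character_ may be worth considering if the columns will later be filled with non-logical data, since a plain NA produces logical columns.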

Answer 2

Score: 4

I use tidyverse stuff as much as the next fellow, but the lengths we're going to just to avoid doing things the simple way are getting a little silly, imho.

Here. Pipe friendly.

library(dplyr)
dat <- tibble(
  a = 1:5,
  b = LETTERS[1:5]
)

new_vars <- c("foo", "bar", "bar2")

# ?
# dat[new_vars] <- NA

add_vars <- function(df, vars, val) {
  df[vars] <- val
  df
}

dat |>
  add_vars(df = _, vars = new_vars, val = NA)

You could even use an anonymous function (as written, this works only with the magrittr pipe; the native pipe needs the function to be called explicitly, with a trailing ()):

dat %>%
  (\(x) {x[new_vars] <- NA; x})

This also works (with the magrittr pipe) with the function(x) syntax.
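
As a minimal sketch of the same idea with no packages at all: the example below uses a base data.frame rather than a tibble (an assumption made to keep it dependency-free) and a hypothetical helper named with_cols(), which is add_vars() with a default value. It also shows the native-pipe form of the anonymous function, which has to be called with a trailing (). Both |> and \(x) require R 4.1 or later.

```r
dat <- data.frame(a = 1:5, b = LETTERS[1:5])
new_vars <- c("foo", "bar", "bar2")

# Hypothetical helper: assign one value to several (new) columns and
# return the modified data frame so it can sit in a pipeline.
with_cols <- function(df, vars, val = NA) {
  df[vars] <- val
  df
}

out <- dat |> with_cols(new_vars)

# Native pipe with an anonymous function: note the trailing () that calls it.
out2 <- dat |> (\(x) { x[new_vars] <- NA; x })()
```

Because the helper takes the data frame as its first argument, no placeholder is needed with either pipe.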

Answer 3

Score: 3

Using dplyr::bind_cols() and pipes:

library(dplyr)

new_vars <- c("foo", "bar", "bar2")

tibble(a = 1:5,
       b = LETTERS[1:5]) %>% 
  bind_cols(., setNames(lapply(new_vars, function(x) NA), new_vars))

Result:

# A tibble: 5 × 5
      a b     foo   bar   bar2 
  <int> <chr> <lgl> <lgl> <lgl>
1     1 A     NA    NA    NA   
2     2 B     NA    NA    NA   
3     3 C     NA    NA    NA   
4     4 D     NA    NA    NA   
5     5 E     NA    NA    NA

Although I think the second answer to this question, on which this is based, is just as good.

If you really want mutate, Ritchie's answer in the comments works.

Answer 4

Score: 1

Maybe this is the style you're looking for:

library(dplyr)

dat <- tibble(
    a = 1:5,
    b = LETTERS[1:5]
)

new_vars <- c("foo", "bar", "bar2")

dat %>% 
    purrr::reduce(new_vars, ~mutate(.x, {{.y}} := 0), .init = .)

Instead of using across(), we use purrr::reduce(), which loops over new_vars and applies mutate() to the output of the previous iteration. We want to start from dat, so we pipe it in and pass it as .init = .

You could even use reduce2 if you wanted to have different values for each of the new_vars.
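
A minimal sketch of that reduce2() variant, under a few assumptions: the values in new_vals are illustrative, the argument names df, col_name and col_value are made up for readability, and the column name is unquoted with !! rather than {{ }}. With .init supplied, reduce2() expects the two inputs to have the same length and calls the function with the accumulated data frame, one name, and one value at a time.

```r
library(dplyr)
library(purrr)

dat <- tibble(a = 1:5, b = LETTERS[1:5])
new_vars <- c("foo", "bar", "bar2")

# One value per new column (illustrative; any length-1 values would do).
new_vals <- list(0, NA_character_, TRUE)

out <- reduce2(
  new_vars, new_vals,
  # df: result so far; col_name: next name; col_value: its fill value
  function(df, col_name, col_value) mutate(df, !!col_name := col_value),
  .init = dat
)
```

Each step adds one column, so the result has foo as numeric, bar as character, and bar2 as logical.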

huangapple
  • Posted on August 9, 2023, 08:45:13
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76863910-2.html