June 27, 2023, 20:14:37

How to prevent dplyr::select from combining names rather than assigning a new name?

Question

I'm trying to select a column based on another named vector and assign a new name to that column at the same time. However, dplyr appears to combine the two names, and I can't find an option in the documentation to stop this.

# dplyr ‘1.1.2’
# R Version 4.3.0
library(dplyr)  # attaches %>%
data <- data.frame(day = 1,
                   week = 2,
                   n_in_year = 365)
new_values <- c(period = "day",
                n = 365)
# This combines the names rather than assigning a new name
data %>%
  dplyr::select(time = new_values[1])
# e.g.
#   time...period
# 1             1
# I want it to behave like this
data %>%
  dplyr::select(new_values[1]) %>%
  dplyr::rename(time = period)

Answer 1

Score: 2

Issue

Your issue is that new_values[1] still has its names(), because [ keeps the name (period) of the element it extracts:

data %>% dplyr::select(time = new_values[1])
#>   time...period
#> 1             1
data %>% dplyr::select(time = unname(new_values)[1])
#>   time
#> 1    1

This behavior is intentional: it is part of the "tidy selection" used by dplyr::select(). When a named selection (here, time = ...) wraps a vector that carries its own names (like new_values[1]), the outer and inner names are combined, which lets a programmatic user "combine" and "propagate" column names through nested selections. This is illustrated by the documentation below, with symbols rather than strings:

mtcars %>% select_loc(foo = c(mpg, cyl))
#> foo1 foo2
#>    1    2

mtcars %>% select_loc(foo = c(bar = mpg, baz = cyl))
#> foo...bar foo...baz
#>         1         2

mtcars %>% select_loc(foo = c(bar = c(mpg, cyl)))
#> foo...bar1 foo...bar2
#>          1          2
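
The same naming rules can be reproduced with dplyr::select() on the data frame from the question. This is only a small sketch: the variable names numbered, joined, and renamed are made up for illustration, and the outputs shown are what the rules quoted above imply.

```r
library(dplyr)

data <- data.frame(day = 1, week = 2, n_in_year = 365)

# Outer name + unnamed inner vector: the outer name is numbered.
numbered <- names(select(data, time = c(day, week)))
numbered
#> [1] "time1" "time2"

# Outer name + named inner vector: the two names are joined with "...".
joined <- names(select(data, time = c(period = "day")))
joined
#> [1] "time...period"

# Outer name + a single unnamed string: a plain rename.
renamed <- names(select(data, time = "day"))
renamed
#> [1] "time"
```

The second case is exactly what happens with time = new_values[1], since new_values[1] is the named vector c(period = "day").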

Solution

While unname() does the job, you're better off just using [[ to extract the value without the name (period)...

#                                       |---|
data %>% dplyr::select(time = new_values[[1]])
#>   time
#> 1    1
#                                       |----------|
data %>% dplyr::select(time = new_values[["period"]])
#>   time
#> 1    1
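
Why [[ works can be checked in base R alone, without dplyr. A minimal sketch (with_name and without_name are illustrative variable names, not part of the original answer):

```r
new_values <- c(period = "day", n = 365)

# `[` returns a length-1 sub-vector and keeps the element's name.
with_name <- new_values[1]
names(with_name)
#> [1] "period"

# `[[` returns the bare element, so the name is dropped.
without_name <- new_values[[1]]
names(without_name)
#> NULL
without_name
#> [1] "day"
```

Since the value passed to select() no longer carries a name, there is nothing left to combine with time.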

...or better yet, making new_values a list, so its values (like 365) are not all coerced to strings (like "365") in a character vector:

# Original 'new_values' as a vector...
new_values <- c(period = "day", n = 365)
new_values
#> period      n 
#>  "day"  "365" 
# ...and new 'new_values' as a list:
new_values <- list(period = "day", n = 365)
new_values
#> $period
#> [1] "day"
#> 
#> $n
#> [1] 365
# Easily select() what you want:        |-----|
data %>% dplyr::select(time = new_values$period)
#>   time
#> 1    1
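
If the new name itself has to be supplied programmatically, one possible alternative (not part of the original answer) is to build a new = "old" lookup vector and pass it through dplyr::all_of(), the pattern shown in the dplyr::rename() documentation. A rough sketch, where lookup and result are hypothetical names:

```r
library(dplyr)

data <- data.frame(day = 1, week = 2, n_in_year = 365)
new_values <- list(period = "day", n = 365)

# Named character vector of the form c(new_name = "old_name").
lookup <- c(time = new_values$period)

# all_of() selects the columns named in the values and renames them
# to the names of the vector.
result <- data %>% select(all_of(lookup))
names(result)
#> [1] "time"
```

Here the only name attached to the selection is time, so no combining takes place.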

Answer 2

Score: 1

Please check the updated code below:

# dplyr ‘1.1.2’
# R Version 4.3.0
library(dplyr)  # attaches %>%
data <- data.frame(day = 1,
                   week = 2,
                   n_in_year = 365)
new_values <- c(period = "day",
                n = 365)
# [[ drops the inner name, so the column is simply renamed to 'time'
data %>%
  dplyr::select(time = new_values[[1]])
#>   time
#> 1    1
# The two-step equivalent: select 'day', then rename it
data %>%
  dplyr::select(new_values[[1]]) %>%
  dplyr::rename(time = day)
