How to prevent dplyr::select from combining names rather than assigning a new name?
Question
I'm trying to select a column based on another named vector and assign a new name to that column at the same time. However, dplyr appears to combine the names, and I can't find an option in the documentation to stop this.
# dplyr ‘1.1.2’
# R Version 4.3.0
library(dplyr) # attaches %>%
data <- data.frame(day = 1,
week = 2,
n_in_year = 365)
new_values <- c(period = "day",
n = 365)
# This combines the names rather than assigning a new name
data %>%
dplyr::select(time = new_values[1])
# e.g.
# time...period
# 1 1
# I want it to behave like this
data %>%
dplyr::select(new_values[1]) %>%
dplyr::rename(time = period)
Answer 1
Score: 2
Issue
Your issue is that new_values still has its names():
data %>% dplyr::select(time = new_values[1])
#> time...period
#> 1 1
data %>% dplyr::select(time = unname(new_values)[1])
#> time
#> 1 1
This behavior is intentional for the "tidy selection" used by dplyr::select(). Passing a named character vector of column names (like new_values) lets a programmatic user "combine" and "propagate" column names across nested levels. The documentation excerpt below illustrates this with symbols rather than strings:
> mtcars %>% select_loc(foo = c(mpg, cyl))
> #> foo1 foo2
> #> 1 2
> mtcars %>% select_loc(foo = c(bar = mpg, baz = cyl))
> #> foo...bar foo...baz
> #> 1 2
> mtcars %>% select_loc(foo = c(bar = c(mpg, cyl)))
> #> foo...bar1 foo...bar2
> #> 1 2
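The same combination can be reproduced with dplyr::select() itself. This is a minimal sketch, assuming dplyr is installed and using the built-in mtcars data; the names foo, bar, and baz are arbitrary placeholders, as in the excerpt above.

```r
library(dplyr)

# An outer name applied to a vector that has inner names:
# the two levels are joined as outer...inner
combined <- mtcars %>%
  select(foo = c(bar = mpg, baz = cyl)) %>%
  names()

# An outer name applied to a single unnamed column: a plain rename
renamed <- mtcars %>%
  select(foo = mpg) %>%
  names()

print(combined)
print(renamed)
```

The second call shows why stripping the inner name is enough to get a clean rename.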
Solution
While unname() does the job, you're better off just using [[ to extract the value without the name (period)...
# Extract by position with [[
data %>% dplyr::select(time = new_values[[1]])
#> time
#> 1 1
# Extract by name with [[
data %>% dplyr::select(time = new_values[["period"]])
#> time
#> 1 1
...or better yet, making new_values a list, so its values (like 365) are not all coerced to strings (like "365") in a character vector:
# Original 'new_values' as a vector...
new_values <- c(period = "day", n = 365)
new_values
#> period n
#> "day" "365"
# ...and new 'new_values' as a list:
new_values <- list(period = "day", n = 365)
new_values
#> $period
#> [1] "day"
#>
#> $n
#> [1] 365
# Easily select() what you want with $:
data %>% dplyr::select(time = new_values$period)
#> time
#> 1 1
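To see the difference between [ and [[, and the coercion that c() performs, it can help to inspect the objects directly. This is a small base-R sketch; the variable names other than new_values are illustrative only.

```r
new_values <- c(period = "day", n = 365)

# `[` returns a length-one vector that keeps the element's name;
# `[[` returns the bare value with no name attached
name_single <- names(new_values[1])
name_double <- names(new_values[[1]])

# c() coerced the number to a string; a list keeps each element's own type
class_in_vector <- class(new_values[["n"]])
class_in_list <- class(list(period = "day", n = 365)[["n"]])

print(name_single)
print(name_double)
print(class_in_vector)
print(class_in_list)
```

Because select() only sees the bare string "day" when [[ or $ is used, there is no inner name left to combine with time.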
Answer 2
Score: 1
Please check the updated code below:
# dplyr ‘1.1.2’
# R Version 4.3.0
library(dplyr) # attaches %>%
data <- data.frame(day = 1,
week = 2,
n_in_year = 365)
new_values <- c(period = "day",
n = 365)
# Extracting with [[ drops the element's name, so the column is simply renamed
data %>%
dplyr::select(time = new_values[[1]])
#> time
#> 1 1
# This gives the same result as the two-step version
data %>%
dplyr::select(new_values[[1]]) %>%
dplyr::rename(time = day)
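As an alternative not taken from the answers above, the select and rename steps can be expressed as one named lookup passed through all_of(), where the names are the new column names and the values are the existing ones. This is a minimal sketch, assuming dplyr is installed; lookup and result are hypothetical names chosen for illustration.

```r
library(dplyr)

data <- data.frame(day = 1, week = 2, n_in_year = 365)

# Hypothetical lookup: name = new column name, value = existing column name
lookup <- c(time = "day")

# all_of() selects the listed columns and applies the vector's names to them
result <- data %>%
  select(all_of(lookup))

print(result)
```

This keeps the mapping in one place, which may be convenient when several columns need to be selected and renamed together.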