R dplyr: how to filter a column within a grep() when its name is stored in a vector?
Question
Filtering a column of a data frame `data` using `dplyr` when its name is stored in an "external" vector `col_name` can be achieved using `!!`. However, this no longer appears to work when the same logic is applied inside `filter()` together with `grepl()`.
How should one do that in dplyr in a single line?
library(dplyr)

# Example data: one label column and five 0/1 indicator columns
data <- read.table(text = "Sol_name geo_pos loc_pos dol_pos pol_pos kol_pos
A 1 1 0 0 1
B 0 1 1 0 0
C 1 0 1 1 1
D 0 1 0 0 1", header = TRUE)
kw <- "B|D"

# Column referenced directly by name
data %>%
  filter(grepl(toupper(kw), toupper(Sol_name))) # works
#   Sol_name geo_pos loc_pos dol_pos pol_pos kol_pos
# 1 B 0 1 1 0 0
# 2 D 0 1 0 0 1

# Column name stored as a string
col_name <- "Sol_name"
data %>%
  filter(grepl(toupper(kw), toupper(!!col_name))) # does not
# [1] Sol_name geo_pos loc_pos dol_pos pol_pos kol_pos
# <0 rows> (or 'row.names' of length 0)
Answer 1
Score: 4
You can use `[[` with `.` or `.data`:
col_name <- "Sol_name"
kw <- "B|D"
data %>%
filter(grepl(kw, .data[[col_name]]))
# Sol_name geo_pos loc_pos dol_pos pol_pos kol_pos
# 1 B 0 1 1 0 0
# 2 D 0 1 0 0 1
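
As a sketch of why the original attempt returns no rows: `!!col_name` unquotes to the string `"Sol_name"` itself, so `grepl()` tests the pattern once against that literal text instead of against the column values. The example below rebuilds the data from the question and compares two ways around this. The use of `rlang::sym()` and of `ignore.case = TRUE` (in place of the two `toupper()` calls) are substitutions made here for illustration, not part of the original answer.

```r
library(dplyr)

# Same data as in the question
data <- read.table(text = "Sol_name geo_pos loc_pos dol_pos pol_pos kol_pos
A 1 1 0 0 1
B 0 1 1 0 0
C 1 0 1 1 1
D 0 1 0 0 1", header = TRUE)

kw <- "B|D"
col_name <- "Sol_name"

# What the failing filter effectively evaluates: the pattern is matched
# against the literal string "SOL_NAME", not against the column values.
literal_match <- grepl(toupper(kw), toupper(col_name))

# Option 1 (assumption: rlang is available, as it is a dplyr dependency):
# convert the string to a symbol, then unquote it.
res_sym <- data %>%
  filter(grepl(kw, !!rlang::sym(col_name), ignore.case = TRUE))

# Option 2: the .data pronoun with [[, as in the answer.
res_data <- data %>%
  filter(grepl(kw, .data[[col_name]], ignore.case = TRUE))
```

Both options keep the filter on a single line; `.data[[col_name]]` is probably the simpler choice when the column name is already a string.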
From `?.data`:
> `.data` versus the magrittr pronoun `.`
> In a magrittr pipeline, `.data` is not necessarily interchangeable with the magrittr pronoun `.`. With grouped data frames in particular, `.data` represents the current group slice whereas the pronoun `.` represents the whole data frame. Always prefer using `.data` in data-masked context.
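
The difference described in the quoted documentation can be illustrated with a small grouped example. This is only a sketch: the data frame `df` and its columns `g` and `x` are hypothetical, and it assumes the magrittr pipe `%>%` (the native `|>` pipe does not provide a `.` pronoun).

```r
library(dplyr)

# Hypothetical toy data: group "a" has two rows, group "b" has one
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 10))

res <- df %>%
  group_by(g) %>%
  mutate(
    n_data = length(.data[["x"]]), # length of x within the current group
    n_dot  = length(.[["x"]])      # length of x in the whole data frame
  ) %>%
  ungroup()

res$n_data # 2 2 1
res$n_dot  # 3 3 3
```

In the ungrouped `filter()` call from the question the two pronouns give the same result, which is why either works there.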