Filter tibble in R when column names (to be filtered) and values are in vectors?

Question

This might be an esoteric question or use-case, but is there a quick way to filter a tibble when the column names and values are inside vectors?

Say I want to filter mpg and hp in mtcars. I could do something like:

library(dplyr)
filter(mtcars, mpg >= 15 & hp >= 100)

But instead, say I have several filtering cases -- with the columns to be filtered in one vector and the values in another. (In practice, I might have four or five cases in a larger df.)

car_stat <- c('mpg', 'hp')
car_value &lt;- c(15, 100)

Obviously this doesn't work.

filter(mtcars, car_stat >= car_value)

But is there some succinct dplyr/tidyverse way to filter with vectors, or am I resigned to using some loop to break it up into separate vectors, each of length one?
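
To see why the call above fails, it may help to evaluate the comparison on its own. In this sketch (the names cmp and res are introduced here for illustration only), car_stat is treated as a plain character vector rather than as column names, so R coerces the numbers to strings and compares text. The resulting length-2 logical vector cannot be matched to the 32 rows of mtcars, and filter() raises an error.

```r
library(dplyr)

car_stat <- c('mpg', 'hp')
car_value <- c(15, 100)

# car_stat holds strings, not columns: the numbers are coerced to character
# and the comparison is done on text, one result per vector element
cmp <- car_stat >= car_value
length(cmp)  # 2, not nrow(mtcars)

# filter() expects a logical of length 1 or nrow(mtcars); capture the error
res <- tryCatch(
  filter(mtcars, car_stat >= car_value),
  error = function(e) e
)
inherits(res, "error")
```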

Answer 1

Score: 5

Using your variables and values, you can turn those into filtering expressions. Here we use the base R Map and bquote functions:

car_stat <- c('mpg', 'hp')
car_value <- c(15, 100)

criteria <- unname(Map(function(c, v) bquote(.(as.name(c)) >= .(v)), car_stat, car_value))
criteria
# [[1]]
# mpg >= 15
#
# [[2]]
# hp >= 100

This creates a list of the expressions you want for your filter. Then you can inject them into filter with !!!:

dplyr::filter(mtcars, !!!criteria)
#                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
# Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
# ...
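
As a variation on the same idea, here is a minimal sketch that builds the conditions with rlang's expr() and the .data pronoun instead of bquote() and as.name(). The names build_criteria and res are hypothetical, introduced for illustration rather than taken from the answer above. Referring to columns as .data[[col]] keeps the lookup inside the data frame, so a column name is not confused with a variable of the same name in the calling environment.

```r
library(dplyr)
library(rlang)

car_stat <- c('mpg', 'hp')
car_value <- c(15, 100)

# hypothetical helper: one `.data[[col]] >= value` expression per pair
build_criteria <- function(cols, values) {
  stopifnot(length(cols) == length(values))
  # Map() names its result after `cols`; filter() rejects named arguments,
  # so the names are dropped before splicing
  unname(Map(function(col, val) expr(.data[[!!col]] >= !!val), cols, values))
}

criteria <- build_criteria(car_stat, car_value)

# !!! splices the list so each expression becomes its own filter() argument;
# filter() combines its arguments with a logical AND
res <- filter(mtcars, !!!criteria)
```

Because filter() combines separate arguments with AND, this should select the same rows as filter(mtcars, mpg >= 15 & hp >= 100).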

Answer 2

Score: 1

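Here is another approach that leverages the env parameter of data.table 1.14.9:

library(data.table)
cars = setDT(copy(mtcars))
# the output below carries a row-index column id; presumably it was added with a step like this
cars[, id := .I]

# filter on one column/value pair at a time, then keep only the rows common to all results;
# Reduce applies fintersect pairwise, so this also works with more than two pairs
Reduce(
  fintersect,
  lapply(seq_along(car_stat), \(i) cars[k >= z, env = list(k = car_stat[i], z = car_value[i])])
)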

Output:

     mpg cyl  disp  hp drat    wt  qsec vs am gear carb id
 1: 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4  1
 2: 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4  2
 3: 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1  4
 4: 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2  5
 5: 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1  6
 6: 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4 10
 7: 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4 11
 8: 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3 12
 9: 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3 13
10: 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3 14
11: 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2 22
12: 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2 23
13: 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2 25
14: 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2 28
15: 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4 29
16: 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6 30
17: 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8 31
18: 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2 32
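
If a data.table version with the env argument is not available, a similar result can be sketched with a single logical mask instead of intersecting filtered copies. This is a different technique from the answer above, and the names cars_dt, masks, keep, and res are hypothetical: each column is looked up by name with [[, compared with its threshold, and the per-column results are combined with &.

```r
library(data.table)

car_stat <- c('mpg', 'hp')
car_value <- c(15, 100)

cars_dt <- as.data.table(mtcars)

# one logical vector per (column, threshold) pair
masks <- Map(function(col, val) cars_dt[[col]] >= val, car_stat, car_value)

# AND the vectors together; this works for any number of pairs
keep <- Reduce(`&`, masks)

# subset the rows where every condition holds
res <- cars_dt[keep]
```

Building one mask avoids creating a filtered copy of the table per condition, which may matter on a larger data frame.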

huangapple
  • Published on 2023-03-04 06:31:17
  • Please keep the link to this article when reposting: https://go.coder-hub.com/75632388.html