Generate an R function to filter data from a new object
Question
I'm trying to develop a new object (for bioinformatics), and I want to extract information from some of its elements with a function similar to dplyr's filter. The object contains a data frame that I want to filter. The problem is that some columns of the data frame are always present and others are not, so I want a function that can filter on the columns that are not always present.
Something like:
my_function <- function(object, parametersXYZ){
  data <- object@gene_table
  # generate the filter
  result <- filter(data, parametersXYZ)
  return(result)
}
Here, parametersXYZ stands for the column, the logical operator (==, !=, %in%) and the value to filter on, so the function would be used like this:
my_function(myobject, gene == "rtxA")
and it would keep all the rows with rtxA in the gene column. I looked for examples but could not find any, so I tried reading the code of filter in dplyr, but I'm not sure how to implement it.
This is an example of the data frame:
myobject[["gene_table"]] %>% head()
cluster qseqid bp nseqs sample gene VF
1 cluster_00001 IOHEFJOD_02210 15627 20 Sample-001 rtxA rtxA
2 cluster_00001 CJMKIBHP_00364 15621 20 Sample-002 rtxA rtxA
3 cluster_00001 JEJKLKDJ_00421 15621 20 Sample-003 rtxA rtxA
4 cluster_00001 MOOCIOKH_00638 15621 20 Sample-004 rtxA rtxA
5 cluster_00001 HJJCNJPA_01986 15621 20 Sample-005 rtxA rtxA
6 cluster_00001 MDIJOING_00449 15621 20 Sample-006 rtxA rtxA
The first 4 columns are always present, but the rest can have different names, and there can be any number of them. Those columns are my problem.
Any manual, suggestion or idea is welcome.
Thanks so much!
Answer 1
Score: 2
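Here is one way to have the expression evaluated, using the {{ }} (curly-curly) operator to pass it through to filter:
library(dplyr)
my_function <- function(object, parametersXYZ){
  # {{ }} forwards the caller's unevaluated expression to filter()
  filter(object@gene_table, {{parametersXYZ}})
}
Testing on a plain data frame, without the object@gene_table slot access (i.e. filtering object directly):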
> my_function(iris, Species == "setosa") %>% head
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
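To try the same pattern on an actual S4 object, here is a minimal sketch. The class name `GeneSet` and the toy table (including its made-up `qseqid` values) are assumptions for illustration only; the `gene_table` slot and the `{{ }}` forwarding are the parts taken from the question and the answer above.

```r
library(dplyr)
library(methods)

# Hypothetical S4 class with one data-frame slot, mirroring the
# question's object@gene_table; the class name is an assumption.
setClass("GeneSet", representation(gene_table = "data.frame"))

my_function <- function(object, parametersXYZ) {
  # {{ }} forwards the caller's unevaluated expression to filter(),
  # so column names are resolved inside the data frame.
  dplyr::filter(object@gene_table, {{ parametersXYZ }})
}

# Toy data: four fixed columns plus one optional column (gene).
toy <- data.frame(
  cluster = c("cluster_00001", "cluster_00001", "cluster_00002"),
  qseqid  = c("toy_id_1", "toy_id_2", "toy_id_3"),
  bp      = c(15627L, 15621L, 900L),
  nseqs   = c(20L, 20L, 3L),
  gene    = c("rtxA", "rtxA", "hlyA"),
  stringsAsFactors = FALSE
)
myobject <- new("GeneSet", gene_table = toy)

# Any expression that filter() accepts can be passed through.
res_eq <- my_function(myobject, gene == "rtxA")
res_in <- my_function(myobject, gene %in% c("hlyA", "ctxA") & bp < 1000)
```

Because the expression is handed to `filter()` untouched, the function does not need to know in advance which optional columns exist; an expression that names a missing column simply fails inside `filter()`.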
# Answer 2
**Score**: 1
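In base R you could do:
my_function <- function(object, parametersXYZ){
  # capture the filter expression without evaluating it
  par <- substitute(parametersXYZ)
  dat <- object@gene_table
  # evaluate the expression with the columns of dat in scope
  subset(dat, eval(par, dat))
}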
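Since the optional columns differ between objects, it may help to fail early when an expression mentions a column the table does not have. The sketch below is a hedged variant of the base R approach; the `GeneSet` class, the toy data, and the column check built on `all.vars()` are illustrative assumptions, not part of the original answer.

```r
library(methods)

# Hypothetical S4 class; only the gene_table slot comes from the question.
setClass("GeneSet", representation(gene_table = "data.frame"))

my_function <- function(object, parametersXYZ) {
  par    <- substitute(parametersXYZ)  # capture the expression unevaluated
  dat    <- object@gene_table
  caller <- parent.frame()

  # A symbol is "known" if it is a column, or a non-function variable
  # visible from the caller (e.g. a vector used with %in%).
  is_known <- function(v) {
    v %in% names(dat) ||
      (exists(v, envir = caller) && !is.function(get(v, envir = caller)))
  }
  vars    <- all.vars(par)
  unknown <- vars[!vapply(vars, is_known, logical(1))]
  if (length(unknown) > 0) {
    stop("unknown column(s): ", paste(unknown, collapse = ", "))
  }

  # Columns of dat are looked up first, then the caller's variables.
  keep <- eval(par, dat, caller)
  dat[keep & !is.na(keep), , drop = FALSE]
}

# Toy data with made-up identifiers.
toy <- data.frame(
  cluster = c("cluster_00001", "cluster_00001", "cluster_00002"),
  qseqid  = c("toy_id_1", "toy_id_2", "toy_id_3"),
  bp      = c(15627L, 15621L, 900L),
  nseqs   = c(20L, 20L, 3L),
  gene    = c("rtxA", "rtxA", "hlyA"),
  stringsAsFactors = FALSE
)
myobject <- new("GeneSet", gene_table = toy)

wanted  <- c("rtxA")
res_ne  <- my_function(myobject, gene != "rtxA")
res_in  <- my_function(myobject, gene %in% wanted)
# A misspelled column name is reported instead of silently misbehaving.
err_msg <- tryCatch(my_function(myobject, genes == "rtxA"),
                    error = function(e) conditionMessage(e))
```

Passing `parent.frame()` as the enclosure to `eval()` is a design choice here: it lets callers refer to their own variables (such as `wanted`) while column names still take precedence.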