R: remove rows in data frame for which all columns contain same content or nothing
Question
I have a data frame:
# create a data frame
V1 = c("gene_1", "gene_1", "", "")
V2 = c("gene_2", "gene_2", "", "")
V3 = c("gene_3", "gene_3", "gene_4", "")
V4 = c("gene_4", "gene_4", "", "")
V5 = c("gene_5", "gene_5", "gene_8", "")
V6 = c("gene_6", "gene_6", "gene_6", "gene_7")
df = as.data.frame(rbind(V1, V2, V3, V4, V5, V6))
The data frame df looks like this:
>        V1     V2     V3     V4
> V1 gene_1 gene_1
> V2 gene_2 gene_2
> V3 gene_3 gene_3 gene_4
> V4 gene_4 gene_4
> V5 gene_5 gene_5 gene_8
> V6 gene_6 gene_6 gene_6 gene_7
Now I want to remove every row that contains only one distinct gene label (ignoring empty cells), as well as any row that is entirely empty, resulting in:
>        V1     V2     V3     V4
> V3 gene_3 gene_3 gene_4
> V5 gene_5 gene_5 gene_8
> V6 gene_6 gene_6 gene_6 gene_7
I found several similar questions on Stack Overflow, including here, but none of those solutions work for my exact problem. I feel like this should be easy, but I can't figure out how to go about it.
Answer 1
Score: 0
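I found a solution based on another post that I found here: convert the empty strings to NA, then keep only the rows where at least one of V2 to V4 differs from V1.
library(dplyr)
# turn empty strings into NA so they never count as a differing label
df[df == ''] <- NA
# keep rows where any of V2:V4 is a known value different from V1
df %>% filter(if_any(V2:V4, ~ .x != V1))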
Gives:
>        V1     V2     V3     V4
> V3 gene_3 gene_3 gene_4   <NA>
> V5 gene_5 gene_5 gene_8   <NA>
> V6 gene_6 gene_6 gene_6 gene_7
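One caveat with comparing against V1: if the first column of a row is empty, every comparison comes out NA and the row is dropped even when the remaining columns hold different genes. That does not happen in the example data, but in case it matters, here is a minimal base R sketch that counts the distinct non-empty labels per row instead. The helper name n_distinct_labels is made up for illustration, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the example data frame from the question.
df <- as.data.frame(rbind(
  V1 = c("gene_1", "gene_1", "", ""),
  V2 = c("gene_2", "gene_2", "", ""),
  V3 = c("gene_3", "gene_3", "gene_4", ""),
  V4 = c("gene_4", "gene_4", "", ""),
  V5 = c("gene_5", "gene_5", "gene_8", ""),
  V6 = c("gene_6", "gene_6", "gene_6", "gene_7")
), stringsAsFactors = FALSE)

# Hypothetical helper: number of distinct labels in one row,
# ignoring empty strings and NA.
n_distinct_labels <- function(row) {
  labels <- row[!is.na(row) & row != ""]
  length(unique(labels))
}

# Keep only rows with more than one distinct label; rows that are
# entirely empty have zero labels and are dropped as well.
keep <- apply(df, 1, n_distinct_labels) > 1
result <- df[keep, , drop = FALSE]
print(result)
```

This leaves the empty cells as empty strings rather than NA, and it does not need dplyr. For very wide or very large data frames the row-wise apply may be slow, so treat it as a starting point.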