Correctly order/sort columns based on a string in colnames, tidyverse style, in R

Question

This is just a slice of a large data frame that I have:

dput(MyData)
structure(list(Frui1_Trea4_Ty4_0d = c(10L, 4L, 28L, 147L, 6L), 
    Frui1_Trea4_Ty4_14d = c(18L, 0L, 26L, 70L, 27L), Frui1_Trea4_Ty8_0d = c(9L, 
    1L, 21L, 168L, 6L), Frui1_Trea4_Ty8_14d = c(19L, 0L, 58L, 74L, 
    10L), Frui2_Trea4_Ty4_0d = c(40L, 6L, 39L, 141L, 15L), Frui2_Trea4_Ty4_14d = c(74L, 
    1L, 91L, 24L, 8L), Frui2_Trea4_Ty8_0d = c(22L, 0L, 50L, 54L, 
    17L), Frui2_Trea4_Ty8_14d = c(80L, 0L, 43L, 65L, 9L)), row.names = c("MTC88", 
"MTND2P28", "MTCO1P12", "MTATP6P1", "MTCO3P12"), class = "data.frame")

I have many other columns, but they follow the same "logic".

I've been struggling because I just want to reorder the columns so that all the columns whose names end with "_0d" come first in the data frame, and the ones ending with "_14d" are kept together at the end of the data frame.

I've tried

MyData %>% dplyr::select(sort(names(.)))

which works if I wanted to arrange alphabetically, but when I try something like:

  MyData %>% dplyr::select(names(stringr::str_sort("d0", "d7")))

I just get an error.
I suppose there's a workaround with select(contains(.)), but I can't seem to get it right.
Can anyone help? I have many more columns, and since I am also filtering for further analysis, I want to keep it the "tidyverse way".

Answer 1

Score: 1

One solution is to use ends_with():

MyData %>% 
  dplyr::select(ends_with("_0d"), ends_with("_14d"))
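
Note that select() keeps only the columns it is given, so any column matching neither suffix would be dropped. If there are more time points than these two, listing every suffix by hand gets tedious. The following is a minimal sketch, not part of the original answer, that sorts columns by the number in the suffix instead. It assumes every column name ends in "_<number>d"; the small data frame `df` and its "_7d" columns are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for MyData, with an extra "_7d" time point.
df <- data.frame(
  Frui1_Ty4_0d  = 1:2,
  Frui1_Ty4_7d  = 3:4,
  Frui1_Ty4_14d = 5:6,
  Frui2_Ty4_0d  = 7:8,
  Frui2_Ty4_7d  = 9:10,
  Frui2_Ty4_14d = 11:12
)

# Extract the number of days from each name, e.g. "Frui1_Ty4_14d" -> 14.
days <- as.integer(sub(".*_(\\d+)d$", "\\1", names(df)))

# order() leaves ties in their original order, so columns that share
# a time point keep their relative position.
sorted_names <- names(df)[order(days)]

result <- df %>% select(all_of(sorted_names))
names(result)
```

Sorting on the extracted integer rather than on the name itself avoids the alphabetical trap where "14d" would sort before "7d".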

Posted by huangapple on 2023-06-19 18:24:02
Source: https://go.coder-hub.com/76505720.html