Using pivot_longer on two sets of columns
Question
I've managed to achieve what I need, but the code is messy and I'm looking for a cleaner way to do it. I have two sets of columns that need pivoting in parallel.
data <- structure(list(well_short = c("A1", "A2", "A3", "A4", "A5", "A6",
"A7", "A8", "A9", "A10", "A11", "A12", "B1", "B2", "B3", "B4",
"B5", "B6", "B7", "B8"), reporter1 = c("FAM", "FAM", "FAM", "FAM",
"FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM",
"FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM"), reporter2 = c("VIC",
"VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC",
"VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC",
"VIC"), target1 = c(NA, NA, NA, NA, NA, "EIF4A2", "EIF4A2", "EIF4A2",
NA, NA, NA, NA, NA, NA, NA, NA, NA, "EIF4A2", "EIF4A2", "EIF4A2"
), target2 = c("GAPDH", "GAPDH", "GAPDH", NA, NA, "ATP5B", "ATP5B",
"ATP5B", NA, NA, NA, NA, "GAPDH", "GAPDH", "GAPDH", NA, NA, "ATP5B",
"ATP5B", "ATP5B")), row.names = c(NA, -20L), class = "data.frame")
Above is a simplified example that looks like this:
> head(data)
well_short reporter1 reporter2 target1 target2
1 A1 FAM VIC <NA> GAPDH
2 A2 FAM VIC <NA> GAPDH
3 A3 FAM VIC <NA> GAPDH
4 A4 FAM VIC <NA> <NA>
5 A5 FAM VIC <NA> <NA>
6 A6 FAM VIC EIF4A2 ATP5B
I'd like to pivot_longer the two reporter columns together and the two target columns together. I can achieve this with a two-step pivot_longer, then clean up the resulting data frame like this:
data_long <- data %>%
  pivot_longer(cols = starts_with("reporter"),
               names_to = "reporter_n",
               names_prefix = "reporter",
               values_to = "reporter") %>%
  pivot_longer(cols = starts_with("target"),
               names_to = "target_n",
               names_prefix = "target",
               values_to = "target") %>%
  filter(reporter_n == target_n,
         !is.na(target)) %>%
  select(-c(reporter_n, target_n))
Which produces this:
> head(data_long)
# A tibble: 6 × 3
well_short reporter target
<chr> <chr> <chr>
1 A1 VIC GAPDH
2 A2 VIC GAPDH
3 A3 VIC GAPDH
4 A6 FAM EIF4A2
5 A6 VIC ATP5B
6 A7 FAM EIF4A2
However, I feel there must be a cleaner and tidier way to achieve this.
Answer 1
Score: 1
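You can use names_pattern in pivot_longer to extract the "reporter" and "target" labels, and assign them to columns using the special keyword ".value" passed as an argument to names_to. The pattern "(.*)\\d$" captures everything before the trailing digit, so reporter1 and reporter2 both feed a single reporter column, and likewise for target. Then just remove the NA values by filtering with complete.cases: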
library(tidyverse)
pivot_longer(data, -1, names_pattern = "(.*)\\d$", names_to = ".value") %>%
  filter(complete.cases(.))
#> # A tibble: 18 x 3
#> well_short reporter target
#> <chr> <chr> <chr>
#> 1 A1 VIC GAPDH
#> 2 A2 VIC GAPDH
#> 3 A3 VIC GAPDH
#> 4 A6 FAM EIF4A2
#> 5 A6 VIC ATP5B
#> 6 A7 FAM EIF4A2
#> 7 A7 VIC ATP5B
#> 8 A8 FAM EIF4A2
#> 9 A8 VIC ATP5B
#> 10 B1 VIC GAPDH
#> 11 B2 VIC GAPDH
#> 12 B3 VIC GAPDH
#> 13 B6 FAM EIF4A2
#> 14 B6 VIC ATP5B
#> 15 B7 FAM EIF4A2
#> 16 B7 VIC ATP5B
#> 17 B8 FAM EIF4A2
#> 18 B8 VIC ATP5B
Created on 2023-02-15 with reprex v2.0.2
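As a possible variation on the approach above, here is a minimal sketch that also keeps the numeric suffix, so each row records which reporter/target pair it came from. The three-row data_demo frame (wells A1, A4 and A6 from the question) and the set column name are illustrative assumptions, not part of the original post. It uses tidyr's drop_na on target alone rather than complete.cases across all columns.

```r
library(tidyr)

# Hypothetical three-row subset of the question's data (wells A1, A4, A6)
data_demo <- data.frame(
  well_short = c("A1", "A4", "A6"),
  reporter1  = "FAM",
  reporter2  = "VIC",
  target1    = c(NA, NA, "EIF4A2"),
  target2    = c("GAPDH", NA, "ATP5B")
)

# First capture group -> output column name (".value");
# second capture group -> the illustrative "set" column holding the suffix
long <- pivot_longer(
  data_demo,
  cols = -well_short,
  names_to = c(".value", "set"),
  names_pattern = "(\\D+)(\\d+)"
)

# Keep only rows that actually have a target
long <- drop_na(long, target)

print(long)
```

Keeping the suffix is optional; if it is not needed afterwards, the set column can simply be dropped, which gives the same shape as the answer's output.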