Using `pivot_longer` on two sets of columns

Question

I've managed to achieve what I need, but the code is messy and I'm looking for a cleaner way to do it. I have two sets of columns that need pivoting in parallel.

    data <- structure(list(well_short = c("A1", "A2", "A3", "A4", "A5", "A6",
    "A7", "A8", "A9", "A10", "A11", "A12", "B1", "B2", "B3", "B4",
    "B5", "B6", "B7", "B8"), reporter1 = c("FAM", "FAM", "FAM", "FAM",
    "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM",
    "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM"), reporter2 = c("VIC",
    "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC",
    "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC",
    "VIC"), target1 = c(NA, NA, NA, NA, NA, "EIF4A2", "EIF4A2", "EIF4A2",
    NA, NA, NA, NA, NA, NA, NA, NA, NA, "EIF4A2", "EIF4A2", "EIF4A2"
    ), target2 = c("GAPDH", "GAPDH", "GAPDH", NA, NA, "ATP5B", "ATP5B",
    "ATP5B", NA, NA, NA, NA, "GAPDH", "GAPDH", "GAPDH", NA, NA, "ATP5B",
    "ATP5B", "ATP5B")), row.names = c(NA, -20L), class = "data.frame")

Above is a simplified example that looks like this:

    > head(data)
      well_short reporter1 reporter2 target1 target2
    1         A1       FAM       VIC    <NA>   GAPDH
    2         A2       FAM       VIC    <NA>   GAPDH
    3         A3       FAM       VIC    <NA>   GAPDH
    4         A4       FAM       VIC    <NA>    <NA>
    5         A5       FAM       VIC    <NA>    <NA>
    6         A6       FAM       VIC  EIF4A2   ATP5B

I'd like to pivot the two reporter columns together and the two target columns together. I can achieve this with a two-step pivot_longer and then cleaning up the resulting data frame, like this:

    library(dplyr)
    library(tidyr)

    data_long <- data %>%
      # First pivot: stack reporter1/reporter2, keeping the numeric suffix
      pivot_longer(cols = starts_with("reporter"),
                   names_to = "reporter_n",
                   names_prefix = "reporter",
                   values_to = "reporter") %>%
      # Second pivot: stack target1/target2 the same way
      pivot_longer(cols = starts_with("target"),
                   names_to = "target_n",
                   names_prefix = "target",
                   values_to = "target") %>%
      # Keep only matching reporter/target pairs with a non-missing target
      filter(reporter_n == target_n,
             !is.na(target)) %>%
      select(-c(reporter_n, target_n))

Which produces this:

    > head(data_long)
    # A tibble: 6 × 3
      well_short reporter target
      <chr>      <chr>    <chr>
    1 A1         VIC      GAPDH
    2 A2         VIC      GAPDH
    3 A3         VIC      GAPDH
    4 A6         FAM      EIF4A2
    5 A6         VIC      ATP5B
    6 A7         FAM      EIF4A2

However, I feel there must be a cleaner and tidier way to achieve this.
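
To illustrate why the two-step approach needs the filter on reporter_n == target_n, here is a minimal sketch on a single well. The one_well data frame is a hypothetical one-row extract built only for this example; it is not part of the original data object. Pivoting the two column sets one after the other pairs every reporter with every target, so the mismatched pairs have to be removed afterwards.

```r
library(dplyr)
library(tidyr)

# Hypothetical one-row input, modelled on well A6 from the question
one_well <- data.frame(
  well_short = "A6",
  reporter1 = "FAM", reporter2 = "VIC",
  target1 = "EIF4A2", target2 = "ATP5B",
  stringsAsFactors = FALSE
)

# Two sequential pivots: the second pivot repeats every reporter row
# once per target column, giving 2 x 2 = 4 rows for a single well
crossed <- one_well %>%
  pivot_longer(cols = starts_with("reporter"),
               names_to = "reporter_n",
               names_prefix = "reporter",
               values_to = "reporter") %>%
  pivot_longer(cols = starts_with("target"),
               names_to = "target_n",
               names_prefix = "target",
               values_to = "target")

nrow(crossed)  # 4

# Only the rows where the suffixes agree are real reporter/target pairs
matched <- crossed %>%
  filter(reporter_n == target_n)

nrow(matched)  # 2
```

The intermediate result doubles in size at the second pivot, which is the clean-up cost that a single pivot_longer call over both column sets would avoid.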

Answer 1

Score: 1

You can use names_pattern in pivot_longer to extract the "reporter" and "target" labels, and assign them to columns by passing the special keyword ".value" to names_to. Then remove the rows containing NA values by filtering on complete.cases:

    library(tidyverse)
    pivot_longer(data, -1, names_pattern = "(.*)\\d$", names_to = ".value") %>%
      filter(complete.cases(.))
    #> # A tibble: 18 x 3
    #>    well_short reporter target
    #>    <chr>      <chr>    <chr>
    #>  1 A1         VIC      GAPDH
    #>  2 A2         VIC      GAPDH
    #>  3 A3         VIC      GAPDH
    #>  4 A6         FAM      EIF4A2
    #>  5 A6         VIC      ATP5B
    #>  6 A7         FAM      EIF4A2
    #>  7 A7         VIC      ATP5B
    #>  8 A8         FAM      EIF4A2
    #>  9 A8         VIC      ATP5B
    #> 10 B1         VIC      GAPDH
    #> 11 B2         VIC      GAPDH
    #> 12 B3         VIC      GAPDH
    #> 13 B6         FAM      EIF4A2
    #> 14 B6         VIC      ATP5B
    #> 15 B7         FAM      EIF4A2
    #> 16 B7         VIC      ATP5B
    #> 17 B8         FAM      EIF4A2
    #> 18 B8         VIC      ATP5B

Created on 2023-02-15 with reprex v2.0.2
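
As a possible variation on the answer above, the sketch below keeps the numeric suffix as its own column and drops rows based on the target column only. It assumes a small hypothetical data frame, toy, with the same column layout as the question's data. The column name set and the pattern "(\\D+)(\\d+)" are choices made for this illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Hypothetical three-row input with the same layout as the question's data
toy <- data.frame(
  well_short = c("A1", "A4", "A6"),
  reporter1 = c("FAM", "FAM", "FAM"),
  reporter2 = c("VIC", "VIC", "VIC"),
  target1 = c(NA, NA, "EIF4A2"),
  target2 = c("GAPDH", NA, "ATP5B"),
  stringsAsFactors = FALSE
)

toy_long <- toy %>%
  pivot_longer(
    cols = -well_short,
    # ".value" sends the first capture group ("reporter"/"target") to
    # output column names; the second group (the digits) goes to "set"
    names_to = c(".value", "set"),
    names_pattern = "(\\D+)(\\d+)"
  ) %>%
  # Drop rows with a missing target, whatever the other columns hold
  drop_na(target)

toy_long
```

Here drop_na(target) only inspects the target column, whereas complete.cases(.) checks every column, so the two differ if some other column legitimately contains NA. Keeping the set column also records which reporter/target pair each row came from.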

Posted by huangapple on 2023-02-16 06:10:44
Source: https://go.coder-hub.com/75465900.html