Using pivot_longer on two sets of columns

Question

I've managed to achieve what I need, but the code is messy and I'm looking for a cleaner way to do it. I have two sets of columns that need pivoting in parallel.

data <- structure(list(well_short = c("A1", "A2", "A3", "A4", "A5", "A6",
"A7", "A8", "A9", "A10", "A11", "A12", "B1", "B2", "B3", "B4",
"B5", "B6", "B7", "B8"), reporter1 = c("FAM", "FAM", "FAM", "FAM",
"FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM",
"FAM", "FAM", "FAM", "FAM", "FAM", "FAM", "FAM"), reporter2 = c("VIC",
"VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC",
"VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC", "VIC",
"VIC"), target1 = c(NA, NA, NA, NA, NA, "EIF4A2", "EIF4A2", "EIF4A2",
NA, NA, NA, NA, NA, NA, NA, NA, NA, "EIF4A2", "EIF4A2", "EIF4A2"
), target2 = c("GAPDH", "GAPDH", "GAPDH", NA, NA, "ATP5B", "ATP5B",
"ATP5B", NA, NA, NA, NA, "GAPDH", "GAPDH", "GAPDH", NA, NA, "ATP5B",
"ATP5B", "ATP5B")), row.names = c(NA, -20L), class = "data.frame")

Above is a simplified example that looks like this:

> head(data)
  well_short reporter1 reporter2 target1 target2
1         A1       FAM       VIC    <NA>   GAPDH
2         A2       FAM       VIC    <NA>   GAPDH
3         A3       FAM       VIC    <NA>   GAPDH
4         A4       FAM       VIC    <NA>    <NA>
5         A5       FAM       VIC    <NA>    <NA>
6         A6       FAM       VIC  EIF4A2   ATP5B

I'd like to pivot the two reporter columns together and the two target columns together with pivot_longer. I can achieve this with a two-step pivot_longer, followed by cleaning up the resulting data frame like this:

library(tidyverse)

data_long <- data %>%
  # first pivot: stack reporter1/reporter2, keeping the numeric suffix
  pivot_longer(cols = starts_with("reporter"),
               names_to = "reporter_n",
               names_prefix = "reporter",
               values_to = "reporter") %>%
  # second pivot: stack target1/target2, keeping the numeric suffix
  pivot_longer(cols = starts_with("target"),
               names_to = "target_n",
               names_prefix = "target",
               values_to = "target") %>%
  # keep only matching reporter/target pairs that have a target
  filter(reporter_n == target_n,
         !is.na(target)) %>%
  select(-c(reporter_n, target_n))

Which produces this:

> head(data_long)
# A tibble: 6 × 3
  well_short reporter target
  <chr>      <chr>    <chr> 
1 A1         VIC      GAPDH 
2 A2         VIC      GAPDH 
3 A3         VIC      GAPDH 
4 A6         FAM      EIF4A2
5 A6         VIC      ATP5B 
6 A7         FAM      EIF4A2

However, I feel there must be a cleaner, tidier way to achieve this.

Answer 1

Score: 1

You can use names_pattern in pivot_longer to extract the "reporter" and "target" labels, and assign them to columns by passing the special keyword ".value" to names_to. Then remove the rows containing NA values by filtering on complete.cases.

library(tidyverse)

pivot_longer(data, -1, names_pattern = "(.*)\\d$", names_to = ".value") %>%
  filter(complete.cases(.))
#> # A tibble: 18 x 3
#>    well_short reporter target
#>    <chr>      <chr>    <chr> 
#>  1 A1         VIC      GAPDH 
#>  2 A2         VIC      GAPDH 
#>  3 A3         VIC      GAPDH 
#>  4 A6         FAM      EIF4A2
#>  5 A6         VIC      ATP5B 
#>  6 A7         FAM      EIF4A2
#>  7 A7         VIC      ATP5B 
#>  8 A8         FAM      EIF4A2
#>  9 A8         VIC      ATP5B 
#> 10 B1         VIC      GAPDH 
#> 11 B2         VIC      GAPDH 
#> 12 B3         VIC      GAPDH 
#> 13 B6         FAM      EIF4A2
#> 14 B6         VIC      ATP5B 
#> 15 B7         FAM      EIF4A2
#> 16 B7         VIC      ATP5B 
#> 17 B8         FAM      EIF4A2
#> 18 B8         VIC      ATP5B

Created on 2023-02-15 with reprex v2.0.2
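
As a possible variation on the answer above, the numeric suffix can be kept as its own column by giving names_to two entries and names_pattern two capture groups. The sketch below is only illustrative: the three-row data frame `toy` and the column name `n` are made up for the example, and `drop_na(target)` is swapped in for the `complete.cases` filter so that only a missing target removes a row.

```r
library(tidyr)

# Hypothetical three-row subset shaped like the question's data
toy <- data.frame(
  well_short = c("A1", "A4", "A6"),
  reporter1  = c("FAM", "FAM", "FAM"),
  reporter2  = c("VIC", "VIC", "VIC"),
  target1    = c(NA, NA, "EIF4A2"),
  target2    = c("GAPDH", NA, "ATP5B")
)

long <- toy |>
  pivot_longer(
    cols = -well_short,
    # first capture group -> output column name (".value"),
    # second capture group -> the new "n" column holding the suffix
    names_to = c(".value", "n"),
    names_pattern = "^([a-z]+)(\\d+)$"
  ) |>
  # drop reporter/target pairs that have no target
  drop_na(target)

print(long)
```

Keeping `n` makes it possible to tell which reporter/target pair each row came from; it can be removed afterwards with `dplyr::select(-n)` if it is not needed.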

huangapple
  • Posted on 2023-02-16 06:10:44
  • When reposting, please keep this article's link: https://go.coder-hub.com/75465900.html