2023-02-16 06:10:44

Using pivot_longer on two sets of columns

Question

I've managed to achieve what I need, but the code is messy and I'm looking for a cleaner way to do it. I have two sets of columns that need pivoting in parallel.

data <- structure(list(
  well_short = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "A10",
                 "A11", "A12", "B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8"),
  reporter1 = rep("FAM", 20),
  reporter2 = rep("VIC", 20),
  target1 = c(NA, NA, NA, NA, NA, "EIF4A2", "EIF4A2", "EIF4A2",
              NA, NA, NA, NA, NA, NA, NA, NA, NA, "EIF4A2", "EIF4A2", "EIF4A2"),
  target2 = c("GAPDH", "GAPDH", "GAPDH", NA, NA, "ATP5B", "ATP5B", "ATP5B",
              NA, NA, NA, NA, "GAPDH", "GAPDH", "GAPDH", NA, NA, "ATP5B", "ATP5B", "ATP5B")
), row.names = c(NA, -20L), class = "data.frame")

Above is a simplified example that looks like this:

> head(data)
  well_short reporter1 reporter2 target1 target2
1         A1       FAM       VIC    <NA>   GAPDH
2         A2       FAM       VIC    <NA>   GAPDH
3         A3       FAM       VIC    <NA>   GAPDH
4         A4       FAM       VIC    <NA>    <NA>
5         A5       FAM       VIC    <NA>    <NA>
6         A6       FAM       VIC  EIF4A2   ATP5B

I'd like to pivot_longer the two reporter columns together and the two target columns together. I can achieve this with a two-step pivot_longer, followed by cleaning up the resulting data frame, like this:

library(tidyverse)  # provides %>%, pivot_longer(), filter() and select()

data_long <- data %>%
  pivot_longer(cols = starts_with("reporter"),
               names_to = "reporter_n",
               names_prefix = "reporter",
               values_to = "reporter") %>%
  pivot_longer(cols = starts_with("target"),
               names_to = "target_n",
               names_prefix = "target",
               values_to = "target") %>%
  # keep only matching reporter/target pairs that have a target
  filter(reporter_n == target_n,
         !is.na(target)) %>%
  select(-c(reporter_n, target_n))

Which produces this:

> head(data_long)
# A tibble: 6 × 3
  well_short reporter target
  <chr>      <chr>    <chr> 
1 A1         VIC      GAPDH 
2 A2         VIC      GAPDH 
3 A3         VIC      GAPDH 
4 A6         FAM      EIF4A2
5 A6         VIC      ATP5B 
6 A7         FAM      EIF4A2

However, I feel there must be a cleaner and tidier way to achieve this. Is there one?

Answer 1

Score: 1

You can use names_pattern in pivot_longer to extract the "reporter" and "target" labels, and assign them to columns by passing the special keyword ".value" to names_to. Then remove the rows containing NA values by filtering on complete.cases.

library(tidyverse)
pivot_longer(data, -1, names_pattern = "(.*)\\d$", names_to = ".value") %>%
  filter(complete.cases(.))
#> # A tibble: 18 x 3
#>    well_short reporter target
#>    <chr>      <chr>    <chr> 
#>  1 A1         VIC      GAPDH 
#>  2 A2         VIC      GAPDH 
#>  3 A3         VIC      GAPDH 
#>  4 A6         FAM      EIF4A2
#>  5 A6         VIC      ATP5B 
#>  6 A7         FAM      EIF4A2
#>  7 A7         VIC      ATP5B 
#>  8 A8         FAM      EIF4A2
#>  9 A8         VIC      ATP5B 
#> 10 B1         VIC      GAPDH 
#> 11 B2         VIC      GAPDH 
#> 12 B3         VIC      GAPDH 
#> 13 B6         FAM      EIF4A2
#> 14 B6         VIC      ATP5B 
#> 15 B7         FAM      EIF4A2
#> 16 B7         VIC      ATP5B 
#> 17 B8         FAM      EIF4A2
#> 18 B8         VIC      ATP5B

Created on 2023-02-15 with reprex v2.0.2
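
As a variation on the approach above, here is a minimal sketch that also keeps the numeric suffix as its own column, in case it matters which reporter/target pair a row came from. The column name `set`, the pattern "(\\D+)(\\d+)$" and the cut-down three-row `data` are my own choices for illustration, not part of the original answer. I have also swapped in drop_na(target) for filter(complete.cases(.)), on the assumption that only a missing target should remove a row.

```r
library(tidyr)

# Cut-down stand-in for the question's data (three wells only, for illustration).
data <- data.frame(
  well_short = c("A1", "A4", "A6"),
  reporter1  = c("FAM", "FAM", "FAM"),
  reporter2  = c("VIC", "VIC", "VIC"),
  target1    = c(NA, NA, "EIF4A2"),
  target2    = c("GAPDH", NA, "ATP5B"),
  stringsAsFactors = FALSE
)

# First capture group -> ".value": its text ("reporter", "target") names the output columns.
# Second capture group -> "set": the numeric suffix, kept as a character column.
data_long <- pivot_longer(
  data,
  cols          = -well_short,
  names_to      = c(".value", "set"),
  names_pattern = "(\\D+)(\\d+)$"
)

# Drop only the rows where no target was assigned.
data_long <- drop_na(data_long, target)

print(data_long)
```

Because ".value" is the first element of names_to, the first capture group decides which output column each value lands in, while the second group fills `set`. If the suffix is not needed afterwards, that column can simply be dropped again.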
