2023-07-06 16:31:27

Reshape data with multiple columns to long format

Question

I'm pretty sure a question like this has been asked before, but I cannot find one.

This is my dataset:

data.frame(Group = c("a", "b"),
           MEAN_A = 1:2, 
           MEAN_B = 3:4, 
           MED_A = 5:6, 
           MED_B = 7:8) 
  Group MEAN_A MEAN_B MED_A MED_B
1     a      1      3     5     7
2     b      2      4     6     8

What I want is the following:

data.frame(Group = c("a", "a", "b", "b"),
           Name = c("MEAN", "MED", "MEAN", "MED"),
           Value_A = c(1, 5, 2, 6),
           Value_B = c(3, 7, 4, 8))
  Group Name Value_A Value_B
1     a MEAN       1       3
2     a  MED       5       7
3     b MEAN       2       4
4     b  MED       6       8

So I want to keep the variable Group, add a new column that tells me whether the original variable came from MEAN or MED, and have two value columns for A and B, the suffixes that originally followed MEAN or MED in the variable names.

I've already tried pivot_longer, even with patterns, but I'm not able to get my desired output.

Answer 1

Score: 1

Here's one approach:

Select Group and the MEAN columns, replace the "MEAN" prefix with "Value", and add a Name column set to "MEAN".
Select Group and the MED columns, replace the "MED" prefix with "Value", and add a Name column set to "MED".
Bind the two frames together, reorder the columns, and sort by Group:

library(dplyr)

# df is the data frame from the question
df %>%
  select(1:3) %>%
  rename_with(~gsub(pattern = "MEAN", replacement = "Value", .), .cols = starts_with("MEAN")) %>%
  mutate(Name = "MEAN") %>%
  rbind(df %>%
          select(c(1, 4, 5)) %>%
          rename_with(~gsub(pattern = "MED", replacement = "Value", .), .cols = starts_with("MED")) %>%
          mutate(Name = "MED")) %>%
  select(Group, Name, Value_A, Value_B) %>%
  arrange(Group)

gives

  Group Name Value_A Value_B
1     a MEAN       1       3
2     a  MED       5       7
3     b MEAN       2       4
4     b  MED       6       8
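
The same select, rename, and bind steps can be written once and applied per prefix. This is only a sketch: the helper name stack_prefix is hypothetical (it is not part of dplyr), and it assumes every measure column is named PREFIX_SUFFIX as in the question.

```r
library(dplyr)

# Example data from the question
df <- data.frame(Group = c("a", "b"),
                 MEAN_A = 1:2, MEAN_B = 3:4,
                 MED_A = 5:6, MED_B = 7:8)

# Hypothetical helper: keep Group plus the columns for one prefix,
# swap the prefix for "Value", and record the prefix in a Name column.
stack_prefix <- function(data, prefix) {
  data %>%
    select(Group, starts_with(paste0(prefix, "_"))) %>%
    rename_with(~ sub(paste0("^", prefix), "Value", .x), .cols = -Group) %>%
    mutate(Name = prefix, .after = Group)
}

# Build one frame per prefix, stack them, and sort.
stacked <- bind_rows(lapply(c("MEAN", "MED"), stack_prefix, data = df)) %>%
  arrange(Group, Name)

stacked
```

Adding another statistic (say a MAX_A / MAX_B pair) would then only require extending the prefix vector.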

Edit: another tidy approach:

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(cols = any_of(c(ends_with("_A"), ends_with("_B"))),
               names_to = c("Name", ".value"),
               names_sep = "_") %>%
  rename(Value_A = A, Value_B = B)

# A tibble: 4 × 4
  Group Name  Value_A Value_B
  <chr> <chr>   <int>   <int>
1 a     MEAN        1       3
2 a     MED         5       7
3 b     MEAN        2       4
4 b     MED         6       8
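
For anyone who wants to run the pivot_longer approach end to end, here is a self-contained sketch. It differs from the answer's code in two small ways that are my own choices, not the answerer's: it selects the columns to pivot with cols = -Group (everything except Group), and it prefixes the new columns with rename_with() so the suffixes do not have to be listed by hand. The ".value" entry in names_to tells pivot_longer that the second piece of each column name (A or B) should become an output column name rather than a value.

```r
library(dplyr)
library(tidyr)

# Example data from the question
df <- data.frame(Group = c("a", "b"),
                 MEAN_A = 1:2,
                 MEAN_B = 3:4,
                 MED_A = 5:6,
                 MED_B = 7:8)

long <- df %>%
  # Split each name at "_": the first piece goes into Name,
  # the second piece (".value") becomes the output columns A and B.
  pivot_longer(cols = -Group,
               names_to = c("Name", ".value"),
               names_sep = "_") %>%
  # Turn A and B into Value_A and Value_B.
  rename_with(~ paste0("Value_", .x), .cols = c(A, B))

long
```

Note that names_sep = "_" assumes each column name contains exactly one underscore. With names that contain several, names_pattern with a regular expression would be the safer choice.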
