将具有多列的数据重塑为长格式。

huangapple go评论93阅读模式
英文:

Reshape data with multiple columns to long format

问题

我相当确定,类似这样的问题以前已经被问过,但我找不到任何答案。

这是我的数据集:

  1. data.frame(Group = c("a", "b"),
  2. MEAN_A = 1:2,
  3. MEAN_B = 3:4,
  4. MED_A = 5:6,
  5. MED_B = 7:8)
  6. Group MEAN_A MEAN_B MED_A MED_B
  7. 1 a 1 3 5 7
  8. 2 b 2 4 6 8

我想要的是:

  1. data.frame(Group = c("a", "a", "b", "b"),
  2. Name = c("MEAN", "MED", "MEAN", "MED"),
  3. Value_A = c(1, 5, 2, 6),
  4. Value_B = c(3, 7, 4, 8))
  5. Group Name Value_A Value_B
  6. 1 a MEAN 1 3
  7. 2 a MED 5 7
  8. 3 b MEAN 2 4
  9. 4 b MED 6 8

所以我想保留变量Group,并有一个新的列告诉我原始变量是来自MEAN还是MED,以及两列值AB,最初在MEANMED之后的变量名中。

我已经尝试过pivot_longer,甚至使用了模式,但是我无法获得我期望的输出。

英文:

I'm pretty sure, that a question like this was asked before but I cannot find any.

This is my dataset:

  1. data.frame(Group = c("a", "b"),
  2. MEAN_A = 1:2,
  3. MEAN_B = 3:4,
  4. MED_A = 5:6,
  5. MED_B = 7:8)
  6. Group MEAN_A MEAN_B MED_A MED_B
  7. 1 a 1 3 5 7
  8. 2 b 2 4 6 8

What I want is the following:

  1. data.frame(Group = c("a", "a", "b", "b"),
  2. Name = c("MEAN", "MED", "MEAN", "MED"),
  3. Value_A = c(1, 5, 2, 6),
  4. Value_B = c(3, 7, 4, 8))
  5. Group Name Value_A Value_B
  6. 1 a MEAN 1 3
  7. 2 a MED 5 7
  8. 3 b MEAN 2 4
  9. 4 b MED 6 8

So I want to keep the variable Group and have a new column which tells me, if the original variable was from MEAN or MED and two columns with the Values of A and B, that where initially in the variable names after MEAN or MED.

I've already tried pivot_longer, even with patterns, but I'm not able to get my desired output.

答案1

得分: 1

以下是翻译好的代码部分:

第一种方法:

  1. 选择组和"mean"列,将均值重命名为"Value_",添加"mean"标识符。
  2. 选择组和"med"列,将中位数重命名为"Value_",添加"med"标识符。
  3. 绑定这些框架,按"Group"排序:
  4. df %>%
  5. select(1:3) %>%
  6. rename_with(~gsub(pattern = "MEAN", replacement = "Value", .), .cols = starts_with("MEAN")) %>%
  7. mutate(Name = "MEAN") %>%
  8. rbind(df %>%
  9. select(c(1,4,5)) %>%
  10. rename_with(~gsub(pattern = "MED", replacement = "Value", .), .cols = starts_with("MED")) %>%
  11. mutate(Name = "MED")) %>%
  12. select(Group, Name, Value_A, Value_B) %>%
  13. arrange(Group)

结果如下:

  1. Group Name Value_A Value_B
  2. 1 a MEAN 1 3
  3. 2 a MED 5 7
  4. 3 b MEAN 2 4
  5. 4 b MED 6 8

编辑:另一种整洁的方法:

  1. df %>%
  2. pivot_longer(cols = any_of(c(ends_with("_A"), ends_with("_B"))),
  3. names_to = c("Name", ".value"),
  4. names_sep = "_") %>%
  5. rename(Value_A = A, Value_B = B)

结果如下:

  1. # A tibble: 4 × 4
  2. Group Name Value_A Value_B
  3. 1 a MEAN 1 3
  4. 2 a MED 5 7
  5. 3 b MEAN 2 4
  6. 4 b MED 6 8

请注意,这是您提供的代码的翻译版本,没有其他附加内容。

英文:

Here's one approach:

Select the group and "mean" columns, rename the means to "Value_", add a "mean" identifier.
Select the group and "med" columns, rename the meds to "Value_", add a "med" identifier.
bind the frames together, sort by "Group":

  1. df %>% select(1:3) %>%
  2. rename_with(~gsub(pattern = "MEAN", replacement = "Value", .), .cols = starts_with("MEAN")) %>%
  3. mutate(Name = "MEAN") %>%
  4. rbind(df %>%
  5. select(c(1,4,5)) %>%
  6. rename_with(~gsub(pattern = "MED", replacement = "Value", .), .cols = starts_with("MED")) %>%
  7. mutate(Name = "MED")) %>%
  8. select(Group, Name, Value_A, Value_B) %>%
  9. arrange(Group)

gives

  1. Group Name Value_A Value_B
  2. 1 a MEAN 1 3
  3. 2 a MED 5 7
  4. 3 b MEAN 2 4
  5. 4 b MED 6 8

Edit: another tidy approach:

  1. df %>%
  2. pivot_longer(cols = any_of(c(ends_with("_A"), ends_with("_B"))),
  3. names_to = c("Name", ".value"),
  4. names_sep = "_") %>%
  5. rename(Value_A = A, Value_B = B)
  6. # A tibble: 4 × 4
  7. Group Name Value_A Value_B
  8. <chr> <chr> <int> <int>
  9. 1 a MEAN 1 3
  10. 2 a MED 5 7
  11. 3 b MEAN 2 4
  12. 4 b MED 6 8

huangapple
  • 本文由 发表于 2023年7月6日 16:31:27
  • 转载请务必保留本文链接:https://go.coder-hub.com/76626940.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定