英文:
R pivot longer multiple columns
问题
我已经尝试了很多次,甚至ChatGPT也无法帮助我:D
以下是你提供的代码的翻译部分:
library(tidyr)
# 创建输出数据集
data <- data.frame(
id = 1:12,
f51a_01 = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3),
f51a_02 = c(1, 2, 3, 6, 1, 2, 3, 4, 5, 6, 1, 6),
f51b_01 = c(1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 3, 3),
f51b_02 = c(1, 2, 2, 6, 6, 6, 6, 6, 6, 6, 1, 6)
)
期望的输出:
id var var_01 var_02
1 f51a 1 1
1 f51b 1 1
2 f51a 1 2
2 f51b 1 2
3 f51a 1 3
3 f51b 2 2
4 f51a 1 6
4 f51b 1 6
5 f51a 2 1
5 f51b 1 6
6 f51a 2 2
6 f51b 1 6
7 f51a 2 3
7 f51b 1 6
8 f51a 2 4
8 f51b 1 6
9 f51a 2 5
9 f51b 1 6
10 f51a 2 6
10 f51b 1 6
11 f51a 3 1
11 f51b 3 1
12 f51a 3 6
12 f51b 3 6
至于你的解决方案,它已经能够正常工作,但可能不是最优解。如果你需要更多提示或改进建议,请告诉我。谢谢!
英文:
have already tried quite a lot and even chatgpt could not help me
The following case:
library(tidyr)
# create the output dataset
data <- data.frame(
id = 1:12,
f51a_01 = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3),
f51a_02 = c(1, 2, 3, 6, 1, 2, 3, 4, 5, 6, 1, 6),
f51b_01 = c(1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 3, 3),
f51b_02 = c(1, 2, 2, 6, 6, 6, 6, 6, 6, 6, 1, 6)
)
expected output:
id var var_01 var_02
1 f51a 1 1
1 f51b 1 1
2 f51a 1 2
2 f51b 1 2
3 f51a 1 3
3 f51b 2 2
4 f51a 1 6
4 f51b 1 6
5 f51a 2 1
5 f51b 1 6
6 f51a 2 2
6 f51b 1 6
7 f51a 2 3
7 f51b 1 6
8 f51a 2 4
8 f51b 1 6
9 f51a 2 5
9 f51b 1 6
10 f51a 2 6
10 f51b 1 6
11 f51a 3 1
11 f51b 3 1
12 f51a 3 6
12 f51b 3 6
My solution so far works, but is not optimal.
data%>%
select(id, matches("f51[a-z]_01"))%>%
pivot_longer(cols = -id, names_to = "var", values_to = "var_01")%>%
mutate(var = str_extract(var, "f51[a-z]"))%>%
left_join(data2%>%
select(id, matches("f51[a-z]_02"))%>%
pivot_longer(cols = -id, names_to = "var", values_to = "var_02")%>%
mutate(var = str_extract(var, "f51[a-z]")))
Glad about every hint. Thanks
答案1
得分: 0
当您想要同时将多个变量放入长格式中,就像这里一样,您应该结合使用names_to
和names_pattern
或names_sep
。在names_to
中,您定义新的列名,.value
表示:
".value" 表示列名的相应组件
定义包含单元格值的输出列的名称,
覆盖了完全的 values_to。
pivot_longer(data, -id, names_sep = "_", names_to = c("var", ".value"))
# id var `01` `02`
# 1 1 f51a 1 1
# 2 1 f51b 1 1
# 3 2 f51a 1 2
# 4 2 f51b 1 2
# 5 3 f51a 1 3
# 6 3 f51b 2 2
# 7 4 f51a 1 6
# 8 4 f51b 1 6
# 9 5 f51a 2 1
# 10 5 f51b 1 6
# 11 6 f51a 2 2
# 12 6 f51b 1 6
# 13 7 f51a 2 3
# 14 7 f51b 1 6
# 15 8 f51a 2 4
# 16 8 f51b 1 6
# 17 9 f51a 2 5
# 18 9 f51b 1 6
# 19 10 f51a 2 6
# 20 10 f51b 1 6
# 21 11 f51a 3 1
# 22 11 f51b 3 1
# 23 12 f51a 3 6
# 24 12 f51b 3 6
英文:
When you want to simultaneously put multiple variables in long format, like here, you should combine names_to
and names_pattern
or names_sep
. In names_to
, where you define new column names, .value
refers to:
> ".value" indicates that the corresponding component of the column name
> defines the name of the output column containing the cell values,
> overriding values_to entirely.
pivot_longer(data, -id, names_sep = "_", names_to = c("var", ".value"))
# id var `01` `02`
# 1 1 f51a 1 1
# 2 1 f51b 1 1
# 3 2 f51a 1 2
# 4 2 f51b 1 2
# 5 3 f51a 1 3
# 6 3 f51b 2 2
# 7 4 f51a 1 6
# 8 4 f51b 1 6
# 9 5 f51a 2 1
# 10 5 f51b 1 6
# 11 6 f51a 2 2
# 12 6 f51b 1 6
# 13 7 f51a 2 3
# 14 7 f51b 1 6
# 15 8 f51a 2 4
# 16 8 f51b 1 6
# 17 9 f51a 2 5
# 18 9 f51b 1 6
# 19 10 f51a 2 6
# 20 10 f51b 1 6
# 21 11 f51a 3 1
# 22 11 f51b 3 1
# 23 12 f51a 3 6
# 24 12 f51b 3 6
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论