英文:
Reshaping data into long forma
问题
I have reshaped the data as per your request:
Name Value A B
1 ST01 1 0.020 0.010
2 ST01 2 0.040 0.020
3 ST01 3 0.050 0.040
4 ST01 4 0.060 0.080
5 ST01 5 0.080 0.090
6 ST02 1 0.036 0.003
7 ST02 2 0.034 0.002
8 ST02 3 0.054 0.008
9 ST02 4 0.034 0.009
10 ST02 5 0.240 0.004
11 ST03 1 0.140 0.040
12 ST03 2 0.320 0.020
13 ST03 3 0.560 0.060
14 ST03 4 0.780 0.080
15 ST03 5 0.590 0.050
This format should allow you to use ggplot to plot the data as you described. If you have any more questions or need further assistance, feel free to ask.
英文:
I have two large data frames, the sample below shows what they look like:
Name<-c("ST01","ST02","ST03")
B1<-c(0.01,0.003,0.04)
B2<-c(0.02,0.002,0.02)
B3<-c(0.04,0.008,0.06)
B4<-c(0.08,0.009,0.08)
B5<-c(0.09,0.004,0.05)
df1<-data.frame(Name,B1,B2,B3,B4,B5)
Name A1 A2 A3 A4 A5
1 ST01 0.020 0.040 0.050 0.060 0.08
2 ST02 0.036 0.034 0.054 0.034 0.24
3 ST03 0.140 0.320 0.560 0.780 0.59
Name<-c("ST01","ST02","ST03")
A1<-c(0.02,0.036,0.14)
A2<-c(0.04,0.034,0.32)
A3<-c(0.05,0.054,0.56)
A4<-c(0.06,0.034,0.78)
A5<-c(0.08,0.24,0.59)
df2<-data.frame(Name,A1,A2,A3,A4,A5)
Name B1 B2 B3 B4 B5
1 ST01 0.010 0.020 0.040 0.080 0.090
2 ST02 0.003 0.002 0.008 0.009 0.004
3 ST03 0.040 0.020 0.060 0.080 0.050
I would like to reshape them so that I can use ggplot to plot all the As against Bs and then plot As against Bs for each value, my desired output would be something like this:
Name Value A B
1 ST01 1 0.020 0.010
2 ST01 2 0.040 0.020
3 ST01 3 0.050 0.040
4 ST01 4 0.060 0.080
5 ST01 5 0.080 0.090
6 STO2 1 0.036 0.003
7 STO2 2 0.034 0.002
8 STO2 3 0.054 0.008
9 STO2 4 0.034 0.009
10 STO2 5 0.240 0.004
11 ST03 1 0.140 0.040
12 ST03 2 0.320 0.020
13 ST03 3 0.560 0.060
14 ST03 4 0.780 0.080
15 ST03 5 0.590 0.050
>
I am new to R so not quite if this can be done, or if there is a more appropriate way to shape the data in order to plot. Any help would be appreciated.
答案1
得分: 1
你可以首先使用 left_join
合并你的数据集,然后利用 pivot_longer
的 names_pattern
参数和特殊的 ".value"
来将数据重塑为长格式。对于 names_pattern
参数,我使用正则表达式 "(.) (.)"
来将列名拆分为两个单字符字符串,例如,"A1" 被拆分为 "A" 和 "1",其中第一部分 ("A") 给出了值列的名称,第二部分 ("1") 存储在名称列 ("Value") 中作为类别:
英文:
You could first merge your datasets using a left_join
then reshape to long by utilizing the names_pattern
argument of pivot_longer
and the special ".value"
. For the names_pattern
argument I use the regex "(.)(.)"
to split the column names into two one character strings, e.g. "A1"
is split in to "A"
and "1"
where the first part ("A"
) gives the name of the value column and the second part ("1"
) is stored as category in the name column ("Value"
):
library(dplyr, warn = FALSE)
library(tidyr)
left_join(df2, df1, by = "Name") |>
pivot_longer(-Name, names_to = c(".value", "Value"), names_pattern = "(.)(.)")
#> # A tibble: 15 × 4
#> Name Value A B
#> <chr> <chr> <dbl> <dbl>
#> 1 ST01 1 0.02 0.01
#> 2 ST01 2 0.04 0.02
#> 3 ST01 3 0.05 0.04
#> 4 ST01 4 0.06 0.08
#> 5 ST01 5 0.08 0.09
#> 6 ST02 1 0.036 0.003
#> 7 ST02 2 0.034 0.002
#> 8 ST02 3 0.054 0.008
#> 9 ST02 4 0.034 0.009
#> 10 ST02 5 0.24 0.004
#> 11 ST03 1 0.14 0.04
#> 12 ST03 2 0.32 0.02
#> 13 ST03 3 0.56 0.06
#> 14 ST03 4 0.78 0.08
#> 15 ST03 5 0.59 0.05
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论