如何根据两列而不是单列来将数据透视成宽格式。

huangapple go评论73阅读模式
英文:

How to pivot table to wide based on two colums instead of single

问题

我有这样的数据

df <- data.frame(ptid = rep(1:2, each = 6),
num = rep(1:3, each = 4),
type = rep(c("type1", "type2"), each = 6),
time = rep(1:3, times = 4),
volume = rnorm(12))

现在,它是长格式,我应该如何将其改为宽格式,基于时间和类型,而不是只有一个: 如 time1type1, time2type1, time3type1, time1type2, time2type2, time3type3, time3type3...?

我写了这段代码

df_wide <- df %>%
pivot_wider(names_from = c("time", "type"),
values_from = "volume",
names_prefix = "time",
names_sep = "")

但它说体积不足以识别,并且无法工作。谢谢帮助~~!
英文:

I have data like this

df &lt;- data.frame(ptid = rep(1:2, each = 6),
                 num = rep(1:3, each = 4),
                 type = rep(c(&quot;type1&quot;, &quot;type2&quot;), each = 6),
                 time = rep(1:3, times = 4),
                 volume = rnorm(12))


Now, it is long format, how should I change it to wide format based on both time and type instead one: like time1type1, time2type1, time3type1, time1type2, time2type2, time3type3, time3type3...?

I wrote this code

df_wide &lt;- df %&gt;%
  pivot_wider(names_from = c(&quot;time&quot;, &quot;type&quot;),
              values_from = &quot;volume&quot;,
              names_prefix = &quot;time&quot;,
              names_sep = &quot;&quot;)

but it says the volume is not sufficient to identify, and could not work. Thanks for help~~!

答案1

得分: 2

library(dplyr)
library(tidyr)

df %>%
  unite(time_type, c(time, type), sep = "") %>%
  pivot_wider(names_from = time_type,
              values_from = volume,
              names_prefix = "time",
              values_fn = list) %>%
  unnest(cols = c(time1type1, time2type1, time3type1, time1type2, time2type2, time3type2))
ptid   num time1type1 time2type1 time3type1 time1type2 time2type2 time3type2
<int> <int>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
1     1     1      0.864     -1.02       0.577     NA         NA         NA
2     1     1      1.33      -1.02       0.577     NA         NA         NA
3     1     2     NA         -0.909     -1.30      NA         NA         NA
4     2     2     NA         NA         NA          0.255     -0.549     NA
5     2     3     NA         NA         NA         -0.213      0.139     -1.59
6     2     3     NA         NA         NA         -0.213      0.139     -0.384

<details>
<summary>英文:</summary>

library(dplyr)
library(tidyr)

df %>%
unite(time_type, c(time, type), sep = "") %>%
pivot_wider(names_from = time_type,
values_from = volume,
names_prefix = "time",
values_fn = list) %>%
unnest(cols = c(time1type1, time2type1, time3type1, time1type2, time2type2, time3type2))

ptid num time1type1 time2type1 time3type1 time1type2 time2type2 time3type2
<int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 0.864 -1.02 0.577 NA NA NA
2 1 1 1.33 -1.02 0.577 NA NA NA
3 1 2 NA -0.909 -1.30 NA NA NA
4 2 2 NA NA NA 0.255 -0.549 NA
5 2 3 NA NA NA -0.213 0.139 -1.59
6 2 3 NA NA NA -0.213 0.139 -0.384


</details>



# 答案2
**得分**: 0

你有观测数据,这些数据匹配了ptid/num/type/time。如果你想让它们成为单独的行而不是列表,请添加一个标识符来区分它们:

```R
df %>%
  dplyr::mutate(iteration = dplyr::row_number(), .by = c(ptid:time)) %>%
  pivot_wider(names_from = c("time", "type"),
              values_from = "volume",
              names_prefix = "time",
              names_sep = "")

# 一个 tibble: 6 × 9
   ptid   num iteration time1type1 time2type1 time3type1 time1type2 time2type2 time3type2
  <int> <int>     <int>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
1     1     1         1     0.0202       1.09      -1.72      NA        NA         NA    
2     1     1         2    -0.473       NA         NA         NA        NA         NA    
3     1     2         1    NA           -1.06      -1.02      NA        NA         NA    
4     2     2         1    NA           NA         NA         -1.11     -0.494     NA    
5     2     3         1    NA           NA         NA         -1.34     -0.146     -1.08 
6     2     3         2    NA           NA         NA         NA        NA         -0.463

如果你需要更多帮助,请告诉我。

英文:

You have observations which match ptid/num/type/time. If you want them to be separate rows instead of a list in that spot, add an identifier to distinguish them:

df %&gt;%
  dplyr::mutate(iteration = dplyr::row_number(), .by = c(ptid:time)) %&gt;%
  pivot_wider(names_from = c(&quot;time&quot;, &quot;type&quot;),
              values_from = &quot;volume&quot;,
              names_prefix = &quot;time&quot;,
              names_sep = &quot;&quot;)

# A tibble: 6 &#215; 9
   ptid   num iteration time1type1 time2type1 time3type1 time1type2 time2type2 time3type2
  &lt;int&gt; &lt;int&gt;     &lt;int&gt;      &lt;dbl&gt;      &lt;dbl&gt;      &lt;dbl&gt;      &lt;dbl&gt;      &lt;dbl&gt;      &lt;dbl&gt;
1     1     1         1     0.0202       1.09      -1.72      NA        NA         NA    
2     1     1         2    -0.473       NA         NA         NA        NA         NA    
3     1     2         1    NA           -1.06      -1.02      NA        NA         NA    
4     2     2         1    NA           NA         NA         -1.11     -0.494     NA    
5     2     3         1    NA           NA         NA         -1.34     -0.146     -1.08 
6     2     3         2    NA           NA         NA         NA        NA         -0.463

huangapple
  • 本文由 发表于 2023年5月11日 02:52:04
  • 转载请务必保留本文链接:https://go.coder-hub.com/76221737.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定