July 31, 2023 18:17:12

Convert wide to long format

Question

I want to convert my data to long format. It is a repeated-measures design with 3 conditions.

This is what I have:

participant id	measure 1 (cond1)	measure 1 (cond2)	measure 1 (cond3)	measure2 (cond1)	measure2 (cond2)	measure2 (cond3)	measure3 (cond1)	measure3(cond2)	measure3 (cond3)	age	gender
1
2
3

And this is what I would like:

participant id	condition	measure1	measure2	measure3	age	gender
1	1
1	2
1	3

I tried this:

data_long <- gather(df, condition_measure1, measure1, measure1_cond1, measure1_cond2, measure1_cond3, factor_key=TRUE)

It works, but I don't know how to put all 3 measures in long format. I tried repeating the same code for measure2, but it did not work: it added another 3 rows for each participant.

Can you help me? I am very new to R, so forgive me. I just want to convert my data and go back to Jamovi ^^

Thank you!

Edit: here is the data:

structure(list(num_pp = c(1, 2, 3, 4, 5, 6), nombre_dp1 = c(24, 
14, 2, 6, 6, 21), nombre_dp05 = c(20, 28, 2, 9, 8, 21), nombre_dp0 = c(24, 
20, 4, 11, 8, 20), jugement_causal_dp1 = c("Oui", "Oui", "Oui", 
"Oui", "Oui", "Oui"), jugement_causal_dp05 = c("Non", "Oui", 
"Non", "Non", "Oui", "Non"), jugement_causal_dp0 = c("Non", "Non", 
"Oui", "Non", "Non", "Non"), confiance_dp1 = c(90, 80, 63, 80, 
90, 80), confiance_dp05 = c(60, 50, 86, 65, 50, 90), confiance_dp0 = c(65, 
60, 55, 43, 50, 80), age = c(33, 22, 20, 20, 18, 18), genre = c("Masculin", 
"Feminin", "Feminin", "Feminin", "Feminin", "Feminin"), etude = c("L1", 
"L1", "L1", "L1", "L1", "L1"), ordre = c("dp_05|dp_1|dp_0", "dp_0|dp_1|dp_05", 
"dp_0|dp_1|dp_05", "dp_0|dp_05|dp_1", "dp_1|dp_05|dp_0", "dp_1|dp_05|dp_0"
), wdif_dp1dp05 = c(-4, 14, 0, 3, 2, 0)), row.names = c(NA, -6L
), class = c("tbl_df", "tbl", "data.frame"))

Answer 1

Score: 2

Using the data the OP provided, you can use pivot_longer directly:

library(tidyr)

# Pivot every column ending in _dp<digits>; the text before the last
# underscore becomes the output column name, the rest goes into dp
df %>%
  pivot_longer(matches('_dp\\d+$'), names_to = c('.value', 'dp'), 
                names_pattern = '(.*)_(\\w+)')
# A tibble: 18 × 10
   num_pp   age genre    etude ordre  wdif_dp1dp05 dp    nombre jugement_causal confiance
    <dbl> <dbl> <chr>    <chr> <chr>         <dbl> <chr>  <dbl> <chr>               <dbl>
 1      1    33 Masculin L1    dp_05…           -4 dp1       24 Oui                    90
 2      1    33 Masculin L1    dp_05…           -4 dp05      20 Non                    60
 3      1    33 Masculin L1    dp_05…           -4 dp0       24 Non                    65
 4      2    22 Feminin  L1    dp_0|…           14 dp1       14 Oui                    80
 5      2    22 Feminin  L1    dp_0|…           14 dp05      28 Oui                    50
 6      2    22 Feminin  L1    dp_0|…           14 dp0       20 Non                    60
 7      3    20 Feminin  L1    dp_0|…            0 dp1        2 Oui                    63
 8      3    20 Feminin  L1    dp_0|…            0 dp05       2 Non                    86
 9      3    20 Feminin  L1    dp_0|…            0 dp0        4 Oui                    55
10      4    20 Feminin  L1    dp_0|…            3 dp1        6 Oui                    80
11      4    20 Feminin  L1    dp_0|…            3 dp05       9 Non                    65
12      4    20 Feminin  L1    dp_0|…            3 dp0       11 Non                    43
13      5    18 Feminin  L1    dp_1|…            2 dp1        6 Oui                    90
14      5    18 Feminin  L1    dp_1|…            2 dp05       8 Oui                    50
15      5    18 Feminin  L1    dp_1|…            2 dp0        8 Non                    50
16      6    18 Feminin  L1    dp_1|…            0 dp1       21 Oui                    80
17      6    18 Feminin  L1    dp_1|…            0 dp05      21 Non                    90
18      6    18 Feminin  L1    dp_1|…            0 dp0       20 Non                    80

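Putting .value in names_to tells pivot_longer that the first part of each column name (nombre, jugement_causal, confiance) is the name of an output column. The 3 measures therefore stay in separate columns, and only the condition suffix moves into the dp column.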
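To see what .value does in isolation, here is a minimal sketch on a made-up table. The data frame toy and its columns score_a, score_b, rt_a and rt_b are hypothetical and are not part of the OP's data; only pivot_longer and its names_to / names_pattern arguments come from the answer above.

```r
library(tidyr)

# Hypothetical toy data: two measures (score, rt) under two conditions (a, b)
toy <- data.frame(
  id      = c(1, 2),
  score_a = c(10, 20),
  score_b = c(11, 21),
  rt_a    = c(0.5, 0.6),
  rt_b    = c(0.7, 0.8)
)

# First capture group -> output column name (.value),
# second capture group -> a value in the new cond column
toy_long <- pivot_longer(
  toy,
  cols = -id,
  names_to = c(".value", "cond"),
  names_pattern = "(.*)_(.*)"
)

# One row per id x condition; score and rt remain separate columns
print(toy_long)
```

Without .value, the same call would stack score and rt into a single value column, which is what produces the extra rows described in the question.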

Answer 2

Score: 1

Assuming the data look something like this:

df <- structure(list(`participant id` = c(1, 2, 3), `measure 1 (cond1)` = c(10, 
12, 8), `measure 1 (cond2)` = c(15, 14, 9), `measure 1 (cond3)` = c(20, 
18, 10), `measure2 (cond1)` = c(25, 22, 15), `measure2 (cond2)` = c(30, 
28, 19), `measure2 (cond3)` = c(35, 30, 22), `measure3 (cond1)` = c(40, 
38, 25), `measure3 (cond2)` = c(45, 42, 28), `measure3 (cond3)` = c(50, 
48, 32), age = c(25, 30, 27), gender = c("Male", "Female", "Male"
)), class = "data.frame", row.names = c(NA, -3L))

You can do something like this:

library(dplyr)
library(tidyr)

# Extract the measure number and the condition number from each column name
df <- pivot_longer(df,
             cols = starts_with("measure"), 
             names_pattern = "measure ?(\\d+) \\(cond(\\d+)\\)",
             names_to = c("measure", "condition")) |>
    mutate(condition = as.integer(condition),
           measure = as.integer(measure))
# Output:
   `participant id`   age gender measure condition value
              <dbl> <dbl> <chr>    <int>     <int> <dbl>
 1                1    25 Male         1         1    10
 2                1    25 Male         1         2    15
 3                1    25 Male         1         3    20
 4                1    25 Male         2         1    25
 5                1    25 Male         2         2    30
 6                1    25 Male         2         3    35
 7                1    25 Male         3         1    40
 8                1    25 Male         3         2    45
 9                1    25 Male         3         3    50
10                2    30 Female       1         1    12
# ℹ 17 more rows

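To spread the measures back into their own columns (one row per participant and condition, with columns measure1, measure2 and measure3):

df |> pivot_wider(names_from = measure, values_from = value, names_prefix = "measure")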
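The two steps above can probably be collapsed into one. The following is only a sketch under assumptions: the values are made up, the column names are those written in the question, and the object names wide, long and out_file are hypothetical. It first normalises the names to the form measure1_cond1 with gsub, then lets .value keep each measure in its own column, and finally writes a CSV file that could be opened in Jamovi.

```r
library(tidyr)

# Made-up values for two participants, using the column names from the question
wide <- data.frame(
  `participant id`    = c(1, 2),
  `measure 1 (cond1)` = c(10, 12),
  `measure 1 (cond2)` = c(15, 14),
  `measure 1 (cond3)` = c(20, 18),
  `measure2 (cond1)`  = c(25, 22),
  `measure2 (cond2)`  = c(30, 28),
  `measure2 (cond3)`  = c(35, 30),
  `measure3 (cond1)`  = c(40, 38),
  `measure3 (cond2)`  = c(45, 42),
  `measure3 (cond3)`  = c(50, 48),
  age    = c(25, 30),
  gender = c("Male", "Female"),
  check.names = FALSE  # keep the spaces and parentheses in the names
)

# Normalise "measure 1 (cond1)" / "measure2 (cond1)" to "measure1_cond1"
names(wide) <- gsub("^measure ?(\\d+) \\(cond(\\d+)\\)$",
                    "measure\\1_cond\\2", names(wide))

# Text before "_cond" becomes the column name, text after it the condition
long <- pivot_longer(
  wide,
  cols = starts_with("measure"),
  names_to = c(".value", "condition"),
  names_sep = "_cond"
)
long$condition <- as.integer(long$condition)

# Hypothetical output location; replace with a real path before importing
out_file <- file.path(tempdir(), "data_long.csv")
write.csv(long, out_file, row.names = FALSE)
```

The result has one row per participant and condition, with the columns participant id, age, gender, condition, measure1, measure2 and measure3.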
