Calculating differences with specific values in data frame in R
Question
I have the following data frame in RStudio:
(screenshot of my data frame)
Timepoints a and b are pre- and post-values, and I want to calculate the difference between the two, i.e. b - a.
I want to do this for each subject and each session separately, meaning that for subject 1 I want to calculate the difference for T1, T2 and T3.
I greatly appreciate any help!
I thought about filtering and subsetting with Tidyverse, but this seems very complicated and I guess there must be an easier way.
Answer 1
Score: 2
Here is a dplyr way:
library(dplyr)
df <- data.frame(
  subject   = c(rep(1, 6), rep(2, 3)),
  var       = "SMSPAg",
  session   = c("T1", "T1", "T2", "T2", "T3", "T3", "T1", "T1", "T2"),
  timepoint = c("a", "b", "a", "b", "a", "b", "a", "b", "a"),
  value     = c(50, 48, 52, 65, 51, 61, 53, 50, 54)
)
# Last value minus first value within each subject/session group.
# This relies on "a" coming before "b" in each group.
# The .by argument requires dplyr 1.1.0 or later.
df %>%
  summarise(diff = last(value) - first(value), .by = c(subject, session))
#>   subject session diff
#> 1       1      T1   -2
#> 2       1      T2   13
#> 3       1      T3   10
#> 4       2      T1   -3
#> 5       2      T2    0
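One caveat with first() and last(): they select values by position, so the result depends on row order, and a group with only one row (subject 2, session T2 above) yields 0 rather than a missing value. Below is a minimal sketch of a label-based variant. It is an illustration, not part of the original answer, and it assumes the timepoint labels are exactly "a" and "b" with at most one of each per group. It uses group_by() instead of .by so that it also runs on older dplyr versions.

```r
library(dplyr)

# Same example data as in the answer above
df <- data.frame(
  subject   = c(rep(1, 6), rep(2, 3)),
  var       = "SMSPAg",
  session   = c("T1", "T1", "T2", "T2", "T3", "T3", "T1", "T1", "T2"),
  timepoint = c("a", "b", "a", "b", "a", "b", "a", "b", "a"),
  value     = c(50, 48, 52, 65, 51, 61, 53, 50, 54)
)

res <- df %>%
  group_by(subject, session) %>%
  summarise(
    # match() returns NA when a label is absent from the group,
    # so a missing "a" or "b" gives NA instead of a misleading number
    diff = value[match("b", timepoint)] - value[match("a", timepoint)],
    .groups = "drop"
  )

res
```

With this variant the incomplete group (subject 2, session T2) comes out as NA, which makes the missing post-value visible rather than hiding it behind a zero.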
Answer 2
Score: 2
You can either reshape the data wider and then compute the difference between the two columns as usual, or keep the data in long format and extract a specific element from each vector (split by group), as suggested by TarJae.
Reshaping wider (as shown below) has the advantage that it does not require a and b to be in the correct order.
library(tidyverse)
df |>
  pivot_wider(
    names_from = timepoint,
    values_from = value
  ) |>
  mutate(difference = b - a)
#> # A tibble: 5 × 6
#>   subject var    session     a     b difference
#>     <int> <chr>  <chr>   <int> <int>      <int>
#> 1       1 SMSPAg T1         50    48         -2
#> 2       1 SMSPAg T2         52    65         13
#> 3       1 SMSPAg T3         51    61         10
#> 4       2 SMSPAg T1         53    50         -3
#> 5       2 SMSPAg T2         54    NA         NA
Created on 2023-04-13 with reprex v2.0.2
where
df <- tribble(
  ~subject, ~var,     ~session, ~timepoint, ~value,
  1L,       "SMSPAg", "T1",     "a",        50L,
  1L,       "SMSPAg", "T1",     "b",        48L,
  1L,       "SMSPAg", "T2",     "a",        52L,
  1L,       "SMSPAg", "T2",     "b",        65L,
  1L,       "SMSPAg", "T3",     "a",        51L,
  1L,       "SMSPAg", "T3",     "b",        61L,
  2L,       "SMSPAg", "T1",     "a",        53L,
  2L,       "SMSPAg", "T1",     "b",        50L,
  2L,       "SMSPAg", "T2",     "a",        54L
)
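Since the question asks for something simpler than tidyverse filtering, the same wide-format idea can also be sketched in base R with no extra packages. This is an illustration rather than part of either answer. It assumes the column names of the example data, and the value_a and value_b column names are simply a consequence of the suffixes chosen here.

```r
# Example data with the same values as in the answers above
df <- data.frame(
  subject   = c(rep(1, 6), rep(2, 3)),
  var       = "SMSPAg",
  session   = c("T1", "T1", "T2", "T2", "T3", "T3", "T1", "T1", "T2"),
  timepoint = c("a", "b", "a", "b", "a", "b", "a", "b", "a"),
  value     = c(50, 48, 52, 65, 51, 61, 53, 50, 54)
)

# Split into pre ("a") and post ("b") rows, keeping only the key columns and the value
pre  <- df[df$timepoint == "a", c("subject", "session", "value")]
post <- df[df$timepoint == "b", c("subject", "session", "value")]

# Join pre and post by subject and session; all = TRUE keeps groups
# that lack one of the two timepoints (their missing value becomes NA)
wide <- merge(pre, post, by = c("subject", "session"),
              all = TRUE, suffixes = c("_a", "_b"))

# Post minus pre, i.e. b - a
wide$difference <- wide$value_b - wide$value_a

wide
```

Because the rows are paired by their subject and session keys, this does not depend on row order, and the incomplete group (subject 2, session T2) gets NA as its difference.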