Subtracting after and before values of each sample for each numeric column in R

# Question

I have a data frame with a structure similar to the example below:
```R
df <- data.frame(rbind(c("Sample1_x2", 10, 23, 6, 5, "Sample1", "after"),
                       c("Sample2_x2", 8, 53, 22, 52, "Sample2", "after"),
                       c("Sample1_x1", 12, 2, 44, 15, "Sample1", "before"),
                       c("Sample3_x1", 27, 46, 16, 65, "Sample3", "before"),
                       c("Sample2_x1", 41, 44, 27, 25, "Sample2", "before"),
                       c("Sample3_x2", 5, 38, 9, 29, "Sample3", "after")))
colnames(df) <- c("name", "alpha", "beta", "gamma", "rho", "id", "group")
df <- tibble::column_to_rownames(df, var = "name")
df
#            alpha beta gamma rho      id  group
# Sample1_x2    10   23     6   5 Sample1  after
# Sample2_x2     8   53    22  52 Sample2  after
# Sample1_x1    12    2    44  15 Sample1 before
# Sample3_x1    27   46    16  65 Sample3 before
# Sample2_x1    41   44    27  25 Sample2 before
# Sample3_x2     5   38     9  29 Sample3  after
```
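I want to build a data frame of changes by computing 'after - before' for each sample (by `id`) in every variable column. The columns are not stored as numeric, and each one has a different name. The desired output is:

    id      alpha beta gamma rho
    Sample1    -2   21   -38 -10
    Sample2   -33    9    -5  27
    Sample3   -22   -8    -7 -36

I tried `dplyr::group_by(id, group)` but could not get the `mutate()` step to compute the difference for each sample. Thank you in advance.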
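A note on why the value columns are not numeric: the example builds each row with `c()`, mixing strings and numbers. The sketch below uses made-up variable names (`first_row`, `m`, `alpha`) and only the first two Sample1 rows to illustrate the coercion; it is an illustration, not part of the original question.

```R
# Mixing strings and numbers in c() coerces every element to character
first_row <- c("Sample1_x2", 10, 23)
print(class(first_row))

# rbind() of such vectors gives a character matrix,
# so data.frame() ends up storing the numbers as text
m <- rbind(first_row, c("Sample1_x1", 12, 2))
print(typeof(m))

# as.numeric() recovers the numbers before any arithmetic
alpha <- as.numeric(m[, 2])
print(alpha[1] - alpha[2])  # after - before for Sample1's alpha
```

This is why every answer converts the columns (with `as.numeric()`, `readr::type_convert()`, or `type.convert()`) before subtracting.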
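# Answer 1
**Score**: 3

You could use this approach in `data.table`:
```R
library(data.table)
# order(-group) puts "before" ahead of "after", so diff() returns after - before
setDT(df)[order(-group), lapply(.SD, \(s) diff(as.numeric(s))), id, .SDcols=1:4]
```
Output:
```
        id alpha beta gamma rho
1: Sample1    -2   21   -38 -10
2: Sample3   -22   -8    -7 -36
3: Sample2   -33    9    -5  27
```
An equivalent approach in `dplyr` (`reframe()` requires dplyr >= 1.1.0) is as follows:
```R
library(dplyr)
df %>%
  arrange(desc(group)) %>%
  reframe(across(-group, ~diff(as.numeric(.x))), .by=id)
```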
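Both versions depend on the row order within each `id`: `diff()` subtracts the first element from the second, so "before" has to come first. A minimal sketch of that step, using Sample1's `alpha` values from the question (the variable names here are illustrative only):

```R
# Sample1's alpha values from the question's data
before <- 12
after  <- 10

# diff() returns x[2] - x[1], so "before" must be the first element
change <- diff(c(before, after))
print(change)

# Sorting the group labels in descending order puts "before" ahead of "after"
sorted_groups <- sort(c("after", "before"), decreasing = TRUE)
print(sorted_groups)
```

If the rows were sorted in ascending order instead, the sign of every difference would flip.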
# Answer 2
**Score**: 2

This is one `dplyr` option:
```R
library(dplyr)

df %>%
  readr::type_convert() %>%   # parse the character columns into numbers
  summarise(across(where(is.numeric), ~ .x[group == "after"] - .x[group == "before"]),
            .by = id)

#        id alpha beta gamma rho
# 1 Sample1    -2   21   -38 -10
# 2 Sample2   -33    9    -5  27
# 3 Sample3   -22   -8    -7 -36
```
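This approach appears to assume that each `id` has exactly one "after" row and one "before" row. A quick way to check that assumption before subtracting might look like the sketch below, which rebuilds only the `id` and `group` columns from the question (`counts` and `all_paired` are names made up for this example):

```R
# Only the id and group columns from the question's data
df_check <- data.frame(
  id    = c("Sample1", "Sample2", "Sample1", "Sample3", "Sample2", "Sample3"),
  group = c("after", "after", "before", "before", "before", "after"),
  stringsAsFactors = FALSE
)

# Cross-tabulate: one row per id, one column per group
counts <- table(df_check$id, df_check$group)
print(counts)

# TRUE only when every id has exactly one row in each group
all_paired <- all(counts == 1)
print(all_paired)
```

If `all_paired` is `FALSE`, the unmatched or duplicated samples would need handling before any of the answers here give one difference per `id`.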
# Answer 3
**Score**: 1

Update after clarification (removed first answer):
```R
library(dplyr)

# `name` was moved to the row names, so only `group` needs excluding in across()
df %>%
  type.convert(as.is = TRUE) %>%           # character columns -> integer where possible
  group_by(id) %>%
  arrange(group, .by_group = TRUE) %>%     # "after" sorts ahead of "before"
  summarise(across(-group, ~first(.)-last(.))) %>%   # first - last = after - before
  ungroup()

# dplyr >= 1.1.0
df %>%
  arrange(group) %>%
  type.convert(as.is = TRUE) %>%
  summarise(across(-group, ~first(.)-last(.)), .by=id)

# See @langtang's solution who uses reframe very elegantly instead of summarise!

#   id      alpha  beta gamma   rho
#   <chr>   <int> <int> <int> <int>
# 1 Sample1    -2    21   -38   -10
# 2 Sample2   -33     9    -5    27
# 3 Sample3   -22    -8    -7   -36
```
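For readers unfamiliar with `type.convert()`, here is a small sketch of what it does to character columns. The two-row data frame `chr_df` is a made-up cut-down of the question's data, not part of the original answer:

```R
# A two-row stand-in for the question's data, with numbers stored as text
chr_df <- data.frame(
  alpha = c("10", "12"),
  id    = c("Sample1", "Sample1"),
  stringsAsFactors = FALSE
)

# as.is = TRUE keeps non-numeric text as character instead of making factors
converted <- type.convert(chr_df, as.is = TRUE)
str(converted)

# alpha is now integer, so arithmetic works; id is left as character
print(converted$alpha[1] - converted$alpha[2])
```

This is base R, so it avoids the extra `readr` dependency used in Answer 2.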