Subtracting after and before values of each sample for each numeric column in R


Question

I have a data frame with a structure similar to the example below:

```R
df <- data.frame(rbind(c("Sample1_x2", 10, 23, 6, 5, "Sample1", "after"),
                       c("Sample2_x2", 8, 53, 22, 52, "Sample2", "after"),
                       c("Sample1_x1", 12, 2, 44, 15, "Sample1", "before"),
                       c("Sample3_x1", 27, 46, 16, 65, "Sample3", "before"),
                       c("Sample2_x1", 41, 44, 27, 25, "Sample2", "before"),
                       c("Sample3_x2", 5, 38, 9, 29, "Sample3", "after")))
colnames(df) <- c("name", "alpha", "beta", "gamma", "rho", "id", "group")
df <- tibble::column_to_rownames(df, var = "name")
df
#            alpha beta gamma rho      id  group
# Sample1_x2    10   23     6   5 Sample1  after
# Sample2_x2     8   53    22  52 Sample2  after
# Sample1_x1    12    2    44  15 Sample1 before
# Sample3_x1    27   46    16  65 Sample3 before
# Sample2_x1    41   44    27  25 Sample2 before
# Sample3_x2     5   38     9  29 Sample3  after
```

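I want a data frame of changes, computed as 'after - before' for each sample `id` and for each variable column (the columns are currently not numeric, and each one has a different name). The desired output is:

    id      alpha beta gamma rho
    Sample1    -2   21   -38 -10
    Sample2   -33    9    -5  27
    Sample3   -22   -8    -7 -36

I tried `dplyr::group_by(id, group)` but could not work out the `mutate()` step to calculate the difference for each sample. Thank you in advance.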




# Answer 1
**Score**: 3

You could use this approach in `data.table`:

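```R
library(data.table)

# Sort so each sample's "before" row precedes its "after" row, then diff() per id
setDT(df)[order(-group), lapply(.SD, \(s) diff(as.numeric(s))), id, .SDcols=1:4]

# Output:
#         id alpha beta gamma rho
# 1: Sample1    -2   21   -38 -10
# 2: Sample3   -22   -8    -7 -36
# 3: Sample2   -33    9    -5  27
```

An equivalent approach in `dplyr` is as follows:

```R
library(dplyr)

# reframe() and .by require dplyr >= 1.1.0
df %>%
  arrange(desc(group)) %>%
  reframe(across(-group, ~diff(as.numeric(.x))), .by=id)
```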
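
Both versions rely on row order: sorting `group` in descending order puts each sample's "before" row ahead of its "after" row, so `diff()` returns after minus before. A minimal base R sketch of that idea, using Sample1's `alpha` values from the question and assuming exactly one row per `id` and `group` (the names `ord` and `change` are only for illustration):

```R
# Sample1's alpha values as they appear in the question (stored as character)
alpha <- c("10", "12")
group <- c("after", "before")

# Descending order of `group` puts "before" first and "after" second
ord <- order(group, decreasing = TRUE)

# diff() subtracts the first value from the second: after - before
change <- diff(as.numeric(alpha[ord]))
change
# -2
```

If a sample had more than two rows, `diff()` would return several values for that `id`, so this pattern is best suited to strictly paired data.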

# Answer 2
**Score**: 2

This is one `dplyr` option:

```R
library(dplyr)

df %>%
  readr::type_convert() %>%
  summarise(across(where(is.numeric), ~ .x[group == "after"] - .x[group == "before"]),
            .by = id)

#        id alpha beta gamma rho
# 1 Sample1    -2   21   -38 -10
# 2 Sample2   -33    9    -5  27
# 3 Sample3   -22   -8    -7 -36
```
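
This version indexes by the `group` label instead of relying on row order, which assumes each `id` has exactly one "after" and one "before" row. One quick way to check that assumption first is sketched below in base R; the names `d`, `counts` and `ok` are introduced here only for illustration:

```R
# id/group pairs copied from the question's example data
d <- data.frame(
  id    = c("Sample1", "Sample2", "Sample1", "Sample3", "Sample2", "Sample3"),
  group = c("after", "after", "before", "before", "before", "after"),
  stringsAsFactors = FALSE
)

# Cross-tabulate id by group; every cell should equal 1 for clean pairs
counts <- table(d$id, d$group)
ok <- all(counts == 1)
ok
```

If `ok` is `FALSE`, the subtraction no longer yields exactly one value per `id`, so the pairing needs fixing before computing the change.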

# Answer 3
**Score**: 1

Update after clarification (removed first answer):

```R
library(dplyr)

# Note: this assumes `name` is still a column, i.e. `df` before
# tibble::column_to_rownames(); otherwise use across(-group, ...) instead.
df %>%
  group_by(id) %>%
  arrange(group, .by_group = TRUE) %>%   # "after" sorts ahead of "before"
  type.convert(as.is = TRUE) %>%
  summarise(across(-c(group, name), ~first(.)-last(.))) %>%
  ungroup()

# dplyr >= 1.1.0

df %>%
  arrange(group) %>%
  type.convert(as.is = TRUE) %>%
  summarise(across(-c(group, name), ~first(.)-last(.)), .by=id)

# See @langtang's solution who uses reframe very elegantly instead of summarise!

#   id      alpha  beta gamma   rho
#   <chr>   <int> <int> <int> <int>
# 1 Sample1    -2    21   -38   -10
# 2 Sample2   -33     9    -5    27
# 3 Sample3   -22    -8    -7   -36
```
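
The `first(.) - last(.)` step depends on alphabetical order placing "after" ahead of "before". The pairing can also be made explicit by matching rows on `id`. The following is a base R sketch offered for comparison, not part of the answers above; it assumes one "before" and one "after" row per `id`, and `num_cols`, `after`, `before` and `change` are names introduced here for illustration:

```R
# Example data from the question, with the value columns stored as character
df <- data.frame(
  id    = c("Sample1", "Sample2", "Sample1", "Sample3", "Sample2", "Sample3"),
  group = c("after", "after", "before", "before", "before", "after"),
  alpha = c("10", "8", "12", "27", "41", "5"),
  beta  = c("23", "53", "2", "46", "44", "38"),
  gamma = c("6", "22", "44", "16", "27", "9"),
  rho   = c("5", "52", "15", "65", "25", "29"),
  stringsAsFactors = FALSE
)

# Convert the value columns to numeric before doing arithmetic
num_cols <- c("alpha", "beta", "gamma", "rho")
df[num_cols] <- lapply(df[num_cols], as.numeric)

# Split by group and align the "before" rows to the "after" rows by id
after  <- df[df$group == "after", ]
before <- df[df$group == "before", ]
before <- before[match(after$id, before$id), ]

# Element-wise after - before, with the id column put back in front
change <- data.frame(id = after$id, after[num_cols] - before[num_cols])
rownames(change) <- NULL
change
```

Matching on `id` makes the result independent of the original row order, at the cost of a few more lines than the `dplyr` and `data.table` versions.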


huangapple
  • Posted on May 17, 2023, 07:06:57
  • When reposting, please keep this link: https://go.coder-hub.com/76267615.html