How can I efficiently calculate differences between values of a column and the preceding column in R?
Question
I'm trying to calculate, for each row, the differences between the values in a column and the values in the preceding column. For example:

         1   2   3
    R1. 50  60  90
    R2. 80  95 115
    R3. 90 100 120

I would like the code to do the following:
(values in col 2 - values in col 1), (values in col 3 - values in col 2), (values in col 4 - values in col 3), etc.
and store the calculated values in a new table like this:

         1  2
    R1. 10 30
    R2. 15 20
    R3. 10 20

I've been doing this manually on a small set of columns as follows, but is there code that does this more efficiently on a larger set of columns/rows?

    # calculate the differences
    diff_new  = tab_new[,2] - tab_new[,1]
    diff_new2 = tab_new[,3] - tab_new[,2]
    diff_new3 = tab_new[,4] - tab_new[,3]
    diff_new4 = tab_new[,5] - tab_new[,4]
    # create a new table with the differences
    diff.table_new = cbind(diff_new, diff_new2, diff_new3, diff_new4)
Answer 1
Score: 3
One base R option would be:
```r
df[-1] - df[-ncol(df)] # thanks @user20650
```
Output:
```
    X2 X3
R1. 10 30
R2. 15 20
R3. 10 20
```
Or use `sapply`:
```r
sapply(seq_len(ncol(df))[-1], function(x) df[,x] - df[,x-1])
```
Output:
```
     [,1] [,2]
[1,]   10   30
[2,]   15   20
[3,]   10   20
```
Data:
```r
df <- read.table(text = "1 2 3
R1. 50 60 90
R2. 80 95 115
R3. 90 100 120", header = TRUE)
```
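To see why the first expression works: `df[-1]` drops the first column and `df[-ncol(df)]` drops the last one, so subtracting them pairs every column with the one before it. The sketch below wraps that in a helper and compares it with the manual approach from the question. The name `col_diffs` is made up for illustration and is not part of any package.

```r
# Example data from the question; read.table prefixes numeric headers with "X"
df <- read.table(text = "1 2 3
R1. 50 60 90
R2. 80 95 115
R3. 90 100 120", header = TRUE)

# Hypothetical helper: each column minus the column before it
col_diffs <- function(d) {
  stopifnot(ncol(d) >= 2)   # needs at least two columns to form a difference
  d[-1] - d[-ncol(d)]
}

res <- col_diffs(df)

# Manual approach from the question, restricted to the three example columns
manual <- cbind(df[, 2] - df[, 1], df[, 3] - df[, 2])

# TRUE if both approaches give the same values
agrees <- all(as.matrix(res) == manual)
```

The result keeps the row names and takes its column names from the left-hand operand, which is why the output columns are labelled `X2` and `X3`.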
Answer 2
Score: 2
`diff` with a lag:
```r
df <- read.table(header = TRUE, text = " 1 2 3
R1. 50 60 90
R2. 80 95 115
R3. 90 100 120")
matrix(diff(unlist(df), lag = nrow(df)), nrow = nrow(df))
```
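The lag trick relies on `unlist()` stacking the columns end to end, so two elements that are `nrow(df)` positions apart belong to the same row in adjacent columns. The following sketch spells out the intermediate steps and, as an optional extra that is not in the answer above, restores the row and column names that `matrix()` drops. It assumes every column is numeric, since `unlist()` would otherwise coerce the values to a common type.

```r
df <- read.table(header = TRUE, text = " 1 2 3
R1. 50 60 90
R2. 80 95 115
R3. 90 100 120")

n <- nrow(df)

# Columns stacked one after another: 50 80 90 60 95 100 90 115 120
v <- unlist(df)

# v[i + n] - v[i]: same row, next column
d <- diff(v, lag = n)

# Reshape column-wise and put the labels back (assumed desirable, not required)
res <- matrix(d, nrow = n,
              dimnames = list(rownames(df), names(df)[-1]))
res
```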
Answer 3
Score: 1
Another fully vectorized option:

    data.frame(t(diff(t(df))))

You can also use `apply` + `diff`:

    df <- read.table(header = TRUE, text = " 1 2 3
    R1. 50 60 90
    R2. 80 95 115
    R3. 90 100 120")
    t(apply(df, 1, diff))
    #     X2 X3
    # R1. 10 30
    # R2. 15 20
    # R3. 10 20
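As a quick sanity check, the approaches suggested so far can be compared on a larger table. The sketch below uses made-up random data (the name `big`, its size, and the helper `same` are arbitrary choices for illustration) and only checks that the values agree; it says nothing about speed.

```r
set.seed(1)

# Made-up test data: 20 rows, 6 integer columns
big <- as.data.frame(matrix(sample(1:100, 20 * 6, replace = TRUE), nrow = 20))

by_subtraction <- as.matrix(big[-1] - big[-ncol(big)])
by_apply       <- t(apply(big, 1, diff))
by_transpose   <- t(diff(t(as.matrix(big))))
by_lag         <- matrix(diff(unlist(big), lag = nrow(big)), nrow = nrow(big))

# Compare values only, ignoring row and column names
same <- function(x, y) all(unname(x) == unname(y))

all_agree <- same(by_subtraction, by_apply) &&
  same(by_subtraction, by_transpose) &&
  same(by_subtraction, by_lag)
all_agree
```

One caveat with the `apply` version: if the table has exactly two columns, `diff` returns a single value per row, `apply` simplifies the result to a plain vector, and `t()` then yields a one-row matrix rather than a one-column one.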
Answer 4
Score: 0
An efficient solution using `data.table` and `collapse`, as long as your data doesn't have `NA` values.
Here we make a copy of the original data (except for the first column) and modify the columns by reference.
```r
library(data.table)
library(collapse)
# Take a copy of everything except the first column
out <- copy(fselect(df, -1))
# Subtract lagged columns by reference
for (i in seq_col(out)) {
  out[[i]] %-=% df[[i]]
}
out
#>     X2 X3
#> R1. 10 30
#> R2. 15 20
#> R3. 10 20
```
<sup>Created on 2023-06-02 with reprex v2.0.2</sup>
Data
```r
df <- read.table(header = TRUE, text = " 1 2 3
R1. 50 60 90
R2. 80 95 115
R3. 90 100 120")
```
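For readers without `data.table` or `collapse` installed, the same copy-then-subtract loop can be sketched in plain base R. This is only an analogue of the idea: the assignment inside the loop creates modified copies rather than updating by reference, so it does not carry the memory benefit the answer is aiming for.

```r
df <- read.table(header = TRUE, text = " 1 2 3
R1. 50 60 90
R2. 80 95 115
R3. 90 100 120")

# Start from all columns except the first
out <- df[-1]

# Column i of `out` is column i + 1 of `df`, so subtracting df[[i]]
# gives "this column minus the preceding one"
for (i in seq_along(out)) {
  out[[i]] <- out[[i]] - df[[i]]
}
out
```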