Difference of second and following column
Question
I'm fairly new to R and am trying to write a function that creates a new data frame containing the difference between the second column and every following column of the original dataset. Imagine this was my data (although I have many more variables):
obs var1 var2 var3
1 5 10 14
2 6 11 15
3 7 12 16
4 8 13 17
The output should look something like this:
obs var2_1 var3_1
1 -5 -9
2 -5 -9
3 -5 -9
4 -5 -9
Thank you very much in advance!
Answer 1
Score: 1
You can use across() to apply the same transformation to multiple columns:
library(tidyverse)
df <- tibble::tribble(
  ~obs, ~var1, ~var2, ~var3,
  1, 5, 10, 14,
  2, 6, 11, 15,
  3, 7, 12, 16,
  4, 8, 13, 17
)
mutate(df, across(starts_with("var") & !var1, ~ var1 - .x))
#> # A tibble: 4 × 4
#>     obs  var1  var2  var3
#>   <dbl> <dbl> <dbl> <dbl>
#> 1     1     5    -5    -9
#> 2     2     6    -5    -9
#> 3     3     7    -5    -9
#> 4     4     8    -5    -9
Created on 2023-03-20 with reprex v2.0.2
Add the option .keep = "unused" to mutate() to remove var1 from the output.
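As a minimal sketch of that option, the call below repeats the example with .keep = "unused" added. The name res_unused is only an illustrative variable, not part of the original answer. Because var1 is used to compute the new values but is not itself an output, it is dropped, while var2 and var3 are overwritten and kept.

```r
library(dplyr)

df <- tibble::tribble(
  ~obs, ~var1, ~var2, ~var3,
  1, 5, 10, 14,
  2, 6, 11, 15,
  3, 7, 12, 16,
  4, 8, 13, 17
)

# var1 is only an input here, so .keep = "unused" removes it;
# var2 and var3 are outputs and therefore stay in the result.
res_unused <- mutate(
  df,
  across(starts_with("var") & !var1, ~ var1 - .x),
  .keep = "unused"
)

names(res_unused)
```

The result should contain the columns obs, var2 and var3, with var2 and var3 holding the differences.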
Update: To refer to columns based on their position, use:
mutate(df, across(!1:2, ~ pull(pick(2), 1) - .x))
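pick() was added in dplyr 1.1.0, so the call above needs at least that version. As a hedged alternative for older installations, the sketch below looks up the name of the second column first and then refers to it through the .data pronoun. The names base_name and res_pos are illustrative assumptions, not part of the original answer.

```r
library(dplyr)

df <- tibble::tribble(
  ~obs, ~var1, ~var2, ~var3,
  1, 5, 10, 14,
  2, 6, 11, 15,
  3, 7, 12, 16,
  4, 8, 13, 17
)

# Name of the column in position 2 (the one to subtract from).
base_name <- names(df)[2]

# !1:2 selects every column except the first two;
# .data[[base_name]] fetches the base column by its name.
res_pos <- mutate(df, across(!1:2, ~ .data[[base_name]] - .x))

res_pos
```

Here the base column stays in the output, as in the first example; adding .keep = "unused" would drop it.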
Answer 2
Score: 0
If you want to do it without any dependencies:
df <- data.frame(
  obs = 1:4,
  var1 = 5:8,
  var2 = 10:13,
  var3 = 14:17
)
column_diff <- function(df, base_col, ignore_cols = "obs") {
  # Get column names which are not the base_col (i.e. var1) or other
  # columns to ignore (i.e. "obs")
  cols_to_change <- setdiff(colnames(df), c(ignore_cols, base_col))
  # replace the cols to change with the difference between the base_col
  # and the cols to change
  df[cols_to_change] <-
    df[[base_col]] - df[cols_to_change]
  # remove the base_col
  df[base_col] <- NULL
  df
}
column_diff(df, "var1")
#>   obs var2 var3
#> 1   1   -5   -9
#> 2   2   -5   -9
#> 3   3   -5   -9
#> 4   4   -5   -9
Created on 2023-03-20 with reprex v2.0.2
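The question asks for the second column by position and for output names such as var2_1. A possible base R variant along those lines is sketched below. The function name column_diff_by_position and its arguments base_pos and suffix are hypothetical, introduced only for illustration, and the sketch assumes the base column is not the last column.

```r
df <- data.frame(
  obs = 1:4,
  var1 = 5:8,
  var2 = 10:13,
  var3 = 14:17
)

# Hypothetical helper: subtract every column after `base_pos` from the
# column at `base_pos`, and tag the resulting columns with `suffix`.
column_diff_by_position <- function(df, base_pos = 2, suffix = "_1") {
  stopifnot(base_pos < ncol(df))
  later <- seq(base_pos + 1, ncol(df))
  # Columns before the base column are carried over unchanged.
  kept <- df[seq_len(base_pos - 1)]
  # A vector minus a data frame is applied column by column.
  diffs <- df[[base_pos]] - df[later]
  names(diffs) <- paste0(names(df)[later], suffix)
  cbind(kept, diffs)
}

res_base <- column_diff_by_position(df)
res_base
```

With the example data this should return the columns obs, var2_1 and var3_1.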
Answer 3
Score: 0
You can mutate(across...) the columns in question and change the column names with rename_with:
df %>%
  mutate(across(matches("\\d$"), ~var1 - .)) %>%
  rename_with(~str_replace(., "$", "_1"), !matches("1$"))
# A tibble: 4 × 4
  obs_1  var1 var2_1 var3_1
  <dbl> <dbl>  <dbl>  <dbl>
1     1     0     -5     -9
2     2     0     -5     -9
3     3     0     -5     -9
4     4     0     -5     -9
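Note that the output above also renames obs to obs_1 and turns var1 into 0, which differs from the output requested in the question. One way to avoid both effects, sketched here as an assumption rather than as the answerer's own fix, is to exclude var1 from both selections and drop it at the end. paste0() stands in for str_replace() so that only dplyr is needed, and res_named is an illustrative name.

```r
library(dplyr)

df <- tibble::tribble(
  ~obs, ~var1, ~var2, ~var3,
  1, 5, 10, 14,
  2, 6, 11, 15,
  3, 7, 12, 16,
  4, 8, 13, 17
)

res_named <- df %>%
  # Subtract each later column from var1, leaving var1 itself untouched.
  mutate(across(matches("\\d$") & !var1, ~ var1 - .)) %>%
  # Add the suffix only to the columns that were just transformed.
  rename_with(~ paste0(., "_1"), matches("\\d$") & !var1) %>%
  # Drop the base column to match the requested output.
  select(!var1)

res_named
```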