Difference of second and following column

Question

I'm fairly new to R and am trying to write a function that creates a new data frame containing the difference between the second column of the original dataset and each column that follows it. Imagine this is my data (although I have many more variables):

obs  var1   var2   var3    
1     5      10     14   
2     6      11     15   
3     7      12     16   
4     8      13     17    

The output should look something like this:

obs var2_1 var3_1
1     -5       -9
2     -5       -9
3     -5       -9
4     -5       -9

Thank you very much in advance!

Answer 1

Score: 1

You can use across() to apply the same transformation to multiple columns:

library(tidyverse)

df <- tibble::tribble(
  ~obs, ~var1, ~var2, ~var3,
  1,  5,   10,   14,
  2,  6,   11,   15,
  3,  7,   12,   16,
  4,  8,   13,   17
  )

mutate(df, across(starts_with("var") & !var1, ~var1 - .x))
#> # A tibble: 4 × 4
#>     obs  var1  var2  var3
#>   <dbl> <dbl> <dbl> <dbl>
#> 1     1     5    -5    -9
#> 2     2     6    -5    -9
#> 3     3     7    -5    -9
#> 4     4     8    -5    -9

Created on 2023-03-20 with reprex v2.0.2

Add the option .keep = "unused" to mutate() to remove var1 from the output.

Update: To refer to columns by their position instead (pick() requires dplyr 1.1.0 or later), use:

mutate(df, across(!1:2, ~ pull(pick(2), 1) - .x))
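The output requested in the question uses column names such as var2_1 and drops var1. A minimal sketch of one way to get there, assuming the same example data and a "_1" suffix (the suffix and the final select() step are illustrative choices, not part of the original answer):

```r
library(dplyr)

# Example data from the question (integer columns assumed for brevity)
df <- data.frame(obs = 1:4, var1 = 5:8, var2 = 10:13, var3 = 14:17)

result <- df %>%
  # .names writes each difference to a new column, e.g. var2 -> var2_1,
  # so the original columns are left untouched
  mutate(across(starts_with("var") & !var1, ~ var1 - .x, .names = "{.col}_1")) %>%
  # keep only the id column and the newly created difference columns
  select(obs, ends_with("_1"))

print(result)
```

Because the differences go into new columns, var1 is never overwritten while the other columns are being computed.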

Answer 2

Score: 0

If you want to do it without any dependencies:

df <- data.frame(
  obs = 1:4,
  var1 = 5:8,
  var2 = 10:13,
  var3 = 14:17
)

column_diff <- function(df, base_col, ignore_cols = "obs") {
  
  # Get column names which are not the base column (i.e. var1) or other
  # columns to ignore (i.e. "obs")
  cols_to_change <- setdiff(colnames(df), c(ignore_cols, base_col))
  
  # Replace the columns to change with the difference between the base
  # column and the columns to change
  df[cols_to_change] <-
    df[[base_col]] - df[cols_to_change]
  
  # Remove the base column
  df[base_col] <- NULL

  df
}

column_diff(df, "var1")
#>   obs var2 var3
#> 1   1   -5   -9
#> 2   2   -5   -9
#> 3   3   -5   -9
#> 4   4   -5   -9

Created on 2023-03-20 with reprex v2.0.2
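Since the question asks for "the second column" by position and for output names like var2_1, the base R approach can also be written with column indices. This is only a sketch: the function name column_diff_by_position and its arguments base_idx, id_idx, and suffix are hypothetical names chosen for illustration.

```r
# Example data from the question
df <- data.frame(obs = 1:4, var1 = 5:8, var2 = 10:13, var3 = 14:17)

# Hypothetical helper: subtract every column after the base column from it,
# selecting columns by position instead of by name
column_diff_by_position <- function(df, base_idx = 2, id_idx = 1, suffix = "_1") {
  # Positions of all columns that are neither the id nor the base column
  target_idx <- setdiff(seq_along(df), c(id_idx, base_idx))

  # A vector minus a data frame is computed column by column
  diffs <- df[[base_idx]] - df[target_idx]

  # Rename the results, e.g. var2 -> var2_1
  names(diffs) <- paste0(names(df)[target_idx], suffix)

  # Put the id column back in front of the differences
  cbind(df[id_idx], diffs)
}

result <- column_diff_by_position(df)
print(result)
```

Working from positions means the function does not need to know the column names, which may help when the data has many variables.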

Answer 3

Score: 0

You can use mutate(across(...)) on the columns in question and change the column names with rename_with():

library(tidyverse)  # provides mutate(), across(), rename_with(), and str_replace()

# assumes df as defined in the first answer
df %>%
  mutate(across(matches("\\d$"), ~var1 - .)) %>%
  rename_with(~str_replace(., "$", "_1"), !matches("1$"))
# A tibble: 4 × 4
  obs_1  var1 var2_1 var3_1
  <dbl> <dbl>  <dbl>  <dbl>
1     1     0     -5     -9
2     2     0     -5     -9
3     3     0     -5     -9
4     4     0     -5     -9

Posted by huangapple on 2023-03-21 00:20:32
Original link: https://go.coder-hub.com/75792812.html