Posted 2023-03-21 00:20:32

Difference between the second column and each following column

Question

I'm fairly new to R and am trying to write a function that creates a new data frame containing the difference between the second column and every following column of the original dataset. Imagine this was my data (although I have many more variables):

obs  var1   var2   var3    
1     5      10     14   
2     6      11     15   
3     7      12     16   
4     8      13     17

The output should look something like this:

obs var2_1 var3_1
1     -5       -9
2     -5       -9
3     -5       -9
4     -5       -9

Thank you very much in advance!

Answer 1

Score: 1

You can use across() to apply the same transformation to multiple columns:

library(tidyverse)

df <- tibble::tribble(
  ~obs, ~var1, ~var2, ~var3,
  1,  5,   10,   14,
  2,  6,   11,   15,
  3,  7,   12,   16,
  4,  8,   13,   17
  )

mutate(df, across(starts_with("var") & !var1, ~var1 - .x))
#> # A tibble: 4 × 4
#>     obs  var1  var2  var3
#>   <dbl> <dbl> <dbl> <dbl>
#> 1     1     5    -5    -9
#> 2     2     6    -5    -9
#> 3     3     7    -5    -9
#> 4     4     8    -5    -9

Created on 2023-03-20 with reprex v2.0.2

Add the option .keep = "unused" to mutate() to remove var1 from the output.
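
For illustration, a minimal sketch of that option, assuming dplyr is installed and using a plain data.frame in place of the tibble above (the name res_unused is only a placeholder):

```r
library(dplyr)

# Same example data as above, built with base R for brevity.
df <- data.frame(obs = 1:4, var1 = 5:8, var2 = 10:13, var3 = 14:17)

# var1 is used to compute the new values, so .keep = "unused" drops it.
# obs is not used at all and var2/var3 are overwritten, so those remain.
res_unused <- mutate(
  df,
  across(starts_with("var") & !var1, ~ var1 - .x),
  .keep = "unused"
)

names(res_unused)
```

The result should then contain only obs and the two difference columns, which still carry their original names var2 and var3.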

Update: To refer to columns based on their position, use

mutate(df, across(!1:2, ~ pull(pick(2), 1) - .x))
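
The expected output in the question also renames the difference columns to var2_1 and var3_1. One possible way to get those names is the .names argument of across() followed by select(). The sketch below assumes dplyr is installed; the "_1" suffix is taken from the question, and res_named is only a placeholder name.

```r
library(dplyr)

df <- data.frame(obs = 1:4, var1 = 5:8, var2 = 10:13, var3 = 14:17)

res_named <- df %>%
  # .names writes each result to a new column, e.g. var2 -> var2_1,
  # instead of overwriting the original column
  mutate(across(starts_with("var") & !var1, ~ var1 - .x, .names = "{.col}_1")) %>%
  # keep only the id column and the newly created difference columns
  select(obs, ends_with("_1"))

names(res_named)
```

Because ends_with("_1") requires the underscore, the original var1 column is not selected.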

Answer 2

Score: 0

If you want to do it without any dependencies:

df <- data.frame(
  obs = 1:4,
  var1 = 5:8,
  var2 = 10:13,
  var3 = 14:17
)

column_diff <- function(df, base_col, ignore_cols = "obs") {

  # Get column names which are not the basecol (i.e. var1) or other
  # columns to ignore (i.e. "obs")
  cols_to_change <- setdiff(colnames(df), c(ignore_cols, base_col))

  # replace the cols to change with the difference between the basecol
  # and the cols to change
  df[cols_to_change] <-
    df[[base_col]] - df[cols_to_change]

  # remove the basecol
  df[base_col] <- NULL

  df
}

column_diff(df, "var1")
#>   obs var2 var3
#> 1   1   -5   -9
#> 2   2   -5   -9
#> 3   3   -5   -9
#> 4   4   -5   -9

Created on 2023-03-20 with reprex v2.0.2
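
If the columns should be chosen by position rather than by name, as the question describes, the same base R idea could be sketched roughly as follows. The helper diff_from_base() and its arguments base_pos and suffix are hypothetical names introduced only for this illustration. It assumes that every column after base_pos is numeric and that the columns before it are identifiers to carry over unchanged.

```r
# Hypothetical helper: subtracts every column after `base_pos` from the
# column at `base_pos`, appends `suffix` to the names of the differences,
# and returns them next to the columns that precede `base_pos`.
diff_from_base <- function(df, base_pos = 2, suffix = "_1") {
  stopifnot(base_pos >= 1, base_pos < ncol(df))

  later <- seq(base_pos + 1, ncol(df))

  # vector - data.frame is applied column by column
  diffs <- df[[base_pos]] - df[later]
  names(diffs) <- paste0(names(df)[later], suffix)

  cbind(df[seq_len(base_pos - 1)], diffs)
}

df <- data.frame(obs = 1:4, var1 = 5:8, var2 = 10:13, var3 = 14:17)

res_pos <- diff_from_base(df)
names(res_pos)
```

With the default base_pos = 2, var1 serves as the base column and obs is kept as the only identifier column.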

Answer 3

Score: 0

You can mutate(across...) the columns in question and change the column names with rename_with:

df %>%
  mutate(across(matches("\\d$"), ~var1 - .)) %>%
  rename_with(~str_replace(., "$", "_1"), !matches("1$"))
# A tibble: 4 × 4
  obs_1  var1 var2_1 var3_1
  <dbl> <dbl>  <dbl>  <dbl>
1     1     0     -5     -9
2     2     0     -5     -9
3     3     0     -5     -9
4     4     0     -5     -9
