Mutate on selected columns based on name (regular expression)

Question

Here's a sample dataset:

ff <- data.frame(
  id = 1:4,
  w1...33112..Value = c(10, 20, 30, 40),
  w1...33112..Time = c(4, 3, 2, 1),
  w2...33113..Value = c(1, .9, .75, .7),
  w2...33113..Time = c(10, 50, 30, 20),
  w3...33552..Value = c(1, 2, 3, 4),
  w3...33552..Time = c(.5, .5, .9, .9),
  w4...33442..Value = c(100, 50, 40, 30),
  w4...33442..Time = c(2, 1, 4, 3),
  w5...35692..Value = c(.5, .6, .7, .8)
)

I want to apply a simple operation (usually diff()) to columns selected by name: the column name must contain the string "Value".

The example below handles only two variables; the real data has dozens of them.

library(dplyr)

ff.2 <- ff %>% mutate(
  w1.used = c(0, diff(w1...33112..Value)),
  w2.used = c(0, diff(w2...33113..Value))
)

Each new column's name should be the original name up to the first dot, followed by a chosen suffix (for example "used"), giving w1.used, w2.used, and so on.

Answer 1

Score: 4

You can do this with across(), building the new names with a glue expression in .names:

library(dplyr)
library(stringr)

ff %>%
  mutate(across(ends_with('Value'), ~c(0, diff(.)),
                .names = "{str_extract(.col, 'w[0-9]+')}.used"))
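
If the selection needs a genuine regular expression rather than a fixed suffix, matches() can stand in for ends_with(). The following is a minimal sketch under that assumption: ff_small is a cut-down stand-in for the question's data and prefix() is a hypothetical helper, neither of which appears in the original answer. Base sub() is used instead of str_extract() so that only dplyr is required.

```r
library(dplyr)

# Cut-down stand-in for the question's data (illustrative only)
ff_small <- data.frame(
  id = 1:4,
  w1...33112..Value = c(10, 20, 30, 40),
  w1...33112..Time  = c(4, 3, 2, 1),
  w2...33113..Value = c(1, 0.9, 0.75, 0.7)
)

# Hypothetical helper: keep everything before the first dot
prefix <- function(x) sub("\\..*", "", x)

res <- ff_small %>%
  mutate(across(
    matches("\\.\\.Value$"),          # regex: name ends in "..Value"
    ~ c(0, diff(.x)),                 # first difference, padded with a leading 0
    .names = "{prefix(.col)}.used"    # e.g. "w1...33112..Value" becomes "w1.used"
  ))

print(names(res))
```

The Time column is left alone because its name does not match the pattern, and the two new columns are appended after the existing ones.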

Answer 2

Score: 1

mutate_at() (superseded by across() since dplyr 1.0.0, but still available) creates new variables from existing columns that match a criterion, such as a name containing a given string. str_extract() then pulls out the part of each name before the first dot, which is used to build the new column names.

Here is sample code that does this:

library(tidyverse)

# Names of the columns that contain "Value"
value_cols <- grep("Value", names(ff), value = TRUE)

# Create the new columns; mutate_at() names them <original>_used
ff <- ff %>%
  mutate_at(
    .vars = value_cols,
    .funs = list(
      used = ~ c(0, diff(.))
    )
  )

# Rename the new columns to <prefix>.used
new_col_names <- str_extract(value_cols, "^[^.]*") %>% paste0(".used")

# Anchor the pattern so only the freshly created *_used columns are renamed
names(ff)[grep("_used$", names(ff))] <- new_col_names

ff
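
The renaming step depends on the pattern ^[^.]* capturing everything before the first dot. This can be checked in isolation. The sketch below uses base R regmatches() and regexpr() in place of str_extract(), a substitution made here only so the snippet has no package dependency; the column names are copied from the question.

```r
# Column names as they appear in the question's data
value_cols <- c("w1...33112..Value", "w2...33113..Value", "w5...35692..Value")

# "^[^.]*" matches the leading run of non-dot characters
prefixes <- regmatches(value_cols, regexpr("^[^.]*", value_cols))

# Append the chosen suffix
new_col_names <- paste0(prefixes, ".used")

print(new_col_names)
# [1] "w1.used" "w2.used" "w5.used"
```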

Answer 3

Score: 1

A base R option:

# Keep the columns whose names end in "Value"
d <- ff[endsWith(names(ff), "Value")]

# Subtract the previous row from each row (the first row is subtracted from itself, giving 0)
u <- d - rbind(d[1, ], d[-nrow(d), ])

# Append the differences, renaming each column to <prefix>.used
ff.2 <- cbind(ff, setNames(u, sub("\\..*", ".used", names(u))))

This gives:

> ff.2
  id w1...33112..Value w1...33112..Time w2...33113..Value w2...33113..Time
1  1                10                4              1.00               10
2  2                20                3              0.90               50
3  3                30                2              0.75               30
4  4                40                1              0.70               20
  w3...33552..Value w3...33552..Time w4...33442..Value w4...33442..Time
1                 1              0.5               100                2
2                 2              0.5                50                1
3                 3              0.9                40                4
4                 4              0.9                30                3
  w5...35692..Value w1.used w2.used w3.used w4.used w5.used
1               0.5       0    0.00       0       0     0.0
2               0.6      10   -0.10       1     -50     0.1
3               0.7      10   -0.15       1     -10     0.1
4               0.8      10   -0.05       1     -10     0.1
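
The row-shift subtraction should produce the same values as applying c(0, diff(x)) to each column, which is the operation the question asks for. A small sketch to confirm this, assuming a cut-down data frame ff_small (an illustrative stand-in, not part of the original answer):

```r
# Cut-down stand-in for the question's data (illustrative only)
ff_small <- data.frame(
  id = 1:4,
  w1...33112..Value = c(10, 20, 30, 40),
  w1...33112..Time  = c(4, 3, 2, 1),
  w2...33113..Value = c(1, 0.9, 0.75, 0.7)
)

d <- ff_small[endsWith(names(ff_small), "Value")]

# Approach from the answer: subtract a copy of d shifted down by one row
shifted <- d - rbind(d[1, ], d[-nrow(d), ])

# Column-wise first difference, padded with a leading 0
diffed <- lapply(d, function(x) c(0, diff(x)))

# Compare the raw values, ignoring names and row names
same <- isTRUE(all.equal(unlist(shifted, use.names = FALSE),
                         unlist(diffed, use.names = FALSE)))
print(same)
```

Either form can be used; the lapply() variant may be easier to adapt when the per-column operation is something other than a plain difference.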

huangapple
  • Published on June 29, 2023, 19:30:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76580615.html