Mutate selected columns based on name (regular expression)
Question
Here's a sample dataset:
ff <- data.frame(
  id = c(1:4),
  w1...33112..Value = c(10, 20, 30, 40),
  w1...33112..Time = c(4, 3, 2, 1),
  w2...33113..Value = c(1, .9, .75, .7),
  w2...33113..Time = c(10, 50, 30, 20),
  w3...33552..Value = c(1, 2, 3, 4),
  w3...33552..Time = c(.5, .5, .9, .9),
  w4...33442..Value = c(100, 50, 40, 30),
  w4...33442..Time = c(2, 1, 4, 3),
  w5...35692..Value = c(.5, .6, .7, .8)
)
I want to perform some simple operations (usually with diff) on columns selected by name: the column name must contain the string Value.
The example below handles two variables by hand; the real data has dozens of such columns.
library(dplyr)
ff.2 <- ff %>% mutate(
  w1.used = c(0, diff(w1...33112..Value)),
  w2.used = c(0, diff(w2...33113..Value))
)
The name of each new column should consist of the original name's characters up to the first dot, followed by a chosen suffix (for example "used"), giving names such as w1.used.
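The naming rule can be sketched on its own, before any columns are computed. This is a minimal base R illustration, assuming the prefix is everything before the first dot; the variable names `value_cols`, `suffix`, and `new_names` are hypothetical and not part of the original question.

```r
# Column names taken from the sample data above
value_cols <- c("w1...33112..Value", "w2...33113..Value", "w5...35692..Value")

# Hypothetical suffix, chosen for illustration
suffix <- "used"

# Drop everything from the first dot onward, then append the suffix
new_names <- paste0(sub("[.].*$", "", value_cols), ".", suffix)

print(new_names)
# "w1.used" "w2.used" "w5.used"
```

Using `[.]` rather than `\\.` for the literal dot avoids one layer of backslash escaping.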
Answer 1
Score: 4
You could simply do:
library(dplyr)    # mutate(), across(), ends_with()
library(stringr)  # str_extract()
ff %>%
  mutate(across(ends_with('Value'), ~ c(0, diff(.)),
                .names = "{str_extract(.col, 'w[0-9]+')}.used"))
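A possible variation on this answer, in case the prefixes do not all look like `w` followed by digits: the sketch below selects with `contains("Value")`, since the question asks for names that contain the string, and derives the prefix by cutting at the first dot. It assumes a reduced copy of the sample data and that `.names` accepts an arbitrary R expression inside the braces, which is how the answer above already uses it.

```r
library(dplyr)

# Reduced copy of the sample data, for illustration only
ff <- data.frame(
  id = 1:4,
  w1...33112..Value = c(10, 20, 30, 40),
  w1...33112..Time  = c(4, 3, 2, 1),
  w2...33113..Value = c(1, .9, .75, .7)
)

ff.2 <- ff %>%
  mutate(across(
    contains("Value"),                        # every column whose name contains "Value"
    ~ c(0, diff(.x)),                         # first row gets 0, then successive differences
    .names = "{sub('[.].*', '', .col)}.used"  # prefix up to the first dot + ".used"
  ))

print(names(ff.2))
```

The original columns are kept, and the new ones are appended at the end in the order the source columns appear.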
Answer 2
Score: 1
The mutate_at() function creates new variables from existing variables that match a criterion, such as having a name that contains a given string. str_extract() can then extract the part of each name before the first dot to build the new column names.
Here is sample code that does this:
library(tidyverse)
# Select only columns with "Value" in their names
value_cols <- grep("Value", names(ff), value = TRUE)
# Create new columns
ff <- ff %>%
  mutate_at(
    .vars = value_cols,
    .funs = list(
      used = ~ c(0, diff(.))
    )
  )
# Rename new columns
new_col_names <- str_extract(value_cols, "^[^.]*") %>% paste0(".used")
names(ff)[grep("used", names(ff))] <- new_col_names
ff
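One thing to watch in the rename step above is that `grep("used", names(ff))` would also hit any pre-existing column whose name happens to contain "used". As a hedged alternative, and noting that `mutate_at()` is marked as superseded by `across()` in current dplyr, the same two-step idea can be sketched with `across()` and `rename_with()`. The temporary `_used` suffix and the reduced data frame are assumptions made for this example.

```r
library(dplyr)

# Reduced copy of the sample data, for illustration only
ff <- data.frame(
  id = 1:4,
  w1...33112..Value = c(10, 20, 30, 40),
  w1...33112..Time  = c(4, 3, 2, 1),
  w2...33113..Value = c(1, .9, .75, .7)
)

# Select only columns with "Value" in their names
value_cols <- grep("Value", names(ff), value = TRUE)

ff.2 <- ff %>%
  # Step 1: create the new columns with a temporary "_used" suffix
  mutate(across(all_of(value_cols), ~ c(0, diff(.x)), .names = "{.col}_used")) %>%
  # Step 2: rename only those columns to "<prefix>.used"
  rename_with(~ paste0(sub("[.].*", "", .x), ".used"), .cols = ends_with("_used"))

print(names(ff.2))
```

Because the rename targets the exact temporary suffix, it does not depend on column positions or on a loose substring match.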
Answer 3
Score: 1
A base R option:
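# Keep the columns whose names end with "Value"
d <- ff[endsWith(names(ff), "Value")]
# Subtract the previous row from each row (the first row is subtracted from itself, giving 0)
u <- d - rbind(d[1, ], d[-nrow(d), ])
# Append the differences, renamed to "<prefix>.used"
ff.2 <- cbind(ff, setNames(u, sub("\\..*", ".used", names(u))))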
gives
> ff.2
  id w1...33112..Value w1...33112..Time w2...33113..Value w2...33113..Time
1  1                10                4              1.00               10
2  2                20                3              0.90               50
3  3                30                2              0.75               30
4  4                40                1              0.70               20
  w3...33552..Value w3...33552..Time w4...33442..Value w4...33442..Time
1                 1              0.5               100                2
2                 2              0.5                50                1
3                 3              0.9                40                4
4                 4              0.9                30                3
  w5...35692..Value w1.used w2.used w3.used w4.used w5.used
1               0.5       0    0.00       0       0     0.0
2               0.6      10   -0.10       1     -50     0.1
3               0.7      10   -0.15       1     -10     0.1
4               0.8      10   -0.05       1     -10     0.1
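The row-shift subtraction in this answer is another way of writing `c(0, diff(x))` for every selected column at once. The sketch below, on a reduced copy of the sample data, compares the two formulations column by column; `u_shift`, `u_diff`, and `same` are hypothetical names introduced for the comparison.

```r
# Reduced copy of the sample data, for illustration only
ff <- data.frame(
  id = 1:4,
  w1...33112..Value = c(10, 20, 30, 40),
  w1...33112..Time  = c(4, 3, 2, 1),
  w2...33113..Value = c(1, .9, .75, .7)
)

d <- ff[endsWith(names(ff), "Value")]

# Approach from the answer: subtract a copy of d shifted down by one row
u_shift <- d - rbind(d[1, ], d[-nrow(d), ])

# Per-column formulation with diff(), padding the first row with 0
u_diff <- as.data.frame(lapply(d, function(x) c(0, diff(x))))

# Compare the two results column by column
same <- all(mapply(function(a, b) isTRUE(all.equal(a, b)), u_shift, u_diff))
print(same)
```

The shifted version works on the whole data frame in one vectorised step, while the `lapply()` version makes the link to `diff()` explicit.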