June 29, 2023, 19:30:04

mutate on selected columns based on name (regular expression)

Question

Here's a sample dataset:

ff <- data.frame(
       id = c(1:4),
       w1...33112..Value = c(10, 20, 30, 40),
       w1...33112..Time = c(4, 3, 2, 1),
       w2...33113..Value = c(1, .9, .75, .7),
       w2...33113..Time = c(10, 50, 30, 20),
       w3...33552..Value = c(1, 2, 3, 4),
       w3...33552..Time = c(.5, .5, .9, .9),
       w4...33442..Value = c(100, 50, 40, 30),
       w4...33442..Time = c(2, 1, 4, 3),
       w5...35692..Value = c(.5, .6, .7, .8)
       )

I want to apply a simple operation (usually diff) to columns selected by name: the column name must contain the string Value.

The example below covers two variables; the real data has dozens of such columns.

library(dplyr)

ff.2 <- ff %>% mutate(
  w1.used = c(0, diff(w1...33112..Value)),
  w2.used = c(0, diff(w2...33113..Value))
)

Each new column name should consist of the original name up to the first dot, followed by a chosen suffix (for example "used").

Answer 1

Score: 4

You could simply do:

library(dplyr)
library(stringr)

ff %>%
  mutate(across(ends_with('Value'), ~c(0, diff(.)),
                .names = "{str_extract(.col, 'w[0-9]+')}.used"))
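
The 'w[0-9]+' pattern above assumes every prefix is a w followed by digits. Below is a minimal sketch of a variant, assuming dplyr 1.0.0 or later, that takes everything before the first dot (as the question describes) and selects columns with a regular expression through matches(). The trimmed-down ff is for illustration only.

```r
library(dplyr)

# Trimmed-down version of the question's data, for illustration only
ff <- data.frame(
  id = 1:4,
  w1...33112..Value = c(10, 20, 30, 40),
  w1...33112..Time  = c(4, 3, 2, 1),
  w2...33113..Value = c(1, .9, .75, .7)
)

ff.2 <- ff %>%
  mutate(across(
    matches("Value"),                         # regex match on the column name
    ~ c(0, diff(.x)),                         # change from the previous row, 0 for the first row
    .names = "{sub('[.].*', '', .col)}.used"  # drop everything from the first dot onward
  ))

names(ff.2)
```

Note that matches() ignores case by default; pass ignore.case = FALSE if columns such as "value" should not be selected.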

Answer 2

Score: 1

The mutate_at() function lets you create new variables from existing variables that match a criterion, such as a name containing a given string. str_extract() then extracts the part of each name before the first dot to build the new column names.

Here is sample code that does what you want:

library(tidyverse)
# Select only columns with "Value" in their names
value_cols <- grep("Value", names(ff), value = TRUE)
# Create new columns (mutate_at names them "<original name>_used")
ff <- ff %>%
  mutate_at(
    .vars = value_cols,
    .funs = list(
      used = ~ c(0, diff(.))
    )
  )
# Rename new columns; "_used$" matches only the columns just created
new_col_names <- str_extract(value_cols, "^[^.]*") %>% paste0(".used")
names(ff)[grep("_used$", names(ff))] <- new_col_names
ff
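
Since dplyr 1.0.0, mutate_at() is superseded by across(). The same two-step idea (create the columns, then rename them) can be sketched as follows. This is one possible translation rather than the answer's own code: the intermediate "_used" suffix and the trimmed-down ff are choices made for this illustration.

```r
library(dplyr)

# Trimmed-down version of the question's data, for illustration only
ff <- data.frame(
  id = 1:4,
  w1...33112..Value = c(10, 20, 30, 40),
  w1...33112..Time  = c(4, 3, 2, 1),
  w2...33113..Value = c(1, .9, .75, .7)
)

ff.2 <- ff %>%
  # Step 1: create "<original name>_used" columns
  mutate(across(matches("Value"), ~ c(0, diff(.x)), .names = "{.col}_used")) %>%
  # Step 2: shorten those names to "<prefix before the first dot>.used"
  rename_with(~ sub("[.].*", ".used", .x), .cols = ends_with("_used"))

names(ff.2)
```

Restricting rename_with() to ends_with("_used") keeps the original columns untouched, which also avoids overwriting ff in place.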

Answer 3

Score: 1

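A base R option:

# Select the columns whose names end with "Value"
d <- ff[endsWith(names(ff), "Value")]
# Subtract the previous row from each row (the first row is compared with itself, giving 0)
u <- d - rbind(d[1, ], d[-nrow(d), ])
# Append the differences, renaming each column to "<prefix>.used"
ff.2 <- cbind(ff, setNames(u, sub("\\..*", ".used", names(u))))

gives

> ff.2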
  id w1...33112..Value w1...33112..Time w2...33113..Value w2...33113..Time
1  1                10                4              1.00               10
2  2                20                3              0.90               50
3  3                30                2              0.75               30
4  4                40                1              0.70               20
  w3...33552..Value w3...33552..Time w4...33442..Value w4...33442..Time
1                 1              0.5               100                2
2                 2              0.5                50                1
3                 3              0.9                40                4
4                 4              0.9                30                3
  w5...35692..Value w1.used w2.used w3.used w4.used w5.used
1               0.5       0    0.00       0       0     0.0
2               0.6      10   -0.10       1     -50     0.1
3               0.7      10   -0.15       1     -10     0.1
4               0.8      10   -0.05       1     -10     0.1
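
The row-shifting trick above is specific to differences. If the operation varies (the question says "usually diff"), a base R sketch with lapply() keeps the function swappable. The names value_cols and new_names and the trimmed-down ff are illustrative, not part of the original answer.

```r
# Trimmed-down version of the question's data, for illustration only
ff <- data.frame(
  id = 1:4,
  w1...33112..Value = c(10, 20, 30, 40),
  w1...33112..Time  = c(4, 3, 2, 1),
  w2...33113..Value = c(1, .9, .75, .7)
)

# Columns whose names contain "Value"
value_cols <- grep("Value", names(ff), value = TRUE)
# New names: the prefix before the first dot, plus ".used"
new_names <- sub("[.].*", ".used", value_cols)

ff.2 <- ff
# Apply the operation column by column; swap in another function as needed
ff.2[new_names] <- unname(lapply(ff[value_cols], function(x) c(0, diff(x))))

names(ff.2)
```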
