在R中动态地对每一列执行特定的数学函数。

huangapple go评论58阅读模式
英文:

Perform a specific Mathematical Function on each column dynamically in R

问题

我想要在数据框中对每个唯一项目执行数学函数。

通常,要执行数学函数,我们使用 mutate 语句并创建一个列,然后在每个 mutate 语句之后手动执行数学函数。

这在几列上是可行的。但如果我有100列,并且需要执行2-5个数学函数,例如:一个是初始数字的增加20%,另一个是在每列上将初始数字除以2并保持原始列不变。

除了为每个特定项目编写 mutate 语句之外,R中是否有可能实现这一点?

我正在使用的数据框是:

structure(list(`Row Labels` = c("2023-03-01", "2023-04-01", "2023-05-01", 
"2023-06-01", "2023-07-01", "2023-08-01", "2023-09-01", "2023-10-01"
), X6 = c(14, 16, 14, 11, 9, 9, 11, 11), X7 = c(50, 50, 50, 50, 
50, 50, 50, 50), X8 = c(75, 75, 75, 75, 75, 75, 75, 75), X9 = c(100, 
100, 100, 100, 100, 100, 100, 100), X11 = c(25, 25, 50, 75, 125, 
200, 325, 525), X12 = c(50, 50, 100, 150, 250, 400, 650, 1050
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-8L))

对于单独的情况,以下代码就足够了:

library(readxl)
library(dplyr)
Book1 <- read_excel("C:/X/X/X- X/X/Book1.xlsx", sheet = "Sheet6")

dput(Book1)

Book1 <- Book1 %>%
  mutate(`X6 20%` = X6*1.20) %>%
  mutate(`X6 by 2` = X6/2)

我考虑过通过循环来运行这个代码,但是选择要进行乘法的列会成为一个问题,因为我们必须在 mutate 语句中指定列名,我认为这在这里可能不可行。

有没有人能告诉我是否可以用一种简单的方法实现这个目标?

期望的输出如下:

在R中动态地对每一列执行特定的数学函数。

英文:

I wanted to perform a mathematical function on each unique item in a data frame dynamically.

Normally to perform a mathematical function, we use mutate statement and create a column and perform the mathematical function manually by writing mutate statement after mutate statement.

Which is feasible on a few columns. But what if I have 100 columns and I have to perform 2-5 mathematical function, For example: one would be 20% increase on the initial number, The other one would be to divide the initial number by 2 on each column and keep the original column as is.

Is this possible in R other than writing mutate statement for each specific item?

The data frame I am working with is:

structure(list(`Row Labels` = c(&quot;2023-03-01&quot;, &quot;2023-04-01&quot;, &quot;2023-05-01&quot;, 
&quot;2023-06-01&quot;, &quot;2023-07-01&quot;, &quot;2023-08-01&quot;, &quot;2023-09-01&quot;, &quot;2023-10-01&quot;
), X6 = c(14, 16, 14, 11, 9, 9, 11, 11), X7 = c(50, 50, 50, 50, 
50, 50, 50, 50), X8 = c(75, 75, 75, 75, 75, 75, 75, 75), X9 = c(100, 
100, 100, 100, 100, 100, 100, 100), X11 = c(25, 25, 50, 75, 125, 
200, 325, 525), X12 = c(50, 50, 100, 150, 250, 400, 650, 1050
)), class = c(&quot;tbl_df&quot;, &quot;tbl&quot;, &quot;data.frame&quot;), row.names = c(NA, 
-8L))

For individual cases this code would suffice:

library(readxl)
library(dplyr)
Book1 &lt;- read_excel(&quot;C:/X/X/X- X/X/Book1.xlsx&quot;,sheet = &quot;Sheet6&quot;)

dput(Book1)

Book1 &lt;- Book1 %&gt;% 
  mutate(`X6 20%` = X6*1.20) %&gt;% 
  mutate(`X6 by 2`= X6/2)

I was thinking of running this through a loop but then selection of columns to multiple becomes a problem as we have to specify the column name in mutate statement, which I believe would not be possible here right.

Can anyone let me know if this can be achieved in a simple approach?

The expected output is given below:

在R中动态地对每一列执行特定的数学函数。

答案1

得分: 2

We could use across()

update: shorter:

library(dplyr)

df %>% 
  mutate(across(2:7, list("20" = ~. * 1.20, 
                          "By_2" = ~. / 2), .names = "{col}_{fn}"))

first answer:

library(dplyr)

df %>% 
  mutate(across(2:7, ~. * 1.20, .names = "{.col}_20%"),
         across(2:7, ~. /2, .names = "{.col}_By 2"))

  `Row Labels`    X6    X7    X8    X9   X11   X12 `X6_20%` `X7_20%` `X8_20%` `X9_20%` `X11_20%` `X12_20%` `X6_By 2` `X7_By 2` `X8_By 2` `X9_By 2` `X11_By 2` `X12_By 2`
  &lt;chr&gt;        &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;      &lt;dbl&gt;      &lt;dbl&gt;
1 2023-03-01      14    50    75   100    25    50     16.8       60       90      120        30        60       7          25      37.5        50       12.5         25
2 2023-04-01      16    50    75   100    25    50     19.2       60       90      120        30        60       8          25      37.5        50       12.5         25
3 2023-05-01      14    50    75   100    50   100     16.8       60       90      120        60       120       7          25      37.5        50       25           50
4 2023-06-01      11    50    75   100    75   150     13.2       60       90      120        90       180       5.5        25      37.5        50       37.5         75
5 2023-07-01       9    50    75   100   125   250     10.8       60       90      120       150       300       4.5        25      37.5        50       62.5        125
6 2023-08-01       9    50    75   100   200   400     10.8       60       90      120       240       480       4.5        25      37.5        50      100          200
7 2023-09-01      11    50    75   100   325   650     13.2       60       90      120       390       780       5.5        25      37.5        50      162.         325
8 2023-10-01      11    50    75   100   525  1050     13.2       60       90      120       630      1260       5.5        25      37.5        50      262.         525
英文:

We could use across()

update: shorter:

library(dplyr)

df %&gt;% 
  mutate(across(2:7, list(&quot;20&quot; = ~. * 1.20, 
                          &quot;By_2&quot; = ~. / 2), .names = &quot;{col}_{fn}&quot;))

first answer:

library(dplyr)

df %&gt;% 
  mutate(across(2:7, ~. * 1.20, .names = &quot;{.col}_20%&quot;),
         across(2:7, ~. /2, .names = &quot;{.col}_By 2&quot;))

  `Row Labels`    X6    X7    X8    X9   X11   X12 `X6_20%` `X7_20%` `X8_20%` `X9_20%` `X11_20%` `X12_20%` `X6_By 2` `X7_By 2` `X8_By 2` `X9_By 2` `X11_By 2` `X12_By 2`
  &lt;chr&gt;        &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt; &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;    &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt;      &lt;dbl&gt;      &lt;dbl&gt;
1 2023-03-01      14    50    75   100    25    50     16.8       60       90      120        30        60       7          25      37.5        50       12.5         25
2 2023-04-01      16    50    75   100    25    50     19.2       60       90      120        30        60       8          25      37.5        50       12.5         25
3 2023-05-01      14    50    75   100    50   100     16.8       60       90      120        60       120       7          25      37.5        50       25           50
4 2023-06-01      11    50    75   100    75   150     13.2       60       90      120        90       180       5.5        25      37.5        50       37.5         75
5 2023-07-01       9    50    75   100   125   250     10.8       60       90      120       150       300       4.5        25      37.5        50       62.5        125
6 2023-08-01       9    50    75   100   200   400     10.8       60       90      120       240       480       4.5        25      37.5        50      100          200
7 2023-09-01      11    50    75   100   325   650     13.2       60       90      120       390       780       5.5        25      37.5        50      162.         325
8 2023-10-01      11    50    75   100   525  1050     13.2       60       90      120       630      1260       5.5        25      37.5        50      262.         525

huangapple
  • 本文由 发表于 2023年6月1日 23:37:47
  • 转载请务必保留本文链接:https://go.coder-hub.com/76383571.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定