Perform a specific Mathematical Function on each column dynamically in R
Question
I want to apply a mathematical function to every column of a data frame dynamically.
Normally we do this with mutate: create a new column and apply the function by hand, writing one mutate statement after another.
That is feasible for a few columns. But what if I have 100 columns and need to apply 2-5 functions to each of them? For example, one would increase the original value by 20%, and another would divide the original value by 2, while keeping the original columns as they are.
Is this possible in R without writing a mutate statement for each specific column?
The data frame I am working with is:
structure(list(`Row Labels` = c("2023-03-01", "2023-04-01", "2023-05-01",
"2023-06-01", "2023-07-01", "2023-08-01", "2023-09-01", "2023-10-01"
), X6 = c(14, 16, 14, 11, 9, 9, 11, 11), X7 = c(50, 50, 50, 50,
50, 50, 50, 50), X8 = c(75, 75, 75, 75, 75, 75, 75, 75), X9 = c(100,
100, 100, 100, 100, 100, 100, 100), X11 = c(25, 25, 50, 75, 125,
200, 325, 525), X12 = c(50, 50, 100, 150, 250, 400, 650, 1050
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-8L))
For individual cases this code would suffice:
library(readxl)
library(dplyr)
Book1 <- read_excel("C:/X/X/X- X/X/Book1.xlsx", sheet = "Sheet6")
dput(Book1)
Book1 <- Book1 %>%
  mutate(`X6 20%` = X6 * 1.20) %>%
  mutate(`X6 by 2` = X6 / 2)
I was thinking of running this through a loop, but then selecting the columns to multiply becomes a problem: we have to specify the column name in the mutate statement, which I believe would not be possible here.
Can anyone let me know whether this can be achieved with a simple approach?
The expected output is given below:
Answer 1
Score: 2
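We could use across().
Update, a shorter version (note that it names the new columns X6_20, X6_By_2, and so on):
library(dplyr)
# df is the data frame from the question (called Book1 there)
df %>%
  mutate(across(2:7,
                list("20" = ~ . * 1.20, "By_2" = ~ . / 2),
                .names = "{.col}_{.fn}"))
First answer (this is the version that produces the output below):
library(dplyr)
df %>%
  mutate(across(2:7, ~ . * 1.20, .names = "{.col}_20%"),
         across(2:7, ~ . / 2, .names = "{.col}_By 2"))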
`Row Labels` X6 X7 X8 X9 X11 X12 `X6_20%` `X7_20%` `X8_20%` `X9_20%` `X11_20%` `X12_20%` `X6_By 2` `X7_By 2` `X8_By 2` `X9_By 2` `X11_By 2` `X12_By 2`
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2023-03-01 14 50 75 100 25 50 16.8 60 90 120 30 60 7 25 37.5 50 12.5 25
2 2023-04-01 16 50 75 100 25 50 19.2 60 90 120 30 60 8 25 37.5 50 12.5 25
3 2023-05-01 14 50 75 100 50 100 16.8 60 90 120 60 120 7 25 37.5 50 25 50
4 2023-06-01 11 50 75 100 75 150 13.2 60 90 120 90 180 5.5 25 37.5 50 37.5 75
5 2023-07-01 9 50 75 100 125 250 10.8 60 90 120 150 300 4.5 25 37.5 50 62.5 125
6 2023-08-01 9 50 75 100 200 400 10.8 60 90 120 240 480 4.5 25 37.5 50 100 200
7 2023-09-01 11 50 75 100 325 650 13.2 60 90 120 390 780 5.5 25 37.5 50 162. 325
8 2023-10-01 11 50 75 100 525 1050 13.2 60 90 120 630 1260 5.5 25 37.5 50 262. 525
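With 100 columns, hard-coding positions such as 2:7 may become awkward. As a minimal sketch, assuming every numeric column should be transformed, the columns could be selected with where(is.numeric) instead. The small df below is a stand-in for the question's data, and the list name fns and the suffixes up20 and by2 are hypothetical names chosen for illustration. The loop at the end shows that a plain base R loop also works, because column names can be built as strings with paste0() and used with [[ ]].

```r
library(dplyr)

# Stand-in for the question's data: first three rows, two numeric columns
df <- tibble(
  `Row Labels` = c("2023-03-01", "2023-04-01", "2023-05-01"),
  X6 = c(14, 16, 14),
  X7 = c(50, 50, 50)
)

# Hypothetical named list of transformations; the names become column suffixes
fns <- list(
  up20 = ~ .x * 1.20,  # increase by 20%
  by2  = ~ .x / 2      # divide by 2
)

# where(is.numeric) selects every numeric column, so no positions are hard-coded;
# the original columns are kept because .names gives the results new names
out <- df %>%
  mutate(across(where(is.numeric), fns, .names = "{.col}_{.fn}"))

# The same result with a base R loop: names are built as strings
num_cols <- names(df)[vapply(df, is.numeric, logical(1))]
out_loop <- df
for (col in num_cols) {
  out_loop[[paste0(col, "_up20")]] <- df[[col]] * 1.20
  out_loop[[paste0(col, "_by2")]]  <- df[[col]] / 2
}
```

Adding a third or fourth function then only means adding another named entry to fns; the mutate() call itself does not change.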