How to Calculate the Percentage Contribution Within Rows in a Data Frame in R

Question

I would like to know whether we can calculate the percentage contribution of each value within a row of a data frame.

The data frame I am working with is:

    structure(list(`Row Labels` = c("X1", "X2", "X3", "X4"), `2019-01-01` = c(37,
    36, 45, 53), `2019-02-01` = c(3, 19, 14, 46), `2019-03-01` = c(28,
    2, 28, 28), `2019-04-01` = c(48, 70, 18, 16), `2019-05-01` = c(83,
    71, 58, 26), `2019-06-01` = c(85, 28, 83, 46), `2019-07-01` = c(60,
    20, 12, 77), `2019-08-01` = c(44, 66, 30, 99), `2019-09-01` = c(21,
    14, 31, 21), `2019-10-01` = c(26, 72, 72, 16), `2019-11-01` = c(15,
    96, 23, 100), `2019-12-01` = c(65, 0, 98, 66)), row.names = c(NA,
    -4L), class = c("tbl_df", "tbl", "data.frame"))

The code I wrote for this is:

    Book1 <- read_excel("X:/X/X/X - X/X/Book1.xlsx")
    First_Date <- "2019-01-01"
    Last_Date <- "2019-12-01"
    Book1 <- Book1 %>%
      mutate(Sum = rowSums(pick(any_of(First_Date):any_of(Last_Date)))) %>%
      mutate(across(pick(any_of(First_Date):any_of(Last_Date), ~./rowSums(pick(any_of(First_Date):any_of(Last_Date)))),.names = "{.col}_%"))

When I run this code, I get the following error:

    Error in `mutate()`:
    In argument: `across(...)`.
    Caused by error in `pick()`:
    ! Formula shorthand must be wrapped in `where()`.

      # Bad
      data %>% select(~./rowSums(pick(any_of(First_Date):any_of(Last_Date))))

      # Good
      data %>% select(where(~./rowSums(pick(any_of(First_Date):any_of(Last_Date)))))

The code above should produce the output shown in the image below:

[Image not preserved]

but I don't mind if the output is in this format instead:

[Image not preserved]

Can someone tell me what I am getting wrong when calculating the contribution? That would be helpful. Is there a simpler way of doing it?

Answer 1

Score: 1

For the second output format, you can use dplyr:

    library(dplyr)

    df %>%
      rowwise() %>%
      mutate(across(where(is.numeric), ~ .x / sum(across(where(is.numeric)))))

    # A tibble: 4 × 13
    # Rowwise:
      `Row Labels` `2019-01-01` `2019-02-01` `2019-03-01` `2019-04-01` `2019-05-01` `2019-06-01` `2019-07-01` `2019-08-01` `2019-09-01` `2019-10-01` `2019-11-01`
      <chr>               <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
    1 X1                 0.0718      0.00583       0.0544       0.0932        0.161        0.165        0.117       0.0854       0.0408       0.0505       0.0291
    2 X2                 0.0729       0.0385      0.00405        0.142        0.144       0.0567       0.0405        0.134       0.0283        0.146        0.194
    3 X3                 0.0879       0.0273       0.0547       0.0352        0.113        0.162       0.0234       0.0586       0.0605        0.141       0.0449
    4 X4                 0.0892       0.0774       0.0471       0.0269       0.0438       0.0774        0.130        0.167       0.0354       0.0269        0.168
    # ℹ 1 more variable: `2019-12-01` <dbl>
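
On the "simpler way" part of the question, the same row-wise proportions can also be computed without `rowwise()`, because dividing a data frame by a vector of row totals is already vectorized in base R. The following is only a sketch, assuming a three-month subset of the question's data (rebuilt here so the block runs on its own); the names `num_cols` and `shares` are illustrative and do not come from the original post.

```r
# A three-month subset of the question's data, rebuilt for illustration.
# check.names = FALSE keeps the date-like column names unchanged.
df <- data.frame(
  `Row Labels` = c("X1", "X2", "X3", "X4"),
  `2019-01-01` = c(37, 36, 45, 53),
  `2019-02-01` = c(3, 19, 14, 46),
  `2019-03-01` = c(28, 2, 28, 28),
  check.names = FALSE
)

# Logical vector marking the numeric (date) columns.
num_cols <- vapply(df, is.numeric, logical(1))

# rowSums() returns one total per row; dividing the data frame by that
# vector divides every column element-wise by the matching row total.
shares <- df
shares[num_cols] <- df[num_cols] / rowSums(df[num_cols])

shares
```

Each row of the numeric columns in `shares` then sums to 1. A row whose total is 0 would produce `NaN` values, so such rows may need separate handling.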

To get percentages instead of proportions, multiply by 100:

    df %>%
      rowwise() %>%
      mutate(across(
        where(is.numeric), ~ .x / sum(across(where(is.numeric))) * 100
      ))
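
As for what went wrong in the question's code: reading the parentheses, the lambda `~./rowSums(...)` is passed as a second argument to `pick()` rather than as the function argument of `across()`, which appears to be why the error about formula shorthand is raised from `pick()`. The sketch below is one possible way to get the first output format (original columns, a `Sum` column, and new `_%` columns). It assumes dplyr 1.1.0 or later for `pick()`, rebuilds the question's data as `df` instead of reading the Excel file, and introduces `date_cols` and `result` as illustrative names that are not in the original post.

```r
library(dplyr)

# The data frame from the question, rebuilt so the sketch is self-contained.
df <- tibble(
  `Row Labels` = c("X1", "X2", "X3", "X4"),
  `2019-01-01` = c(37, 36, 45, 53),
  `2019-02-01` = c(3, 19, 14, 46),
  `2019-03-01` = c(28, 2, 28, 28),
  `2019-04-01` = c(48, 70, 18, 16),
  `2019-05-01` = c(83, 71, 58, 26),
  `2019-06-01` = c(85, 28, 83, 46),
  `2019-07-01` = c(60, 20, 12, 77),
  `2019-08-01` = c(44, 66, 30, 99),
  `2019-09-01` = c(21, 14, 31, 21),
  `2019-10-01` = c(26, 72, 72, 16),
  `2019-11-01` = c(15, 96, 23, 100),
  `2019-12-01` = c(65, 0, 98, 66)
)

First_Date <- "2019-01-01"
Last_Date <- "2019-12-01"

# Resolve the column range once, by position (illustrative helper variable).
date_cols <- names(df)[match(First_Date, names(df)):match(Last_Date, names(df))]

result <- df %>%
  # Row totals over the selected date columns.
  mutate(Sum = rowSums(pick(all_of(date_cols)))) %>%
  # The column selection is across()'s first argument and the lambda its
  # second; each value is divided by its row total and scaled to percent.
  mutate(across(all_of(date_cols), ~ .x / Sum * 100, .names = "{.col}_%"))

result
```

Computing `Sum` in its own `mutate()` step means the row totals are calculated once and reused for every column, and the original values stay untouched because `.names` writes the percentages to new columns.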

Posted by huangapple on June 8, 2023 at 20:04:10
When reposting, please retain the link to this article: https://go.coder-hub.com/76431681.html