How to perform internal dataframe or matrix calculations as a dataframe or matrix is being generated inside a Tidyverse map() function?

# Question

I started using the tidyverse `map()` function to build lists with embedded dataframes and matrices. The code below defines `createBucketMap()`, which works fine for calculating the first two columns ("Inflow" and "Due") of each rendered dataframe, as shown in the first image below for the first rendered dataframe. Those calculations draw on lists and vectors defined outside `createBucketMap()`, per the lines `alc <- allocate[[matL_name]]` and `rate <- rates[[matL_name]]`.

Now I'm trying to add code inside `createBucketMap()` that populates the rest of the dataframe columns with calculated values as the dataframes are built. My comments in the code below describe what each column should do. My novice temptation is to use a series of `dplyr` `mutate()` statements outside of `map()` to add the calculated columns from "Cover_due" through "Outflow", but that seems clumsy. Is there a clean way to populate these columns using `map()`, or any other `tidyverse` function consistent with the `map()` structure used below?

The calculated columns are internal dataframe calculations: they refer to other columns in the same dataframe, not to anything outside it. This is the kind of thing `mutate()` excels at.

Just one example calculation for the "Cover_due" column would give me the start I need. "Cover_due" should equal the lesser of "Inflow" and "Due" in the same row.

Below is the first rendered list/dataframe item when running the `map()` code in this post, where only the first two columns are calculated and populated:

[![enter image description here][1]][1]

Below is the correct output from a long and clumsy for-loop solution that doesn't use `map()`, with all fields correctly calculated and populated:

[![enter image description here][2]][2]

Code:

```R
seriesVector <- function() {c("mat_One", "mat_Two")}
matList <- list(mat_One = c("Boy", "Cat"), mat_Two = c("Boy", "Bat"))
allocate <- list(mat_One = c(0.6, 0.5, 0.4), mat_Two = c(0.4, 0.5, 0.6))
flowVector <- c(6, 5, 600) # was 4
balVector <- c(1000, 900, 800)
rates <- list(mat_One = 0.10, mat_Two = 0.20)

library(tidyverse)

createBucketMap <- function() {
  imap(matList, \(matL, matL_name) {
    map(matL, \(x) {
      alc <- allocate[[matL_name]]
      rate <- rates[[matL_name]]
      data.frame(
        Inflow = flowVector * alc,
        Due = balVector * alc * rate,
        Cover_due = 0, # minimum of "Inflow" and "Due" in the same row
        Shortfall = 0, # equal to "Due" minus "Cover_due" in the same row
        Begin_cum_shortfall = 0, # equal to the prior (lagged) row of "End_cum_shortfall", except that in row 1 "Begin_cum_shortfall" is zero
        Cover_cum_sfall = 0, # equal to the lesser of (a) "Inflow" minus "Cover_due" and (b) "Begin_cum_shortfall", all in the same row
        End_cum_shortfall = 0, # equal to "Shortfall" plus "Begin_cum_shortfall" minus "Cover_cum_sfall", all in the same row
        Outflow = 0 # equal to "Inflow" minus "Cover_due" minus "Cover_cum_sfall", all in the same row
      ) |>
        as.matrix()
    }) |>
      set_names(matL)
  })
}
createBucketMap()
```

[1]: https://i.stack.imgur.com/hhet7.png
[2]: https://i.stack.imgur.com/sk6sy.png
# Answer 1

**Score**: 3

You could simply define the variables as free-standing objects before collecting them in the frame.

```R
seriesVector <- function() {
  c("mat_One", "mat_Two")
}
matList <- list(mat_One = c("Boy", "Cat"), mat_Two = c("Boy", "Bat"))
allocate <- list(mat_One = c(0.6, 0.5, 0.4), mat_Two = c(0.4, 0.5, 0.6))
flowVector <- c(6, 5, 600) # was 4
balVector <- c(1000, 900, 800)
rates <- list(mat_One = 0.10, mat_Two = 0.20)

library(tidyverse)

createBucketMap <- function() {
  imap(matList, \(matL, matL_name) {
    map(matL, \(x) {
      alc <- allocate[[matL_name]]
      rate <- rates[[matL_name]]
      { # the various calculations
        Inflow <- flowVector * alc
        Due <- balVector * alc * rate
        Cover_due <- pmin(Inflow, Due) # row-wise lesser of Inflow and Due
      }
      # collected; tibble() names each column after the variable passed in
      tibble(
        Inflow,
        Due,
        Cover_due
      ) |>
        as.matrix()
    }) |>
      set_names(matL)
  })
}
createBucketMap()
```
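
The same idea can be pushed through to the remaining columns. The snippet below is only a sketch under the column definitions given in the question's comments: `fill_bucket()` is a hypothetical helper name, not something from the original post or from any package, and it uses a plain base R loop for the cumulative shortfall columns because each row depends on the previous row's "End_cum_shortfall". The inputs are the `mat_One` values from the question.

```r
# Hypothetical helper (not from the original post): fills every column of one
# bucket from its Inflow and Due vectors, following the question's comments.
fill_bucket <- function(inflow, due) {
  n <- length(inflow)
  cover_due <- pmin(inflow, due)   # row-wise lesser of Inflow and Due
  shortfall <- due - cover_due
  surplus   <- inflow - cover_due  # inflow left over after covering Due

  begin_cum <- numeric(n)
  cover_cum <- numeric(n)
  end_cum   <- numeric(n)
  prev_end  <- 0                   # row 1 starts with no cumulative shortfall
  for (i in seq_len(n)) {
    begin_cum[i] <- prev_end
    cover_cum[i] <- min(surplus[i], begin_cum[i])
    end_cum[i]   <- shortfall[i] + begin_cum[i] - cover_cum[i]
    prev_end     <- end_cum[i]
  }

  # cbind() with named arguments returns a numeric matrix with column names
  cbind(
    Inflow = inflow, Due = due, Cover_due = cover_due, Shortfall = shortfall,
    Begin_cum_shortfall = begin_cum, Cover_cum_sfall = cover_cum,
    End_cum_shortfall = end_cum, Outflow = surplus - cover_cum
  )
}

# mat_One inputs taken from the question
alc  <- c(0.6, 0.5, 0.4)
rate <- 0.10
result <- fill_bucket(c(6, 5, 600) * alc, c(1000, 900, 800) * alc * rate)
result
```

Inside `createBucketMap()`, the body of the inner `map()` could then presumably reduce to `fill_bucket(flowVector * alc, balVector * alc * rate)`, with no `as.matrix()` call needed since the helper already returns a matrix.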

huangapple
  • Posted on June 12, 2023 at 23:12:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76458016.html