How to create a function in R that assigns a name only when a threshold is reached?

Question

I have a variable that increases by 1 unit every day (let's call it cumulative date), calculated for days 1 to 10. I want to create a second variable called phase, with the values "phase1", "phase2", and "phase3". These phases are reached when cumulative date is at least 2.6, 6.3, and 8.3, respectively. I want to write a function in R that does this, but the trick is that each phase label should be added only on the day its criterion is first met.

This is my code:

    library(dplyr)  # case_when() comes from dplyr

    create_phase <- function(cumulative_date) {
      phase <- case_when(
        cumulative_date >= 8.3 ~ "phase3",
        cumulative_date >= 6.3 ~ "phase2",
        cumulative_date >= 2.6 ~ "phase1"
      )
      return(phase)
    }

    cumulative_date <- 1:10
    phase <- cbind(cumulative_date, create_phase(cumulative_date))

This is the result, which is close to what I need:

[Image: output of the code above]

But what I really need is this:

[Image: desired output]
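To make the gap between the two results concrete, here is a minimal sketch of what the function from the question returns for days 1 to 10, assuming dplyr is loaded so that `case_when()` is available. Each label repeats on every day after its threshold is reached, whereas the goal is to show it only on the first such day.

```r
library(dplyr)  # assumption: case_when() is taken from dplyr

# Same logic as the function in the question
create_phase <- function(cumulative_date) {
  case_when(
    cumulative_date >= 8.3 ~ "phase3",
    cumulative_date >= 6.3 ~ "phase2",
    cumulative_date >= 2.6 ~ "phase1"
  )
}

current <- create_phase(1:10)
# Days 1-2: NA, days 3-6: "phase1", days 7-8: "phase2", days 9-10: "phase3"
print(current)
```

The desired result would instead carry "phase1" on day 3 only, "phase2" on day 7 only, and "phase3" on day 9 only, with NA everywhere else.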

Answer 1

Score: 0

You can group your data by phase and turn every entry of phase except the first in each group into NA:

    library(dplyr)

    # Dummy data
    df1 <- tibble(date = 1:10)

    df1 %>%
      # Label every row with the highest phase whose threshold it has reached
      mutate(phase = case_when(date >= 8.3 ~ "phase3",
                               date >= 6.3 ~ "phase2",
                               date >= 2.6 ~ "phase1")) %>%
      group_by(phase) %>%
      # Within each phase, keep the label on the first row only
      mutate(phase = c(phase[1], rep(NA, n() - 1)))

Result:

    # A tibble: 10 × 2
    # Groups:   phase [4]
        date phase
       <int> <chr>
     1     1 NA
     2     2 NA
     3     3 phase1
     4     4 NA
     5     5 NA
     6     6 NA
     7     7 phase2
     8     8 NA
     9     9 phase3
    10    10 NA
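Since the question asks for a function that works on a plain vector, the same idea (keep only the first occurrence of each label) might be sketched without grouping by using base R's `duplicated()`. The function name `create_phase_first` is hypothetical and not part of the original answer.

```r
library(dplyr)  # assumption: case_when() is taken from dplyr

# Hypothetical helper: label only the first day each phase is reached
create_phase_first <- function(cumulative_date) {
  phase <- case_when(
    cumulative_date >= 8.3 ~ "phase3",
    cumulative_date >= 6.3 ~ "phase2",
    cumulative_date >= 2.6 ~ "phase1"
  )
  # duplicated() is TRUE for every repeat of an earlier value,
  # so only the first occurrence of each label survives
  phase[duplicated(phase)] <- NA_character_
  phase
}

data.frame(cumulative_date = 1:10, phase = create_phase_first(1:10))
```

For days 1 to 10 this gives "phase1" on day 3, "phase2" on day 7, "phase3" on day 9, and NA otherwise, which matches the result shown above.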

Answer 2

Score: 0

You might consider using lag() and checking when the value crosses each threshold to label the phase.

    library(tidyverse)

    create_phase <- function(cumulative_date) {
      # Label a row only when the previous value was below the threshold
      # and the current value has reached it
      phase <- case_when(
        lag(cumulative_date) < 8.3 & cumulative_date >= 8.3 ~ "phase3",
        lag(cumulative_date) < 6.3 & cumulative_date >= 6.3 ~ "phase2",
        lag(cumulative_date) < 2.6 & cumulative_date >= 2.6 ~ "phase1"
      )
      return(phase)
    }

    cumulative_date <- 1:10
    data.frame(cumulative_date, phase = create_phase(cumulative_date))

Output

       cumulative_date  phase
    1                1   <NA>
    2                2   <NA>
    3                3 phase1
    4                4   <NA>
    5                5   <NA>
    6                6   <NA>
    7                7 phase2
    8                8   <NA>
    9                9 phase3
    10              10   <NA>
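One edge case worth noting with the lag approach: `lag()` returns NA for the first element, so if the series already starts at or above a threshold, the comparison evaluates to NA and the first row gets no label. A possible workaround, sketched here under the assumption that the value before the first observation can be treated as `-Inf`, is to pass a `default` to `dplyr::lag()`. The function name `create_phase_lag` is hypothetical.

```r
library(dplyr)  # assumption: lag() and case_when() are taken from dplyr

# Hypothetical variant of the lag-based function
create_phase_lag <- function(cumulative_date) {
  # Convert to double so that -Inf is a valid default for lag()
  x <- as.numeric(cumulative_date)
  # Treat the value before the first observation as -Inf,
  # so a series that starts above a threshold still gets its label
  prev <- lag(x, default = -Inf)
  case_when(
    prev < 8.3 & x >= 8.3 ~ "phase3",
    prev < 6.3 & x >= 6.3 ~ "phase2",
    prev < 2.6 & x >= 2.6 ~ "phase1"
  )
}

# A series starting at day 3 is labelled "phase1" on its first row
create_phase_lag(3:10)
```

For the original input `1:10` this variant returns the same labels as the output above. If a single step jumps over two thresholds at once, only the higher phase is labelled, because `case_when()` uses the first matching condition.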

huangapple
  • Published on May 25, 2023 at 03:40:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76326906.html