How can I create a function in R that assigns a name only when a threshold is reached?

Question

I have a variable that increases by 1 unit every day (let's call it cumulative date), calculated for days 1 to 10. I want to create a second variable called phase, with the values "phase1", "phase2", and "phase3". These phases are reached when cumulative date is at least 2.6, 6.3, and 8.3, respectively. I want to write a function in R that does this, but the trick is that I want each phase to be added only when its criterion is met.

This is my code:

library(dplyr)  # provides case_when()

create_phase <- function(cumulative_date) {
  # case_when() uses the first condition that matches, so the largest threshold comes first
  phase <- case_when(
    cumulative_date >= 8.3 ~ "phase3",
    cumulative_date >= 6.3 ~ "phase2",
    cumulative_date >= 2.6 ~ "phase1"
  )
  return(phase)
}

cumulative_date <- 1:10
phase <- cbind(cumulative_date, create_phase(cumulative_date))

This is the result, which is close to what I need:

[Image: output of the code above]

But what I really need is this:

[Image: desired output]
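For reference, both outputs can be reproduced in base R. This is a sketch inferred from the code above and the stated goal, not a copy of the original screenshots; nested `ifelse()` stands in for `case_when()` so no package is needed, and `current` and `desired` are illustrative names.

```r
cumulative_date <- 1:10

# Stand-in for the case_when() call above: every day at or past a
# threshold is labelled, so each label repeats until the next phase starts
current <- ifelse(cumulative_date >= 8.3, "phase3",
           ifelse(cumulative_date >= 6.3, "phase2",
           ifelse(cumulative_date >= 2.6, "phase1", NA_character_)))

# Assumed target: each label appears once, on the first day its threshold is met
desired <- c(NA, NA, "phase1", NA, NA, NA, "phase2", NA, "phase3", NA)

print(data.frame(cumulative_date, current, desired))
```

The only difference between the two columns is that `desired` keeps the first occurrence of each label and leaves the repeats as `NA`.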

Answer 1

Score: 0

You can group your data by phase and set every entry except the first in each group to NA:

library(dplyr)

# Dummy data
df1 <- tibble(date = 1:10)

df1 %>%
  mutate(phase = case_when(date >= 8.3 ~ "phase3",
                           date >= 6.3 ~ "phase2",
                           date >= 2.6 ~ "phase1")) %>%
  group_by(phase) %>%
  # keep the first entry of each phase group and set the rest to NA
  mutate(phase = c(phase[1], rep(NA, n() - 1)))

Result:

# A tibble: 10 × 2
# Groups:   phase [4]
    date phase 
   <int> <chr> 
 1     1 NA    
 2     2 NA    
 3     3 phase1
 4     4 NA    
 5     5 NA    
 6     6 NA    
 7     7 phase2
 8     8 NA    
 9     9 phase3
10    10 NA 
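
The same idea (keep only the first row of each phase) can also be sketched in base R without grouping. This is an illustrative alternative, not part of the answer above: `cut()` replaces `case_when()`, and the names `breaks`, `labels`, and `first_only` are assumptions made for this example.

```r
date <- 1:10
breaks <- c(2.6, 6.3, 8.3, Inf)
labels <- c("phase1", "phase2", "phase3")

# right = FALSE makes the bins [2.6, 6.3), [6.3, 8.3), [8.3, Inf);
# values below 2.6 fall outside every bin and become NA
phase <- as.character(cut(date, breaks = breaks, labels = labels, right = FALSE))

# duplicated() flags every repeat of a label already seen, so only the first stays
first_only <- replace(phase, duplicated(phase), NA)

print(data.frame(date, phase = first_only))
```

Because `duplicated()` looks at the whole vector, a phase is labelled only the first time it ever appears, even if the value later drops below a threshold and rises again.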

Answer 2

Score: 0

Perhaps you could use lag() and check when the value crosses each threshold to label the phase.

library(tidyverse)

create_phase <- function(cumulative_date) {
  # a phase is labelled only on the row where the previous value was still
  # below the threshold and the current value has reached it
  phase <- case_when(
    lag(cumulative_date) < 8.3 & cumulative_date >= 8.3 ~ "phase3",
    lag(cumulative_date) < 6.3 & cumulative_date >= 6.3 ~ "phase2",
    lag(cumulative_date) < 2.6 & cumulative_date >= 2.6 ~ "phase1"
  )
  return(phase)
}

cumulative_date <- 1:10

data.frame(cumulative_date, phase = create_phase(cumulative_date))

Output

   cumulative_date  phase
1                1   <NA>
2                2   <NA>
3                3 phase1
4                4   <NA>
5                5   <NA>
6                6   <NA>
7                7 phase2
8                8   <NA>
9                9 phase3
10              10   <NA>
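
One caveat: the `lag()` version leaves the first element unlabelled even if it already exceeds a threshold, because `lag()` returns `NA` there. A base R sketch of the same crossing check that treats the start as "no threshold reached yet" might look like this. The function name `label_on_crossing` and its arguments are hypothetical, and it assumes `thresholds` is sorted in increasing order and `x` has no missing values.

```r
label_on_crossing <- function(x,
                              thresholds = c(2.6, 6.3, 8.3),
                              labels = c("phase1", "phase2", "phase3")) {
  # findInterval() counts how many thresholds each value has reached (0 = none)
  idx <- findInterval(x, thresholds)
  # count for the previous element; the first element is compared against 0
  prev <- c(0L, head(idx, -1))
  out <- rep(NA_character_, length(x))
  # label only where the count went up compared with the previous element
  crossed <- idx > prev
  out[crossed] <- labels[idx[crossed]]
  out
}

cumulative_date <- 1:10
print(data.frame(cumulative_date, phase = label_on_crossing(cumulative_date)))
```

If a single step jumps over two thresholds, this sketch labels the row with the highest phase reached and skips the intermediate one; whether that is the right behaviour depends on the data.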

huangapple
  • Posted on May 25, 2023 at 03:40:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76326906.html