How to create a function in R that calls a name only when a threshold is reached?
Question
I have a variable that increases by 1 unit every day (let's call it cumulative date). It is calculated for days 1 to 10. I want to create a second variable called phase. The phases are "phase1", "phase2", and "phase3", and they are reached when cumulative date is at least 2.6, 6.3, and 8.3, respectively. I want to create a function in R that does this, but the trick is that I want each phase to be added only at the point where its criterion is met.
This is my code:
library(dplyr)  # provides case_when()

create_phase <- function(cumulative_date) {
  phase <- case_when(
    cumulative_date >= 8.3 ~ "phase3",
    cumulative_date >= 6.3 ~ "phase2",
    cumulative_date >= 2.6 ~ "phase1"
  )
  return(phase)
}

cumulative_date <- 1:10
phase <- cbind(cumulative_date, create_phase(cumulative_date))
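This is the result, which is close to what I need: every day from day 3 onward gets a label (days 3 to 6 are "phase1", days 7 and 8 are "phase2", days 9 and 10 are "phase3").
But what I really need is this: each phase label should appear only on the first day its threshold is reached (days 3, 7, and 9), with NA on all other days.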
Answer 1
Score: 0
You can group your data by phase and turn every entry of phase except the first one in each group into NA:
library(dplyr)

# Dummy data
df1 <- tibble(date = 1:10)

df1 %>%
  mutate(phase = case_when(date >= 8.3 ~ "phase3",
                           date >= 6.3 ~ "phase2",
                           date >= 2.6 ~ "phase1")) %>%
  group_by(phase) %>%
  # keep the label on the first row of each group, NA on the rest
  mutate(phase = c(phase[1], rep(NA, n() - 1)))
Result:
# A tibble: 10 × 2
# Groups:   phase [4]
    date phase
   <int> <chr>
 1     1 NA
 2     2 NA
 3     3 phase1
 4     4 NA
 5     5 NA
 6     6 NA
 7     7 phase2
 8     8 NA
 9     9 phase3
10    10 NA
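
For comparison, the same "keep only the first entry of each phase" idea can be sketched in base R without grouping. This is only an illustration under stated assumptions: label_first_phase is a hypothetical helper name, the thresholds are copied from the question, and the input is assumed to be non-decreasing (as a cumulative variable would be), so each label forms one contiguous run.

```r
# Hypothetical helper (not from the answer above): label only the first
# row of each phase using base R.
label_first_phase <- function(x) {
  phase <- rep(NA_character_, length(x))
  # Later assignments overwrite earlier ones, so the highest threshold wins
  phase[x >= 2.6] <- "phase1"
  phase[x >= 6.3] <- "phase2"
  phase[x >= 8.3] <- "phase3"
  # duplicated() is TRUE for every repeat of a value already seen,
  # so this blanks all but the first occurrence of each label
  phase[duplicated(phase)] <- NA
  phase
}

result <- data.frame(date = 1:10, phase = label_first_phase(1:10))
print(result)
```

Because duplicated() looks at the whole vector, a label is kept only the first time it ever appears. If the variable could drop below a threshold and rise again, the second crossing would not be labeled.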
Answer 2
Score: 0
You might consider using lag() and checking when the value crosses each threshold, labeling the phase only at that point.
library(tidyverse)

create_phase <- function(cumulative_date) {
  phase <- case_when(
    lag(cumulative_date) < 8.3 & cumulative_date >= 8.3 ~ "phase3",
    lag(cumulative_date) < 6.3 & cumulative_date >= 6.3 ~ "phase2",
    lag(cumulative_date) < 2.6 & cumulative_date >= 2.6 ~ "phase1"
  )
  return(phase)
}

cumulative_date <- 1:10
data.frame(cumulative_date, phase = create_phase(cumulative_date))
Output
   cumulative_date  phase
1                1   <NA>
2                2   <NA>
3                3 phase1
4                4   <NA>
5                5   <NA>
6                6   <NA>
7                7 phase2
8                8   <NA>
9                9 phase3
10              10   <NA>
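
The crossing check above can also be written once for any set of thresholds. The following is a rough sketch in base R rather than part of the answer: create_phase_at_crossing and its thresholds and labels arguments are hypothetical names, and the thresholds are assumed to be sorted in increasing order. Unlike the lag() version, it treats the (nonexistent) value before the first observation as below every threshold, so a series that already starts above a threshold gets a label on its first row.

```r
# Hypothetical generalization: label a row only when it reaches a new threshold.
create_phase_at_crossing <- function(x,
                                     thresholds = c(2.6, 6.3, 8.3),
                                     labels = c("phase1", "phase2", "phase3")) {
  # findInterval() counts how many thresholds each value has reached (0 = none)
  idx <- findInterval(x, thresholds)
  # Count for the previous row; the row before the first is assumed to be 0
  prev <- c(0L, head(idx, -1))
  crossed <- idx > prev
  phase <- rep(NA_character_, length(x))
  # If one step jumps past several thresholds, the highest one reached is used
  phase[crossed] <- labels[idx[crossed]]
  phase
}

cumulative_date <- 1:10
print(data.frame(cumulative_date, phase = create_phase_at_crossing(cumulative_date)))
```

For days 1 to 10 this labels the same rows as the lag() version (days 3, 7, and 9). The two differ only on the first row, where lag() returns NA and so never produces a label.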