How can I get the average number of days between changes in a variable over a period of time in R?

Question

I have a dataset that has date and another variable (bank rate). Here is a snippet of the data:

I want to calculate the average number of days between consecutive changes in the bank rate. For example, I would like output such as:

Essentially, I am trying to calculate the average number of days a rate stays in effect before it changes.

I can use the usual difftime() function, but I need it to calculate the difference only where the rate changes, and then average those differences. I am new to R and cannot figure out how to approach this.

Answer 1

Score: 1

I generated a random sequence of dates within the time frame above, paired them with the bank_rate values from above, and put both in a data frame (df).

The data frame is sorted by date. Rows that show no change in bank_rate are then removed with filter() (see the consecutive bank_rate values of 2). A new variable, days_from_before, is then created; it holds the number of days between each remaining date and the one before it.

The average is the mean of days_from_before.
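The filtering step can be looked at in isolation before the full example. This is a minimal base R sketch on made-up values (the `rate` vector is hypothetical, not the asker's data); it marks a row as a change when its value differs from the previous row, which is the same comparison that `filter(bank_rate != lag(bank_rate))` performs.

```r
# Hypothetical rates, assumed to be already sorted by date
rate <- c(1.5, 1.5, 2, 2, 2, 0.5)

# TRUE where the value differs from the previous one;
# the first row has no predecessor and is always kept
is_change <- c(TRUE, rate[-1] != rate[-length(rate)])

# Only the first row of each run of equal values survives
rate[is_change]
```

Each run of identical values collapses to its first row, so the dates that remain are the dates on which the rate changed.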

library(dplyr)

set.seed(123)
date <- sample(seq(as.Date("2018/02/07"), as.Date("2023/01/15"), by = "day"), 14)
bank_rate <- c(1.5, 1.5, rep(2, 6), 0.5, 1.25, 4.5, 4.5, 4.75, 4.75)

df <- data.frame(date, bank_rate)

df
#>          date bank_rate
#> 1  2019-03-28      1.50
#> 2  2019-05-15      1.50
#> 3  2018-08-04      2.00
#> 4  2019-07-17      2.00
#> 5  2018-08-20      2.00
#> 6  2020-09-01      2.00
#> 7  2021-03-24      2.00
#> 8  2021-09-21      2.00
#> 9  2021-07-13      0.50
#> 10 2021-07-28      1.25
#> 11 2020-12-10      4.50
#> 12 2021-12-05      4.50
#> 13 2019-12-03      4.75
#> 14 2019-10-01      4.75

ddf <- df |>
  # sort chronologically so lag() refers to the previous date
  arrange(date) |>
  # keep only rows whose rate differs from the previous row
  filter(bank_rate != dplyr::lag(bank_rate, default = 0)) |>
  mutate(
    # days since the previous change
    days_from_before = as.numeric(difftime(date, dplyr::lag(date))),
    # the first change has no predecessor, so its NA is set to 0
    days_from_before = ifelse(is.na(days_from_before), 0, days_from_before)
  )

ddf
#>          date bank_rate days_from_before
#> 1  2018-08-04      2.00                0
#> 2  2019-03-28      1.50              236
#> 3  2019-07-17      2.00              111
#> 4  2019-10-01      4.75               76
#> 5  2020-09-01      2.00              336
#> 6  2020-12-10      4.50              100
#> 7  2021-03-24      2.00              104
#> 8  2021-07-13      0.50              111
#> 9  2021-07-28      1.25               15
#> 10 2021-09-21      2.00               55
#> 11 2021-12-05      4.50               75

mean(ddf$days_from_before)
#> [1] 110.8182
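
One thing to be aware of: the mean above includes the 0 assigned to the first row, which has no earlier change to measure from. If only the gaps between actual changes should count, that row can be left out. The sketch below is one possible variant, not part of the original answer; it hard-codes the change dates shown in the `ddf` output so it does not depend on the random sample, and the names `changes`, `gaps`, `mean_with_zero` and `mean_gaps_only` are illustrative.

```r
# Change dates and rates copied from the `ddf` output above
changes <- data.frame(
  date = as.Date(c("2018-08-04", "2019-03-28", "2019-07-17", "2019-10-01",
                   "2020-09-01", "2020-12-10", "2021-03-24", "2021-07-13",
                   "2021-07-28", "2021-09-21", "2021-12-05")),
  bank_rate = c(2, 1.5, 2, 4.75, 2, 4.5, 2, 0.5, 1.25, 2, 4.5)
)

# diff() on Date values gives the gap between each pair of consecutive rows
gaps <- as.numeric(diff(changes$date), units = "days")

# Mean as computed in the answer: the first row contributes a 0
mean_with_zero <- mean(c(0, gaps))

# Mean over the 10 real gaps only
mean_gaps_only <- mean(gaps)
mean_gaps_only
#> [1] 121.9
```

With 11 change dates there are only 10 intervals, so dividing by 10 rather than 11 raises the average from about 110.8 to 121.9 days. Which figure is appropriate depends on whether the first observation should count as a change.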

huangapple
  • Posted on February 24, 2023 at 13:24:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75552884.html