2023-05-10 19:06:20

Applying Bayesian Changepoint Detection algorithm using bcp to a grouped data frame in R

Question


I have a grouped data frame, and I would like to apply the bcp function to each group to get, for each point, the posterior probability of a change at that point.

My data looks as follows:

# Install pacman if it is not already available
if (!require("pacman", character.only = TRUE)) {
  install.packages("pacman")
}
pacman::p_load(bcp, tidyverse)
df <- data.frame(
  date = c(seq(Sys.Date(), by = -1, length.out = 1000), seq(Sys.Date(), by = -1, length.out = 1000)),
  value = c(rnorm(200, mean = 20, sd = 1), rnorm(800, mean = 17, sd = 2), rnorm(400, mean = 200, sd = 3), rnorm(600, mean = 150, sd = 4)),
  product = c(rep("A", 1000), rep("B", 1000))
)

If I filter df to a single product, assign the result to a new variable, and apply bcp(), it returns a list of 12 elements:

x <- df %>%
  filter(product == "A")
y <- bcp(x$value)

I've tried using group_map, which returns only two columns. That is not ideal, and I have no idea why only two columns are returned:

df %>%
  group_by(product) %>%
  group_map(~ bcp(.x$value))
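
A note on what this call returns: group_map() gives back a plain list with one element per group, so two products produce a list of two fitted bcp objects, which may be what looked like "two columns". The sketch below is only an illustration under assumptions: it uses a smaller stand-in data set (100 rows per product) and an arbitrary seed, neither of which is from the question.

```r
library(dplyr)
library(bcp)

set.seed(42)  # assumed seed, only to make the sketch reproducible

# Smaller stand-in for the question's data (sizes are an assumption)
df <- data.frame(
  value   = c(rnorm(50, mean = 20), rnorm(50, mean = 17),
              rnorm(50, mean = 200), rnorm(50, mean = 150)),
  product = rep(c("A", "B"), each = 100)
)

grouped <- df %>% group_by(product)

# One bcp fit per group, returned as an unnamed list
fits <- grouped %>% group_map(~ bcp(.x$value))

# Name each fit after its group key so the results are easy to tell apart
names(fits) <- group_keys(grouped)$product

length(fits)                    # one element per product
class(fits$A)                   # each element is a fitted bcp object
length(fits$A$posterior.prob)   # one probability per row of product A
```

Each list element holds the full fit, so posterior.prob still has to be pulled out of every element before it can be attached to the data.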

I've also tried group_modify, but I can't get the syntax right to extract the correct fields:

df %>%
  group_by(product) %>%
  group_modify(~ {
    bcp::bcp(.x$value) %>%
      tibble::enframe(name = "name", value = "value")
  })

As well as:

df %>%
  group_by(product) %>%
  group_modify(~ bcp::bcp(.x$value) %>%
    pluck("posterior.prob"))

Any guidance on how to append 'posterior.prob' from the bcp function to my original df on a per-group basis would be greatly appreciated.
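
For reference, group_modify() expects the function to return a data frame for each group, so returning a bare bcp object or a bare vector will not satisfy it. A minimal sketch of one way the group_modify attempt might be adapted is to return the group's own rows with the probabilities added as a column. The smaller data set, the seed, and the column name posterior_prob are assumptions made for illustration, and as.numeric() is only a precaution in case posterior.prob does not come back as a plain vector.

```r
library(dplyr)
library(bcp)

set.seed(42)  # assumed seed, only to make the sketch reproducible

# Smaller stand-in for the question's data (sizes are an assumption)
df <- data.frame(
  value   = c(rnorm(50, mean = 20), rnorm(50, mean = 17),
              rnorm(50, mean = 200), rnorm(50, mean = 150)),
  product = rep(c("A", "B"), each = 100)
)

result <- df %>%
  group_by(product) %>%
  # Return the group's rows plus one new column, which is a data frame
  group_modify(~ mutate(.x, posterior_prob = as.numeric(bcp(value)$posterior.prob))) %>%
  ungroup()

head(result)
```

group_modify() puts the grouping column back in front of each returned data frame, so result has product, value, and posterior_prob.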

Answer 1

Score: 1

I'm not familiar with the bcp package, but does this give you what you want?

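# Fit bcp per product, then pull posterior.prob out of each fit by name
posterior_prob <- df %>%
  group_by(product) %>%
  group_map(~ bcp(.x$value)) %>%
  map("posterior.prob") %>%
  unlist()
# Relies on the rows of df already being ordered by product (A, then B)
df$posterior_prob_var <- posterior_prob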
head(df)
#         date    value product posterior_prob_var
# 1 2023-05-10 21.90542       A              0.002
# 2 2023-05-09 19.61293       A              0.000
# 3 2023-05-08 20.46336       A              0.002
# 4 2023-05-07 21.22534       A              0.000
# 5 2023-05-06 19.37578       A              0.000
# 6 2023-05-05 18.94408       A              0.002
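
The approach above assigns the results back by position, so it depends on the rows already being sorted by product. A possible alternative, sketched here under assumptions, is a grouped mutate(), which keeps each probability on its own row even when the products are interleaved. The data set below is a small made-up one with alternating products, and the step column, the seed, and the name posterior_prob are all hypothetical.

```r
library(dplyr)
library(bcp)

set.seed(42)  # assumed seed, only to make the sketch reproducible

# Two short series with one mean shift each (assumed sizes and means)
a <- c(rnorm(50, mean = 20),  rnorm(50, mean = 17))
b <- c(rnorm(50, mean = 200), rnorm(50, mean = 150))

# Interleave the rows: A, B, A, B, ... while keeping time order per product
df <- data.frame(
  step    = rep(1:100, each = 2),
  product = rep(c("A", "B"), times = 100),
  value   = as.vector(rbind(a, b))
)

result <- df %>%
  group_by(product) %>%
  # Inside a grouped mutate, `value` is just the current product's series
  mutate(posterior_prob = as.numeric(bcp(value)$posterior.prob)) %>%
  ungroup()

head(result)
```

Because mutate() never reorders rows, result keeps the original row order of df with the new column attached.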

