How to run Dirichlet Regression with a big data set in R?

Question

I would like to run a Dirichlet regression on a large data set using the DirichReg function from the DirichletReg package in R. I currently have a data.frame with 37 columns and ~13,000,000 rows.

However, running this model on all of my data instantly crashes R. I am using a Linux machine with 16 cores and 128 GB of memory. Even just cutting down my data to only 1000 points still causes R to almost immediately crash and restart.

Am I doing something wrong? Is there any way I can parallelize this operation to get this model to run?

I am running a model with the following syntax:

data.2 <- data

data.2$y_variable <- DR_data(data[,c(33:35)])

model <- DirichReg(y_variable ~ x_variable, data.2)

I have to create the y_variable in a separate data.2 data.frame, because running data$y_variable <- DR_data(data[,c(33:35)]) will crash R. I have no idea why this is.

Answer 1

Score: 1

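This is a bit of a guess as to why R is 'crashing', but if the cause is running out of RAM, you can update the table by reference with data.table rather than making a shallow copy of the entire data set:

library(data.table)
setDT(data)  # convert the data.frame to a data.table in place
data[, y := DR_data(data[, c(33:35)])]  # add the new column by reference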
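
To see what "by reference" means here, the following is a minimal sketch on a small made-up table. The table `toy` and its columns `a`, `b`, and `total` are hypothetical stand-ins, not the asker's data. It only illustrates the `:=` mechanism with a plain numeric column and does not check whether a `DR_data` object can be stored the same way.

```r
library(data.table)

# Hypothetical toy table standing in for the real 13-million-row data.
toy <- data.table(a = runif(5), b = runif(5))

# Record the memory address of the table before modifying it.
addr_before <- address(toy)

# := adds the column in place instead of building a modified copy.
toy[, total := a + b]

# Record the address again; if no copy of the table was made,
# the two addresses should match.
addr_after <- address(toy)

print(identical(addr_before, addr_after))
print(names(toy))
```

If the two addresses match, the column was added without copying the whole table, which is the memory saving the answer is relying on.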

huangapple
  • Posted on February 10, 2023 at 04:49:42
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75404296.html