Are these the same distributions in the statsmodels and scipy libraries?

Question

I want to build a GLM using statsmodels in Python. statsmodels.glm supports the following distributions:

  • <s>One-parameter exponential family</s>
  • Gamma
  • Gaussian
  • Inverse Gaussian
  • Binomial
  • Negative Binomial
  • Poisson
  • Tweedie

I want to find the same distributions in the scipy library. I think these are the same:

statsmodels (docs)   scipy (docs)
<s>Family</s>        <s>expon</s>
Binomial             binom
Gamma                gamma
Gaussian             norm
InverseGaussian      invgauss
NegativeBinomial     nbinom
Poisson              poisson
Tweedie              tweedie

<s>I am not entirely sure if "expon" is exactly the same as "Family". How could I check if this is the case?</s> Also, I think it is wise to check them all, because I am determining the best-fitting distribution with the scipy package and then using that distribution in the statsmodels GLM.

Update: I thought "Family" was a distribution (in my defense, the table is quite unclear).

Answer 1

Score: 1

Family is the parent class and not a specific distribution family.

The families for GLM in statsmodels correspond to distributions in scipy.stats (except maybe for tweedie). However, the GLM families are parameterized differently from the common parameterization of the distributions, such as the one used in scipy.
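
As a rough illustration of such a difference, the sketch below uses only scipy and assumes the usual GLM view of the Gamma family: a mean mu and a dispersion phi, with variance phi * mu**2. The helper name gamma_from_mean_dispersion is hypothetical and not part of either library.

```python
from scipy import stats


def gamma_from_mean_dispersion(mu, phi):
    """Build a scipy gamma distribution from a GLM-style mean and dispersion.

    Assumption: the GLM Gamma family has mean mu and variance phi * mu**2.
    scipy's gamma(a, scale) has mean a * scale and variance a * scale**2,
    so a = 1 / phi and scale = mu * phi reproduce both moments.
    """
    shape = 1.0 / phi
    scale = mu * phi
    return stats.gamma(a=shape, scale=scale)


mu, phi = 3.0, 0.5
gamma_dist = gamma_from_mean_dispersion(mu, phi)

# Compare scipy's moments with the GLM-style mean and variance.
print(gamma_dist.mean(), mu)
print(gamma_dist.var(), phi * mu**2)
```

The same moment-matching check can be repeated for any other family before a distribution fitted with scipy is carried over to a GLM.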

In the latest release, GLM and most discrete models have a get_distribution method that returns an instance of the corresponding scipy or scipy-compatible distribution.

For example,
https://www.statsmodels.org/devel/_modules/statsmodels/genmod/families/family.html#Binomial.get_distribution
has the same parameterization as scipy, whereas in
https://www.statsmodels.org/devel/_modules/statsmodels/genmod/families/family.html#NegativeBinomial.get_distribution
the GLM parameters need to be transformed to correspond to the scipy distribution.
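
A minimal sketch of that kind of transformation, using only scipy: it assumes the GLM negative binomial family is described by a mean mu and a dispersion parameter alpha with variance mu + alpha * mu**2, and maps these to the (n, p) arguments of scipy.stats.nbinom. The helper name nbinom_from_glm is hypothetical; the linked source is the authority on what get_distribution actually does.

```python
from scipy import stats


def nbinom_from_glm(mu, alpha):
    """Map a GLM-style (mu, alpha) pair to scipy.stats.nbinom(n, p).

    Assumption: the GLM variance is mu + alpha * mu**2.
    scipy's nbinom(n, p) has mean n * (1 - p) / p and variance mean / p,
    so n = 1 / alpha and p = n / (n + mu) reproduce both moments.
    """
    n = 1.0 / alpha
    p = n / (n + mu)
    return stats.nbinom(n, p)


mu, alpha = 4.0, 0.5
nb_dist = nbinom_from_glm(mu, alpha)

# Compare scipy's moments with the GLM-style mean and variance.
print(nb_dist.mean(), mu)
print(nb_dist.var(), mu + alpha * mu**2)
```

For the binomial case no such mapping is needed, since the number of trials and the success probability are passed to scipy.stats.binom directly.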

huangapple
  • Posted on May 10, 2023, 23:12:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76220082.html