Implementing a particular approach to calculating a log-likelihood using matrix operations

Question

I came across a mathematical expression for the log-likelihood in a CrossValidated.com answer and am unclear how to implement it in R. I'm not sure whether SO can render MathML the way CV does, but this is the first equation in the second (not accepted) answer:

$$
\ell(\mu, \Sigma) = C - \frac{m}{2}\log|\Sigma| - \frac{1}{2} \sum_{i=1}^m \operatorname{tr}\left[(\mathbf{x}^{(i)}-\mu)^T \Sigma^{-1} (\mathbf{x}^{(i)}-\mu)\right]
$$

I'm focusing on the third term of that equation, and I don't think the trace operation is necessary, according to another answer on that page. I suppose I could look at one of the several implementations in the various packages that exist, but I suspect they use more economical approaches that don't clearly follow that equation's procedure, as @onyambu did in the answer here:
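For what it's worth, the reason the trace can be dropped from the third term: each summand $(\mathbf{x}^{(i)}-\mu)^T \Sigma^{-1} (\mathbf{x}^{(i)}-\mu)$ is a $1 \times 1$ matrix, and the trace of a $1 \times 1$ matrix is just its single entry:

$$
\operatorname{tr}\left[(\mathbf{x}^{(i)}-\mu)^T \Sigma^{-1} (\mathbf{x}^{(i)}-\mu)\right] = (\mathbf{x}^{(i)}-\mu)^T \Sigma^{-1} (\mathbf{x}^{(i)}-\mu).
$$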

I'm reusing code from an earlier SO example:

    library(MASS)
    # Make covariance matrix. See note above re the implications of using a correlation matrix.
    S = matrix(c(1.0, 0.2, 0.1, 0.35, 0.0,
                 0.2, 1.0, 0.0, 0.4, 0.0,
                 0.1, 0.0, 1.0, 0.0, 0.4,
                 0.35, 0.4, 0.0, 1.0, 0.6,
                 0.0, 0.0, 0.4, 0.6, 1.0), ncol = 5)
    colnames(S) = c("Y1", "X1", "X2", "Z1", "Z2")
    rownames(S) = colnames(S)
    # Make mean vector
    mus = c(1, 2, 3, 4, 5); names(mus) = colnames(S)
    # Generate 200 observations
    obs = mvrnorm(n = 200, mu = mus, Sigma = S)

This effort was in response to a question that has since been correctly answered, but not using a summation of a matrix expression. I think I can do it with a for-loop that creates the individual contribution of each data point:

    llmat.term3 <- matrix(NA, 200, 1)
    for (n in 1:200) {
      llmat.term3[n] <- t(obs[n, ] - mus) %*% solve(S) %*% (obs[n, ] - mus)
    }
    sum(llmat.term3)
    #[1] 982.7356

... but I'm wondering if there is a more compact matrix approach? Or, to fill in the gaps in my linear algebra knowledge, an explanation of why sum(u * solve(sig, u)) is the same as sum_{i=1}^{N} ( t(obs[i,] - mu) %*% S^-1 %*% (obs[i,] - mu) ).
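One way to see the equivalence being asked about (a sketch; let $u_i = \mathbf{x}^{(i)}-\mu$ and let $U$ be the $p \times m$ matrix whose $i$-th column is $u_i$, i.e. `u <- t(obs) - mus` in R):

$$
\sum_{i=1}^{m} u_i^T \Sigma^{-1} u_i
= \operatorname{tr}\!\Big(\Sigma^{-1} \sum_{i=1}^{m} u_i u_i^T\Big)
= \operatorname{tr}\left(\Sigma^{-1} U U^T\right)
= \sum_{j,k} U_{jk}\,\big(\Sigma^{-1} U\big)_{jk}.
$$

The first equality applies the cyclic property $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ to each scalar summand and uses linearity of the trace; the last equality writes the trace as the sum of the elementwise product of $U$ and $\Sigma^{-1}U$, which is exactly `sum(u * solve(S, u))`.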


Answer 1

Score: 1

In your code you have:

    S = matrix(c(1.0, 0.2, 0.1, 0.35, 0.0,
                 0.2, 1.0, 0.0, 0.4, 0.0,
                 0.1, 0.0, 1.0, 0.0, 0.4,
                 0.35, 0.4, 0.0, 1.0, 0.6,
                 0.0, 0.0, 0.4, 0.6, 1.0), ncol = 5)
    colnames(S) = c("Y1", "X1", "X2", "Z1", "Z2")
    rownames(S) = colnames(S)
    # Make mean vector
    mus = c(1, 2, 3, 4, 5); names(mus) = colnames(S)
    # Generate 200 observations
    set.seed(123)
    obs = MASS::mvrnorm(n = 200, mu = mus, Sigma = S)
    llmat.term3 <- matrix(NA, 200, 1)
    for (n in 1:200) {
      llmat.term3[n] <- t(obs[n, ] - mus) %*% solve(S) %*% (obs[n, ] - mus)
    }
    sum(llmat.term3)
    #[1] 982.7356

Compare to more compact approaches:

    u <- t(obs) - mus
    sum(diag(solve(S, tcrossprod(u))))
    #> [1] 982.7356
    sum(u * solve(S, u))
    #> [1] 982.7356

Though the two expressions give the same result, the first seems to be quicker than the second. I do not know why, since the first one computes an n * n matrix. The for-loop takes forever to compute.

    Unit: milliseconds
     expr       min        lq       mean    median        uq       max neval
        a 4532.6753 4679.4043 5470.94765 4815.1294 6061.3284 7789.5116    10
        b    2.8991    3.2693    3.73495    3.3675    3.7777    6.9719    10
        c    7.8176    8.5473   12.03060    9.2542   16.4089   20.1742    10

    set.seed(123)
    n <- 200000
    obs = MASS::mvrnorm(n = n, mu = mus, Sigma = S)
    u <- t(obs) - mus
    microbenchmark::microbenchmark(a = {
      llmat.term3 <- matrix(NA, n, 1)
      for (i in seq(n)) {
        llmat.term3[i] <- t(obs[i, ] - mus) %*% solve(S) %*% (obs[i, ] - mus)
      }
      sum(llmat.term3)
    },
    b = sum(diag(solve(S, tcrossprod(u)))),
    c = sum(u * solve(S, u)),
    check = 'equal', times = 10)

NB: it took me a while to find the seed you used. Next time, please include it in your data generation.
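The identity behind approaches b and c can also be checked numerically. Below is a minimal sketch in Python/NumPy (not R, so it can run outside the question's environment; the data here are made up for illustration and are not the question's) showing that the looped quadratic-form sum, the trace form, and the elementwise-product form all agree:

```python
import numpy as np

# Hypothetical small example: p variables, m observations.
rng = np.random.default_rng(0)
p, m = 3, 50
A = rng.normal(size=(p, p))
S = A @ A.T + p * np.eye(p)      # a positive-definite "covariance" matrix
mu = rng.normal(size=p)
obs = rng.normal(size=(m, p)) + mu

U = (obs - mu).T                 # p x m matrix; column i is obs[i] - mu
Sinv_U = np.linalg.solve(S, U)   # Sigma^{-1} U, without forming the inverse

# a: explicit loop over quadratic forms, mirroring the R for-loop
loop = sum((obs[i] - mu) @ np.linalg.solve(S, obs[i] - mu) for i in range(m))
# b: trace form, tr(Sigma^{-1} U U^T)
trace_form = np.trace(np.linalg.solve(S, U @ U.T))
# c: elementwise-product form, sum(U * Sigma^{-1} U)
elementwise = np.sum(U * Sinv_U)

print(np.allclose(loop, trace_form), np.allclose(loop, elementwise))
```

All three agree up to floating-point tolerance, which is what `check = 'equal'` verifies in the benchmark above.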

huangapple
  • Published 2023-02-18 04:07:48
  • Please retain this link when reposting: https://go.coder-hub.com/75488820.html