英文:
Fast matrix operations in R
问题
- 我想要构建一个类型如下的矩阵
X
:
N = 1000
A = seq(1, N, 1)
B = A
X = A %*% t(rep(1,N)) - rep(1,N) %*% t(B)
处理这个矩阵 X
速度有点慢:对于较大的 N
,需要大量内存,并且对其进行矩阵乘法操作似乎没有充分利用线性结构,这可能会有所帮助(?)。
因此,我想要避免这种情况,问题是:在需要计算 X
时,最快的计算方法是什么,可能不需要创建矩阵 A %*% t(rep(1,N))
和 rep(1,N) %*% t(B)
?
- 目前,不同类型的矩阵乘法的最快方法是什么?有一些新的包,据说速度很快,但似乎基本的 R 仍然占主导地位?
library(microbenchmark)
A = array(rnorm(10^4), c(100,100))
B = array(rnorm(10^4), c(100,100))
microbenchmark(fastmatrix::hadamard(A,B), A*B, Rfast::mat.mult(A,B), A %*%B, Rfast::Crossprod(A,B), t(A)%*%B)
Unit: microseconds
expr min lq mean median uq max neval
fastmatrix::hadamard(A, B) 174.626 264.2185 303.55052 277.3510 301.7825 1052.063 100
A * B 3.711 29.9485 28.81842 31.5185 33.9670 73.038 100
Rfast::mat.mult(A, B) 760.087 839.6020 918.76155 891.4790 944.5455 1356.274 100
A %*% B 500.756 536.6595 602.09226 565.5765 601.5310 1502.310 100
Rfast::Crossprod(A, B) 716.953 786.3320 854.55307 813.4165 874.3930 1589.097 100
t(A) %*% B 520.544 585.2830 666.63137 631.3350 703.5430 1322.987 100
英文:
A couple of simple questions:
- I want to build a matrix
X
of the following type:
N = 1000
A = seq(1, N, 1)
B = A
X = A %*% t(rep(1,N)) - rep(1,N) %*% t(B)
Working with this matrix X
is kind of slow: it takes a lot of memory for large N
, and doing matrix multiplications for it does not use the linear structure, which I suppose could be helpful(?).
Therefore, I want to avoid this, and the question is: what is the fastest way of calculating X
(whenever I need it), possibly without creating the matrices A %*% t(rep(1,N))
and rep(1,N) %*% t(B)
?
- What's the fastest approach for different sorts of matrix multiplications nowadays? There are some new packages, which are supposedly fast, but it seems that base R still dominates?
library(microbenchmark)
A = array(rnorm(10^4), c(100,100))
B = array(rnorm(10^4), c(100,100))
microbenchmark(fastmatrix::hadamard(A,B), A*B,Rfast::mat.mult(A,B), A %*%B, Rfast::Crossprod(A,B), t(A)%*%B)
Unit: microseconds
expr min lq mean median uq max neval
fastmatrix::hadamard(A, B) 174.626 264.2185 303.55052 277.3510 301.7825 1052.063 100
A * B 3.711 29.9485 28.81842 31.5185 33.9670 73.038 100
Rfast::mat.mult(A, B) 760.087 839.6020 918.76155 891.4790 944.5455 1356.274 100
A %*% B 500.756 536.6595 602.09226 565.5765 601.5310 1502.310 100
Rfast::Crossprod(A, B) 716.953 786.3320 854.55307 813.4165 874.3930 1589.097 100
t(A) %*% B 520.544 585.2830 666.63137 631.3350 703.5430 1322.987 100
答案1
得分: 2
以下是翻译好的部分:
To answer your first question:
NN <- c(N, N)
Y <- .row(NN) - .col(NN)
identical(`storage.mode<-`(Y, "double"), X)
## [1] TRUE
Probably not much slower or faster than your code (depending on your BLAS implementation), but certainly more transparent and typically more efficient, giving a result of type integer
, not double
. The "gotcha" here is that BLAS routines want double
, so computing with the result can require a coercion ...
To answer your second question, note that Y' = -Y
, hence Y
is "skew-symmetric":
identical(t(Y), -Y)
## [1] TRUE
Depending on what you are trying to do, you can take advantage of the fact that (IIRC) skew-symmetric matrices can be factorized as LDL'
. Whether there is a package implementing that factorization (using it to solve linear systems, etc.) remains to be seen, but I am doubtful.
If what you are trying to do is just matrix multiplication, then I would not expect skew-symmetry to help at all, unless you are wanting to write specialized Fortran routines of your own, modeled after DSYMM
...
英文:
To answer your first question:
NN <- c(N, N)
Y <- .row(NN) - .col(NN)
identical(`storage.mode<-`(Y, "double"), X)
## [1] TRUE
Probably not much slower or faster than your code (depending on your BLAS implementation), but certainly more transparent and typically more efficient, giving a result of type integer
, not double
. The "gotcha" here is that BLAS routines want double
, so computing with the result can require a coercion ...
To answer your second question, note that Y' = -Y
, hence Y
is "skew-symmetric":
identical(t(Y), -Y)
## [1] TRUE
Depending on what you are trying to do, you can take advantage of the fact that (IIRC) skew-symmetric matrices can be factorized as LDL'
. Whether there is a package implementing that factorization (using it to solve linear systems, etc.) remains to be seen, but I am doubtful.
If what you are trying to do is just matrix multiplication, then I would not expect skew-symmetry to help at all, unless you are wanting to write specialized Fortran routines of your own, modeled after DSYMM
...
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论