MATLAB: Processing specific columns of a LARGE matrix using another matrix as the column keys/indices
Question
I have a matrix A with 6 rows and 40 columns, filled with random numbers.
I used nchoosek(1:40, 6) to create a matrix B whose rows are all possible combinations of 6 column indices; for example, one row is (1,2,3,4,5,6) and another is (1,2,3,4,5,40). B is 3838380 x 6.
However, B contains only indices; what I actually need are the values of A that those indices select. For example, the 4th row of B is
[1 2 3 4 5 9].
For each of the 3838380 rows of B, I need to build a matrix C from A. C uses that row of B as column indices into A, producing a 6x6 matrix:
[1 2 3 4 5 9] --> [0.4178 0.6562 -0.8633 0.7979 -0.9162 0.8720
0.4864 0.0149 -0.8301 -0.2927 -0.7153 -0.2214
0.7994 -0.2677 -0.8633 -0.7596 -0.8468 -0.7657
-0.8695 -0.5467 -0.1804 0.1382 0.4811 -0.5192
-0.3282 0.0697 -0.7532 0.7501 -0.0869 0.3698
-0.9913 -0.4210 -0.1140 -0.3029 0.3365 0.6785].
C is used to solve for x in Cx = d, where d is a 6x1 vector.
I am using a for loop to do this 3838380 times:
for j = 1:size(B,1)
    C = A(:, B(j,:));
    x = C\d;  % d is a 6x1 vector, so x is a 6x1 vector as well
end
This currently takes roughly 3 minutes. Is there a faster or vectorized way, perhaps by creating a 3-dimensional matrix that is 6 x 3838380 x 6 (one 6x6 matrix per row of B)? I would still need to process each 6x6 matrix individually and store the returned vector in another matrix.
Failure / What do I do? I tried to create the 6 x 3838380 x 6 matrix in one go with C = A(:,B). That went well neither for my computer nor for my brain: I cannot see how breaking up the huge matrix afterwards would be any different from what I was doing before.
I feel there should be a way to do this without for loops.
Cheers!
Answer 1
Score: 1
This can be vectorized by building a 3D array of size 6×6×3838380 whose "pages" are the individual 6×6 matrices, and then calling pagemldivide (introduced in R2022a) to solve all the linear systems at once:
% Example data
A = rand(6, 40);
d = rand(size(A,1), 1);
B = nchoosek(1:size(A,2), size(A,1));   % one row per combination of column indices
% Non-vectorized approach
tic
x = NaN(size(A,1), size(B,1));          % preallocate: one solution per column
for jj = 1:size(B,1)
    C = A(:, B(jj,:));
    x(:, jj) = C\d;
end
toc
% Vectorized approach
tic
CC = reshape(A(:, B.'), size(A,1), size(A,1), []);   % 6x6 pages, one per row of B
xx = permute(pagemldivide(CC, d), [1 3 2]);          % 6x1xN result -> 6xN
toc
% Check
isequal(x, xx)
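To see why the reshape in the vectorized approach yields the right pages, here is a minimal sketch on a deliberately small case. The 3-by-5 matrix built from magic(5) and the variable pagesMatch are illustrative assumptions, not part of the original answer.

```matlab
% Small illustrative case: 3 rows, 5 columns, so nchoosek gives 10 index rows.
A = magic(5);
A = A(1:3, :);                          % 3-by-5 example matrix (values are arbitrary)
B = nchoosek(1:size(A,2), size(A,1));   % 10-by-3 matrix of column indices

% A(:, B.') is 3-by-30: a matrix subscript is read in column-major order,
% and the columns of B.' are exactly the rows of B.
CC = reshape(A(:, B.'), size(A,1), size(A,1), []);   % 3-by-3-by-10

% Every page k should equal the matrix the loop builds for row k of B.
pagesMatch = true;
for k = 1:size(B,1)
    pagesMatch = pagesMatch && isequal(CC(:,:,k), A(:, B(k,:)));
end
disp(pagesMatch)
```

The transpose matters here: without it, A(:, B) would walk down the columns of B, and the pages would mix indices from different combinations.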
The vectorized version does seem to be faster. I got these results in R2022b (using MATLAB Online):
Elapsed time is 10.310424 seconds.
Elapsed time is 0.812846 seconds.
ans =
logical
1
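If memory is a concern, as the question suggests, the same idea can probably be applied block by block. The full 6×6×3838380 array of doubles takes about 1.1 GB (6*6*3838380*8 bytes), so the sketch below builds the pages for only a slice of B at a time. The block size chunkSize, the fixed seed, and the residual-based spot check are assumptions added for illustration; they are not part of the original answer, and the speed of this variant has not been measured.

```matlab
rng(0);                                  % fixed seed so the sketch is repeatable
A = rand(6, 40);
d = rand(size(A,1), 1);
B = nchoosek(1:size(A,2), size(A,1));    % 3838380-by-6 index matrix

n = size(A,1);
numSys = size(B,1);
chunkSize = 500000;                      % hypothetical block size; tune to your memory
xx = zeros(n, numSys);                   % one solution per column

for first = 1:chunkSize:numSys
    last = min(first + chunkSize - 1, numSys);
    % Pages for this block only: n-by-n-by-(last-first+1)
    CC = reshape(A(:, B(first:last, :).'), n, n, []);
    % pagemldivide returns n-by-1-by-m here; flatten it to n-by-m
    xx(:, first:last) = reshape(pagemldivide(CC, d), n, []);
end

% Spot-check a few systems through their scaled residuals
spot = [1, round(numSys/2), numSys];
maxScaledResidual = 0;
for j = spot
    C = A(:, B(j,:));
    r = norm(C*xx(:,j) - d) / (norm(C)*norm(xx(:,j)) + norm(d));
    maxScaledResidual = max(maxScaledResidual, r);
end
disp(maxScaledResidual)
```

The check uses residuals rather than isequal because two floating-point solvers are not guaranteed to agree bit for bit, even though they did in the run shown above; a small scaled residual is a more portable criterion.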