MATLAB: Processing specific columns of a LARGE matrix using another matrix as the column keys/indices

Question

I have a matrix A with 6 rows and 40 columns, populated with random numbers.
I used nchoosek(1:40, 6) to create a matrix B whose rows are all possible combinations of 6 column indices; for example, one row is (1,2,3,4,5,6) and another is (1,2,3,4,5,40). The size of B is 3838380 x 6.

However, B contains only indices; what I need are the actual values of A that those indices select. For example, the 4th row of B is
[1 2 3 4 5 9].

For each of the 3838380 rows of B, I need to build a matrix C from A. The row of B is used as the column indices into A, producing a 6x6 matrix:

[1 2 3 4 5 9] -->
[ 0.4178    0.6562   -0.8633    0.7979   -0.9162    0.8720
  0.4864    0.0149   -0.8301   -0.2927   -0.7153   -0.2214
  0.7994   -0.2677   -0.8633   -0.7596   -0.8468   -0.7657
 -0.8695   -0.5467   -0.1804    0.1382    0.4811   -0.5192
 -0.3282    0.0697   -0.7532    0.7501   -0.0869    0.3698
 -0.9913   -0.4210   -0.1140   -0.3029    0.3365    0.6785 ]

C is used to solve for x in Cx = d, where d is a 6x1 vector.

I am using a for loop to do this 3838380 times:

for j = 1:size(B,1)       % one iteration per row of B
    C = A(:,B(j,:));      % 6x6 matrix built from the selected columns of A
    x = C\d;              % d is a 6x1 vector, so x is a 6x1 vector as well
end

It currently takes roughly 3 minutes to do this. I want to know if there is a faster or vectorized way, perhaps creating a 3-dimensional matrix that is 6 x 3838380 x 6 (every entry is a 6x6 matrix)? I would still need to process each 6x6 matrix individually and store the returned vector to another matrix.

Failure / What do I do? I actually tried to create the 6 x 3838380 x 6 matrix, using C = A(:,B) entirely. It did not go well with my computer, nor did it go well with my brain -- I'm having trouble figuring out how breaking up the huge matrix would be any different from what I was doing before.

I feel that there is a way to do this without for loops.

Cheers!

Answer 1

Score: 1

This can be vectorized by creating a 3D array of size 6×6×3838380 containing all the 6×6 matrices as "pages", and then using pagemldivide (introduced in R2022a) to solve all the linear systems at once:

% Example data
A = rand(6, 40);
d = rand(size(A,1), 1);
B = nchoosek(1:size(A,2), size(A,1));

% Non-vectorized approach
tic
x = NaN(size(A,1), size(B,1)); % initialize
for jj = 1:size(B,1)           % one iteration per row of B
    C = A(:, B(jj,:));
    x(:, jj) = C\d;
end
toc

% Vectorized approach
tic
CC = reshape(A(:, B.'), size(A,1), size(A,1), []);
xx = permute(pagemldivide(CC, d), [1 3 2]);
toc

% Check
isequal(x, xx)
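
The transpose in A(:, B.') is what makes the reshape line up. A matrix used as a subscript is read in column-major order, so B.' lists row 1 of B first, then row 2, and so on, and every block of 6 consecutive columns becomes one page. The sketch below checks this on a deliberately tiny case; the 2-by-4 matrix and its values are made up for illustration and are not from the question.

```matlab
% Tiny stand-in for the 6-by-40 problem (sizes and values are illustrative)
A = reshape(1:8, 2, 4);                   % 2 rows, 4 columns
B = nchoosek(1:size(A,2), size(A,1));     % 6-by-2 matrix of column-index pairs
n = size(A,1);

% B.' is read column by column, i.e. row 1 of B, then row 2, ...
flat = A(:, B.');                         % plain 2D matrix, n-by-(n*size(B,1))
CC   = reshape(flat, n, n, []);           % each run of n columns becomes a page

% Every page should equal the matrix the original loop would build
ok = true;
for k = 1:size(B,1)
    ok = ok && isequal(CC(:,:,k), A(:, B(k,:)));
end
disp(ok)
```

Using A(:, B) without the transpose would read B column by column instead, so the pages would mix indices from different rows of B.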

The vectorized version is indeed faster in this test. I got these results in R2022b (using MATLAB Online):

Elapsed time is 10.310424 seconds.
Elapsed time is 0.812846 seconds.
ans =
  logical
   1
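
One possible refinement, not part of the measurements above: the full 6×6×3838380 array holds about 138 million doubles, roughly 1.1 GB, which may be too much on some machines. A minimal sketch of processing B in chunks is shown below, assuming pagemldivide is available (R2022a or later). The variable chunkSize and its value are hypothetical, and the data is smaller than in the question so that the check runs quickly. The comparison uses a relative error rather than isequal, since exact floating-point equality between the two approaches is not guaranteed in general.

```matlab
% Chunked variant (sketch); sizes are reduced for a quick check
rng(0);
A = rand(4, 12);
d = rand(size(A,1), 1);
B = nchoosek(1:size(A,2), size(A,1));     % 495-by-4 here
n = size(A,1);
N = size(B,1);

chunkSize = 100;                          % hypothetical; tune to available memory
xx = NaN(n, N);                           % one solution per column
for first = 1:chunkSize:N
    last = min(first + chunkSize - 1, N);
    idx  = B(first:last, :).';            % transpose so each row of B stays together
    CC   = reshape(A(:, idx), n, n, []);  % n-by-n-by-(rows in this chunk)
    xx(:, first:last) = reshape(pagemldivide(CC, d), n, []);
end

% Reference solution from the plain loop, compared with a tolerance
x = NaN(n, N);
for k = 1:N
    x(:, k) = A(:, B(k,:)) \ d;
end
relErr = norm(x - xx, 'fro') / norm(x, 'fro');
disp(relErr)
```

Larger chunks reduce loop overhead and smaller chunks reduce peak memory, so the best value depends on the machine.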
