February 27, 2023 08:10:23

MATLAB: Processing specific columns of a LARGE matrix using another matrix as the column keys/indices

Question

I have a matrix A which is 6 rows x 40 columns. It is populated with random numbers.
I used nchoosek(1:40, 6) to create a matrix B holding the indices of every possible combination of 6 columns; for example, one row is (1,2,3,4,5,6) and another is (1,2,3,4,5,40). The size of this matrix is 3838380 x 6.

However, matrix B only gives the indices. What I actually want are the values of A, looked up using the values in B. For example, the 4th row of B is
[1 2 3 4 5 9].

For all 3838380 rows of matrix B, I need to create a matrix C using A. This new matrix C uses a row of B as indices to retrieve the columns of A, producing a 6x6 matrix:

[1     2     3     4     5     9] --> [0.4178    0.6562   -0.8633    0.7979   -0.9162    0.8720
    0.4864    0.0149   -0.8301   -0.2927   -0.7153   -0.2214
    0.7994   -0.2677   -0.8633   -0.7596   -0.8468   -0.7657
   -0.8695   -0.5467   -0.1804    0.1382    0.4811   -0.5192
   -0.3282    0.0697   -0.7532    0.7501   -0.0869    0.3698
   -0.9913   -0.4210   -0.1140   -0.3029    0.3365    0.6785].

C is used to solve for x in Cx = d, where d is a 6x1 vector.

I am using a For loop to do this 3838380 times:

for j = 1:size(B,1) % one iteration per row of B
    C = A(:,B(j,:)) % 6x6 matrix built from the columns listed in row j of B
    x = C\d % d is a 6x1 vector, so x is also a 6x1 vector
end

It currently takes roughly 3 minutes to do this. I want to know if there is a faster or vectorized way, perhaps creating a 3-dimensional matrix that is 6 x 3838380 x 6 (every entry is a 6x6 matrix)? I would still need to process each 6x6 matrix individually and store the returned vector to another matrix.

Failure / What do I do? I actually tried to create the 6 x 3838380 x 6 matrix, using C = A(:,B) entirely. It did not go well with my computer, nor did it go well with my brain -- I'm having trouble figuring out how breaking up the huge matrix would be any different from what I was doing before.

I feel that there is a way to do this without for loops.

Cheers!

Answer 1

Score: 1

This can be vectorized by creating a 3D array of size 6×6×3838380 containing all the 6×6 matrices as "pages", and then using pagemldivide (introduced in R2022a) to solve all the linear systems at once:

% Example data
A = rand(6, 40);
d = rand(size(A,1), 1);
B = nchoosek(1:size(A,2), size(A,1));

% Non-vectorized approach
tic
x = NaN(size(A,1), size(B,1)); % preallocate: one solution column per row of B
for jj = 1:size(B,1)
    C = A(:, B(jj,:));
    x(:, jj) = C\d;
end
toc

% Vectorized approach
tic
CC = reshape(A(:, B.'), size(A,1), size(A,1), []);
xx = permute(pagemldivide(CC, d), [1 3 2]);
toc

% Check
isequal(x, xx)
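
To see why the reshape step produces the right pages, here is a small sketch on a toy matrix. The 2x4 size and the use of magic are assumptions chosen only for readability; they are not part of the original problem. Indexing with B.' reads the index matrix column by column, which is row by row of B, so the per-row blocks end up side by side and reshape can stack them as pages.

```matlab
% Toy data (hypothetical sizes, chosen so every page is 2x2)
M = magic(4);
A = M(1:2, :);                          % 2x4 matrix
B = nchoosek(1:size(A,2), size(A,1));   % 6x2, one pair of column indices per row

% B.' is 2x6; MATLAB walks it column by column, so the selected columns of A
% come out in the order B(1,:), B(2,:), ..., B(6,:)
flat = A(:, B.');                       % 2x12, six 2x2 blocks side by side

% Stack the blocks along the third dimension, one block per page
CC = reshape(flat, size(A,1), size(A,1), []);   % 2x2x6

% Each page should equal the matrix the loop version builds for that row of B
for k = 1:size(B,1)
    assert(isequal(CC(:,:,k), A(:, B(k,:))));
end
```

Without the transpose, A(:, B) would walk B column by column instead, and the resulting blocks would no longer correspond to rows of B.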

The vectorized version does seem to be faster. I got these results in R2022b (using MATLAB Online):

Elapsed time is 10.310424 seconds.
Elapsed time is 0.812846 seconds.
ans =
  logical
   1
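
One thing to keep in mind: the full 6×6×3838380 array holds 36 × 3838380 doubles, which is roughly 1.1 GB. If that is too much for your machine, a possible variant is to process B in chunks. The sketch below is only an illustration under assumptions: the smaller problem size (12 columns instead of 40) and the chunk size of 200 are hypothetical values picked so it runs quickly, and it still needs pagemldivide (R2022a or later). It also compares the two results with a relative difference instead of isequal, since exact equality between two different solvers is not something I would rely on in general.

```matlab
% Example data (smaller than the original problem, for illustration only)
A = rand(6, 12);
d = rand(size(A,1), 1);
B = nchoosek(1:size(A,2), size(A,1));   % 924 x 6
n = size(A,1);

chunk = 200;                            % hypothetical chunk size; tune to available memory
xx = NaN(n, size(B,1));                 % preallocate: one solution column per row of B

for first = 1:chunk:size(B,1)
    last = min(first + chunk - 1, size(B,1));
    idx = B(first:last, :).';           % n x m index block, m = rows in this chunk
    CC = reshape(A(:, idx), n, n, []);  % n x n x m pages for this chunk only
    xx(:, first:last) = permute(pagemldivide(CC, d), [1 3 2]);
end

% Reference solution with the plain loop
x = NaN(n, size(B,1));
for jj = 1:size(B,1)
    x(:, jj) = A(:, B(jj,:)) \ d;
end

% Compare with a relative difference rather than exact equality
relDiff = norm(x - xx, 'fro') / norm(x, 'fro');
fprintf('Relative difference: %g\n', relDiff);
```

Larger chunks mean fewer loop iterations but more memory per iteration, so the chunk size is a trade-off you would have to measure on your own machine.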
