为什么使用枢轴选取代码会导致LU分解失败?

huangapple go评论52阅读模式
英文:

Why does pivoting code bring failure to my LU decomposition?

问题

Sure, here's the translated content without the code parts:

Before speaking, I am writing this with a translator, not an English speaker. Therefore, I ask for your understanding even if my English is terrible.

After learning how to write a non-pivoting LU-decomposition code, I decided to try to write an LU-decomposition code that uses pivoting for homework. Since my non-pivoting code went well as below, so I thought it would be easy to make a pivoting LU-decomposition code. But it went bad, really bad...

My code is following. Please let me know where this code fails. I don't know why such a serious failure occurs even though I only put in the pivot part.

I tested with a 4x4 matrix A like this, and the answer comes like this. Really bad.

Tell me which part is the problem. All kinds of help are welcome. I will thank you in advance.

英文:

Before speaking, I am writing this with a translator, not an English speaker. Therefore, I ask for your understanding even if my English is terrible.

After learning how to write a non-pivoting LU-decomposition code, I decided to try to write a LU-decomposition code that uses pivoting for homework.
Since my non-pivoting code went well as below, so I thought it would be easy to make a pivoting LU-decomposition code.

But it went bad, really bad...

% non-pivoting code
function x = mylu_np(A)
    [m,~] = size(A);
    L = eye(m); U = A(:,:);
    for i = 1:m-1
        for j = i+1:m
            L(j,i) = U(j,i)/U(i,i);
            U(j,i) = 0;
            U(j,i+1:m) = U(j,i+1:m) - L(j,i)*U(i,i+1:m);
        end
    end
    x = [L U];
end

My code is following. Please let me know where this code fails. I don't know why such a serious failure occurs even though I only put in the pivot part.

function x = mylu_pp(A)
    [m,~] = size(A);
    P = eye(m); L = eye(m); U = A(:,:);
    for i = 1:m-1
        for j = i+1:m
            if abs(U(i,i)) < abs(U(j,i))
                U([i j],:) = U([j i],:);
                P([i j],:) = P([j i],:);
            end
        end
        for j = i+1:m
            L(j,i) = U(j,i)/U(i,i);
            U(j,i) = 0;
            U(j,i+1:m) = U(j,i+1:m) - L(j,i)*U(i,i+1:m);
        end
    end
    x = [P L U];
end

I tested with 4*4 matrix A like this

A = [0.0001 -5.0300 5.8090 7.8320;
    2.2660 1.9950 1.2120 8.0080;
    8.8500 5.6810 4.5520 1.3020;
    6.7750 -2.2530 2.9080 3.9700];
W = mylu_pp(A)
P = W(1:4,1:4); L = W(1:4,5:8); U = W(1:4,9:12);
P*A - L*U

and the answer comes like this. Really bad.

ans = [0.0000 0.0000 0.0000 0.0000;
    6.7749 4.3489 3.4847 0.9967;
    -2.2659 -7.0250 -1.6521 2.1754
    -4.5090 2.6761 -1.8326 -3.1721]

Tell me which part is the problem. All kinds of help are welcome. I will thank you in advance.

答案1

得分: 2

The issue is related to the matrix L. When swapping rows i and j in matrix U, we also have to swap the elements in the rows i and j of matrix L, but only the elements that were updated earlier (not the entire rows):

if i > 1
    L([i j], 1:i-1) = L([j i], 1:i-1);
end

The way I am thinking about it is equivalent to renaming the variables in the equation system. Assume we have the equation system: A*v = b. Using LU decomposition, we are solving the equation as: L*U*v = b.

Assume:

[x
 v =  y
      z
      w]

After swapping two rows in matrix U, we have to keep equivalence to the original equation system. The way to keep equivalence is renaming the elements in v.

When swapping row 2 with row 4, we may swap the names of y and w and get new v:

[x
 v0 =  w
      z
      y]

The equivalence to name swapping is:

  • Swapping the two rows in matrix U.
  • Swapping the two rows in matrix P.
    P*A reorders the rows of A to the new "swapped order".
  • We also have to swap the elements in the rows of matrix L, but only the elements that were updated earlier (not the entire row). Swapping the relevant elements in rows 2 and 4 of L follows the equivalence of swapping the names of y and w.

Code sample:

A = [0.0001 -5.0300 5.8090 7.8320;
     2.2660  1.9950 1.2120 8.0080;
     8.8500  5.6810 4.5520 1.3020;
     6.7750 -2.2530 2.9080 3.9700];
    
[P, L, U] = mylu_pp(A);
P*A - L*U

function [P, L, U] = mylu_pp(A)
    m = size(A, 1);
    P = eye(m); L = eye(m); U = A;
    for i = 1:m-1
        for j = i+1:m
            if abs(U(i,i)) < abs(U(j,i))
                U([i j],:) = U([j i],:);
                P([i j],:) = P([j i],:);
                    
                if i > 1
                    % Swap the relevant elements "1:i-1" in the row of matrix L (only the elements that were updated until now).
                    L([i j], 1:i-1) = L([j i], 1:i-1);
                end
            end
        end
        for j = i+1:m
            L(j,i) = U(j,i)/U(i,i);
            U(j,i) = 0;
            U(j,i+1:m) = U(j,i+1:m) - L(j,i)*U(i,i+1:m);
        end
    end
    %x = [P L U];
end

Output P*A - L*U:

ans =
    
     0     0     0     0
     0     0     0     0
     0     0     0     0
     0     0     0     0
英文:

The issue is related to the matrix L.

When swapping rows i and j in matrix U, we also have to swap the elements in the rows i and j of matrix L, but only the elements that were updated earlier (not the entire rows):

if i &gt; 1
    L([i j], 1:i-1) = L([j i], 1:i-1);
end

The way I am thinking about it is equivalent to renaming the variables in the equation system.
I am not a mathematician, and my explanation is not going to be very good...

Assume we have the equation system: A*v = b.
Using LU decomposition we are solving the equation as: L*U*v = b.

Assume:

    [x
v =  y
     z
     w]

After swapping two rows in matrix U, we have to keep equivalence to the original equation system.
The way of keep equivalence is renaming the elements in v.

When swapping row 2 with row 4, we may swap the names of y and w, and get new v:

     [x
v0 =  w
      z
      y]

The equivalence to names swapping is:

  • Swapping the two rows in matrix U.
  • Swapping the two rows in matrix P.
    P*A reorder the rows of A to the new "swapped order".
  • We also have to swap the elements in the rows of matrix L, but only the elements that were updated earlier (not the entire row).
    Swapping the relevant elements in rows 2 and 4 of L follows the equivalence of swapping the names of y and w.

Code sample:

A = [0.0001 -5.0300 5.8090 7.8320;
     2.2660  1.9950 1.2120 8.0080;
     8.8500  5.6810 4.5520 1.3020;
     6.7750 -2.2530 2.9080 3.9700];

[P, L, U] = mylu_pp(A);
P*A - L*U

function [P, L, U] = mylu_pp(A)
    m = size(A, 1);
    P = eye(m); L = eye(m); U = A;
    for i = 1:m-1
        for j = i+1:m
            if abs(U(i,i)) &lt; abs(U(j,i))
                U([i j],:) = U([j i],:);
                P([i j],:) = P([j i],:);
                
                if i &gt; 1
                    % Swap the relevant elements &quot;1:i-1&quot; in the row of matrix L (only the elements that were updated until now).
                    L([i j], 1:i-1) = L([j i], 1:i-1);
                end
            end
        end
        for j = i+1:m
            L(j,i) = U(j,i)/U(i,i);
            U(j,i) = 0;
            U(j,i+1:m) = U(j,i+1:m) - L(j,i)*U(i,i+1:m);
        end
    end
    %x = [P L U];
end

Output P*A - L*U:

<!-- language: lang-none -->

ans =

     0     0     0     0
     0     0     0     0
     0     0     0     0
     0     0     0     0

huangapple
  • 本文由 发表于 2023年4月4日 11:30:12
  • 转载请务必保留本文链接:https://go.coder-hub.com/75925305.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定