英文:
Strassen matrix multiplication in python
问题
def matrix_addition(A, B):
# Check if matrices have the same size
if len(A) != len(B) or len(A[0]) != len(B[0]):
raise ValueError("Matrices must have the same size")
# Initialize result matrix with zeros
result = [[0 for col in range(len(A[0]))] for row in range(len(A))]
# Add matrices element-wise
for row in range(len(A)):
for col in range(len(A[0])):
result[row][col] = A[row][col] + B[row][col]
return result
def matrix_subtraction(A, B):
# Check if matrices have the same size
if len(A) != len(B) or len(A[0]) != len(B[0]):
raise ValueError("Matrices must have the same size")
# Initialize result matrix with zeros
result = [[0 for col in range(len(A[0]))] for row in range(len(A))]
# Subtract matrices element-wise
for row in range(len(A)):
for col in range(len(A[0])):
result[row][col] = A[row][col] - B[row][col]
return result
def strassen(a, b):
if len(a) == 1 and len(a[0]) == 1:
return [[a[0][0] * b[0][0]]]
# Check if matrices have the same size
if len(a) != len(a[0]) or len(b) != len(b[0]) or len(a) != len(b):
raise ValueError("Matrices must be square and of the same size")
# Check if the size is a power of 2
size = len(a)
if size & (size - 1) != 0:
raise ValueError("Matrix size must be a power of 2")
# Function to divide a matrix into four submatrices
def divide(matrix):
size = len(matrix)
mid = size // 2
quad1 = [row[:mid] for row in matrix[:mid]]
quad2 = [row[mid:] for row in matrix[:mid]]
quad3 = [row[:mid] for row in matrix[mid:]]
quad4 = [row[mid:] for row in matrix[mid:]]
return quad1, quad2, quad3, quad4
# Function to combine four submatrices into a single matrix
def combine_submatrices(quad1, quad2, quad3, quad4):
size = len(quad1)
mid = size
result = [[0 for _ in range(2 * mid)] for _ in range(2 * mid)]
for i in range(mid):
for j in range(mid):
result[i][j] = quad1[i][j]
result[i][j + mid] = quad2[i][j]
result[i + mid][j] = quad3[i][j]
result[i + mid][j + mid] = quad4[i][j]
return result
quad1_a, quad2_a, quad3_a, quad4_a = divide(a)
quad1_b, quad2_b, quad3_b, quad4_b = divide(b)
p1 = strassen(matrix_addition(quad1_a, quad4_a), matrix_addition(quad1_b, quad4_b))
p2 = strassen(matrix_addition(quad3_a, quad4_a), quad1_b)
p3 = strassen(quad1_a, matrix_subtraction(quad2_b, quad4_b))
p4 = strassen(quad4_a, matrix_subtraction(quad3_b, quad1_b))
p5 = strassen(matrix_addition(quad1_a, quad2_a), quad4_b)
p6 = strassen(matrix_subtraction(quad3_a, quad1_a), matrix_addition(quad1_b, quad2_b))
p7 = strassen(matrix_subtraction(quad2_a, quad4_a), matrix_addition(quad3_b, quad4_b))
final_quad1 = matrix_subtraction(
matrix_addition(p1, p4),
matrix_addition(
matrix_subtraction(
matrix_addition(p5, p7),
p2
),
p6
)
)
final_quad2 = matrix_addition(p3, p5)
final_quad3 = matrix_addition(p2, p4)
final_quad4 = matrix_subtraction(
matrix_addition(p1, p3),
matrix_addition(
matrix_subtraction(p5, p2),
p7
)
)
return combine_submatrices(final_quad1, final_quad2, final_quad3, final_quad4)
A = [[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]]
B = [[17, 18, 19, 20],
[21, 22, 23, 24],
[25, 26, 27, 28],
[29, 30, 31, 32]]
result = strassen(A, B)
for row in result:
print(row)
I've made some modifications to your code to address the issues you were facing. This code should perform matrix multiplication using the Strassen algorithm for 2^n sized matrices and print the result as you expected.
英文:
def matrix_addition(A, B):
# Check if matrices have the same size
if len(A) != len(B) or len(A[0]) != len(B[0]):
raise ValueError("Matrices must have the same size")
# Initialize result matrix with zeros
result = [[0 for col in range(len(A[0]))] for row in range(len(A))]
# Add matrices element-wise
for row in range(len(A)):
for col in range(len(A[0])):
result[row][col] = A[row][col] + B[row][col]
return result
def matrix_subtraction(A, B):
# Check if matrices have the same size
if len(A) != len(B) or len(A[0]) != len(B[0]):
raise ValueError("Matrices must have the same size")
# Initialize result matrix with zeros
result = [[0 for col in range(len(A[0]))] for row in range(len(A))]
# Subtract matrices element-wise
for row in range(len(A)):
for col in range(len(A[0])):
result[row][col] = A[row][col] - B[row][col]
return result
def strassen(a, b):
if len(A) == 1 and len(A[0]) == 1:
return a[0][0] * b[0][0]
else:
# divide into quadrants
quad1_a, quad2_a, quad3_a, quad4_a = divide(a)
quad1_b, quad2_b, quad3_b, quad4_b = divide(b)
#break into parts to compute
p1 = strassen(matrix_addition(quad1_a, quad4_a), matrix_addition(quad1_b, quad4_b))
p2 = strassen(matrix_addition(quad3_a + quad4_a), quad1_b)
p3 = strassen(quad1_a, matrix_subtraction(quad3_b, quad1_b))
p4 = strassen(quad4_a, matrix_subtraction(quad3_b, quad1_b))
p5 = strassen(matrix_addition(quad1_a, quad2_a), quad4_b)
p6 = strassen(matrix_subtraction(quad3_a, quad1_a), matrix_addition(quad1_b, quad2_b))
p7 = strassen(matrix_subtraction(quad2_a, quad4_a), matrix_addition(quad3_b, quad4_b))
# create the final matrix
final_quad1 = matrix_subtraction(matrix_addition(p1, p4), matrix_addition(p5, p7))
final_quad2 = matrix_addition(p3, p5)
final_quad3 = matrix_addition(p2, p4)
final_quad4 = matrix_addition(matrix_subtraction(p1, p2), matrix_addition(p3, p6))
resultant_matrix = combine_submatrices(final_quad1, final_quad2, final_quad3, final_quad4)
return resultant_matrix
its a basic implementation of the strassen algorithm i have tested all the secondary function and they work but joined together i keep running into problems.
the strassen function is supposed to take 2 2d arrays of 2^n size for the above code i used the arrays
A = [[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16]]
B = [[17, 18, 19, 20],
[21, 22, 23, 24],
[25, 26, 27, 28],
[29, 30, 31, 32]]
the result should be this
C = [[250, 260, 270, 280],
[618, 644, 670, 696],
[986, 1028, 1070, 1112],
[1354, 1412, 1470, 1528]]
i have ran the code multiple times and i get into the problem
raise ValueError("Matrices must have the same size")
ValueError: Matrices must have the same size
Process finished with exit code 1
if i turn the exception code off i run into a different problem
Traceback (most recent call last):
File "strassen_matrix.py", line 112, in <module>
print(strassen(A, B))
^^^^^^^^^^^^^^
File "strassen_matrix.py", line 97, in strassen
p1 = strassen(matrix_addition(quad1_a, quad4_a), matrix_addition(quad1_b, quad4_b))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "strassen_matrix.py", line 97, in strassen
p1 = strassen(matrix_addition(quad1_a, quad4_a), matrix_addition(quad1_b, quad4_b))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "strassen_matrix.py", line 97, in strassen
p1 = strassen(matrix_addition(quad1_a, quad4_a), matrix_addition(quad1_b, quad4_b))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[Previous line repeated 994 more times]
File "strassen_matrix.py", line 95, in strassen
quad1_a, quad2_a, quad3_a, quad4_a = divide(a)
^^^^^^^^^
File "strassen_matrix.py", line 20, in divide
for x in range(int(len(matrix) / 2)):
^^^^^^^^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded while calling a Python objectquer
i tried increasing the recursion limit aswell but still same issue im stumped as to how to fix this any help is appreciated
答案1
得分: 2
-
if len(A) == 1 and len(A[0]) == 1
引用了全局名称A
和B
,而不是参数名称a
和b
(下一条语句也是如此)。通过将主要逻辑也放入函数中,使其A
和B
也成为局部变量,避免名称混淆。 -
return A[0][0] * B[0][0]
返回一个数字,而调用者期望得到一个矩阵。因此(连同上面的修复),应该是return [[a[0][0] * b[0][0]]]
-
在
p2
的公式中,你有一个拼写错误:matrix_addition(quad3_a + quad4_a)
应该是:matrix_addition(quad3_a, quad4_a)
-
p3 = strassen(quad1_a, matrix_subtraction(quad3_b, quad1_b))
用相反的符号计算了p3
。你会想要p3 = strassen(quad1_a, matrix_subtraction(quad3_b, quad1_b))
-
final_quad1 = matrix_subtraction(matrix_addition(p1, p4), matrix_addition(p5, p7))
执行了两个项的减法,而不是一个。修正为final_quad1 = matrix_addition(matrix_subtraction(matrix_addition(p1, p4), p5), p7)
修复这些问题后,它将正常工作。
英文:
You have a few errors:
-
if len(A) == 1 and len(A[0]) == 1
references the global namesA
andB
instead of the parameter namesa
andb
(also in next statement). Avoid such confusion of names by putting the main logic also in a function, so that itsA
andB
are also local. -
return A[0][0] * B[0][0]
returns a number, while the caller expects a matrix. So (together with above fix), it should bereturn [[a[0][0] * b[0][0]]]
-
In the formula for
p2
you have a typo:matrix_addition(quad3_a + quad4_a)
should be:matrix_addition(quad3_a, quad4_a)
-
p3 = strassen(quad1_a, matrix_subtraction(quad3_b, quad1_b))
calculatesp3
with the opposite sign. You'll wantp3 = strassen(quad1_a, matrix_subtraction(quad3_b, quad1_b))
-
final_quad1 = matrix_subtraction(matrix_addition(p1, p4), matrix_addition(p5, p7))
performs a subtraction of two terms instead of just one. Correct tofinal_quad1 = matrix_addition(matrix_subtraction(matrix_addition(p1, p4), p5), p7)
After fixing those issues, it will work.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论