英文:
Numeric calculation of nullspace with Sympy is slow
问题
I want to obtain the bases for kernel (nullspace) of a matrix using the Row Reduced Echelon Form (RREF).
Although scipy has the function to calculate nullspace, it does not give me sparse vectors because it uses singular value decomposition (SVD).
# make an example
import random
import numpy as np
random.seed(1)
m=100
n=120
matrix = [1 if random.uniform(0, 1) > 0.9 else 0 for i in range (m*n)]
matrix =np.array (matrix).reshape((m, n))
# calculate nullspace using scipy
from scipy import linalg
ns2=linalg.null_space(matrix)
print (ns2) # This is not sparse, which is not what I want to obtain.
Instead, I used the function in sympy.
import sympy
ns2=sympy.Matrix(matrix).nullspace()
if ns2:
ns2=np.concatenate([np.array(vec, dtype=float) for vec in ns2], axis=1)
print (ns2)
However, this takes a lot of time, probably because Sympy is struggling to find the algebraic solution.
So, I tried to use lambdify, which converts SymPy expressions into NumPy functions. This can be wrapped with numba to make them faster, as suggested here: https://github.com/numba/numba/issues/4795
I tried to convert SymPy expressions into NumPy functions
f = sympy.lambdify(x, sympy.Matrix(np.array(x)).nullspace(), "numpy")
but this did not work.
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last) in ----> 1 f = sympy.lambdify(x, sympy.Matrix(np.array(x)).nullspace(), "numpy")
/opt/anaconda3/lib/python3.8/site-packages/sympy/matrices/dense.py in new(cls, *args, **kwargs) 297 298 def new(cls, *args, **kwargs): --> 299 return cls._new(*args, **kwargs) 300 301 @classmethod
/opt/anaconda3/lib/python3.8/site-packages/sympy/matrices/dense.py in _new(cls, copy, *args, **kwargs) 308 rows, cols, flat_list = args 309 else: --> 310 rows, cols, flat_list = cls._handle_creation_inputs(*args, **kwargs) 311 flat_list = list(flat_list) # create a shallow copy 312 self = object.new(cls)
/opt/anaconda3/lib/python3.8/site-packages/sympy/matrices/matrices.py in _handle_creation_inputs(cls, *args, **kwargs) 982 # Matrix(numpy.ones((2, 2))) 983 elif hasattr(args[0], "array"): --> 984 return cls._handle_ndarray(args[0]) 985 986 # Matrix([1, 2, 3]) or Matrix([[1, 2], [3, 4]])
/opt/anaconda3/lib/python3.8/site-packages/sympy/matrices/matrices.py in _handle_ndarray(cls, arg) 906 return arr.shape[0], 1, flat_list 907 else: --> 908 raise NotImplementedError( 909 "SymPy supports just 1D and 2D matrices") 910
NotImplementedError: SymPy supports just 1D and 2D matrices
Is there any way to speed up the calculation?
<details>
<summary>英文:</summary>
I want to obtain the bases for kernel (nullspace) of a matrix using the Row Reduced Echelon Form (RREF).
Although scipy has the function to calculate nullspace, it does not give me sparse vectors because it uses singular value decomposition (SVD).
#make an example
import random
import numpy as np
random.seed(1)
m=100
n=120
matrix = [1 if random.uniform(0, 1)>0.9 else 0 for i in range (m*n)]
matrix =np.array (matrix).reshape((m, n))
#calculate nullspace using scipy
from scipy import linalg
ns2=linalg.null_space(matrix)
print (ns2)#This is not sparse, which is not what I want to obtain.
Instead, I used the function in sympy.
import sympy
ns2=sympy.Matrix(matrix).nullspace()
if ns2:
ns2=np.concatenate([np.array(vec, dtype=float) for vec in ns2], axis=1)
print (ns2)
However, this takes a lot of time, probably because Sympy are struggling to find the algebraic solution.
So, I tried to use lambdify, which converts SymPy expressions into NumPy functions. This canbe wrapped with numba to make them faster, as suggested here. https://github.com/numba/numba/issues/4795
I tried to convert SymPy expressions into NumPy functions
f = sympy.lambdify(x, sympy.Matrix(np.array(x)).nullspace() , "numpy")
but this did not work.
NotImplementedError Traceback (most recent call last) in ----> 1 f = sympy.lambdify(x, sympy.Matrix(np.array(x)).nullspace() , "numpy")
/opt/anaconda3/lib/python3.8/site-packages/sympy/matrices/dense.py in new(cls, *args, **kwargs) 297 298 def new(cls, *args, **kwargs): --> 299 return cls._new(*args, **kwargs) 300 301 @classmethod
/opt/anaconda3/lib/python3.8/site-packages/sympy/matrices/dense.py in _new(cls, copy, *args, **kwargs) 308 rows, cols, flat_list = args 309 else: --> 310 rows, cols, flat_list = cls._handle_creation_inputs(*args, **kwargs) 311 flat_list = list(flat_list) # create a shallow copy 312 self = object.new(cls)
/opt/anaconda3/lib/python3.8/site-packages/sympy/matrices/matrices.py in _handle_creation_inputs(cls, *args, **kwargs) 982 # Matrix(numpy.ones((2, 2))) 983 elif hasattr(args[0], "array"): --> 984 return cls._handle_ndarray(args[0]) 985 986 # Matrix([1, 2, 3]) or Matrix([[1, 2], [3, 4]])
/opt/anaconda3/lib/python3.8/site-packages/sympy/matrices/matrices.py in _handle_ndarray(cls, arg) 906 return arr.shape[0], 1, flat_list 907 else: --> 908 raise NotImplementedError( 909 "SymPy supports just 1D and 2D matrices") 910
NotImplementedError: SymPy supports just 1D and 2D matrices
Is there any ways to speed up the calculation?
</details>
# 答案1
**得分**: 2
SymPy具有更快的矩阵表示(DomainMatrix),可以更快地执行此操作。将来,它将自动用于像这样的操作,但现在您需要显式使用它。它提供与`Matrix.nullspace`方法相同的输出,只是速度更快。
以下是使用较小矩阵演示DomainMatrix的示例:
```python
In [72]: import sympy
In [73]: M = sympy.randMatrix(10, 12, min=0, max=1, percent=90)
In [74]: M
Out[74]:
⎡1 0 1 1 0 1 1 0 0 0 0 1⎤
⎢ ⎥
⎢0 1 1 0 1 1 0 1 0 1 0 1⎥
⎢ ⎥
⎢0 0 0 0 0 0 0 1 0 0 1 0⎥
⎢ ⎥
⎢1 0 0 0 0 1 1 1 0 0 0 0⎥
⎢ ⎥
⎢1 0 1 0 1 0 1 0 1 0 1 0⎥
⎢ ⎥
⎢0 0 0 0 0 0 1 1 1 0 1 0⎥
⎢ ⎥
⎢0 0 1 0 1 0 1 1 0 0 0 0⎥
⎢ ⎥
⎢1 1 0 0 0 0 0 0 0 1 0 0⎥
⎢ ⎥
⎢1 1 0 0 1 1 0 0 1 0 1 0⎥
⎢ ⎥
⎣0 1 0 0 1 0 1 0 0 1 0 1⎦
In [75]: M.nullspace()
Out[75]:
⎡⎡-3/4 ⎤ ⎡1/4 ⎤⎤
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ 5/2 ⎥ ⎢3/2 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ 7/4 ⎥ ⎢3/4 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢-11/4⎥ ⎢-7/4⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ -2 ⎥ ⎢ -1 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ 1/2 ⎥ ⎢-1/2⎥⎥
⎢⎢ ⎥, ⎢ ⎥⎥
⎢⎢ 5/4 ⎥ ⎢1/4 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ -1 ⎥ ⎢ 0 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢-5/4 ⎥ ⎢-1/4⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢-7/4 ⎥ ⎢-7/4⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ 1 ⎥ ⎢ 0 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎣⎣ 0 ⎦ ⎣ 1 ⎦⎦
In [76]: from sympy.polys.matrices import DomainMatrix
In [77]: dM = DomainMatrix.from_Matrix(M)
In [78]: dM.to_field().nullspace().transpose().to_Matrix()
Out[78]:
⎡-3/4 1/4 ⎤
⎢ ⎥
⎢ 5/2 3/2 ⎥
⎢ ⎥
⎢ 7/4 3/4 ⎥
⎢ ⎥
⎢-11/4 -7/4⎥
⎢ ⎥
⎢ -2 -1 ⎥
⎢ ⎥
⎢ 1/2 -1/2⎥
⎢ ⎥
⎢ 5/4 1/4 ⎥
⎢ ⎥
⎢ -1 0 ⎥
⎢ ⎥
⎢-5/4 -1/4⎥
⎢ ⎥
⎢-7/4 -7/4⎥
⎢ ⎥
⎢ 1 0 ⎥
⎢ ⎥
⎣ 0 1 ⎦
计时比
英文:
SymPy has a faster representation of matrices (DomainMatrix) that can do this much more quickly. In future it will be used automatically for an operation like this but for now you would need to use it explicitly. It gives the same output that you would get from the Matrix.nullspace
method except a lot faster.
Here is a demonstration of using DomainMatrix with a smaller matrix:
In [72]: import sympy
In [73]: M = sympy.randMatrix(10, 12, min=0, max=1, percent=90)
In [74]: M
Out[74]:
⎡1 0 1 1 0 1 1 0 0 0 0 1⎤
⎢ ⎥
⎢0 1 1 0 1 1 0 1 0 1 0 1⎥
⎢ ⎥
⎢0 0 0 0 0 0 0 1 0 0 1 0⎥
⎢ ⎥
⎢1 0 0 0 0 1 1 1 0 0 0 0⎥
⎢ ⎥
⎢1 0 1 0 1 0 1 0 1 0 1 0⎥
⎢ ⎥
⎢0 0 0 0 0 0 1 1 1 0 1 0⎥
⎢ ⎥
⎢0 0 1 0 1 0 1 1 0 0 0 0⎥
⎢ ⎥
⎢1 1 0 0 0 0 0 0 0 1 0 0⎥
⎢ ⎥
⎢1 1 0 0 1 1 0 0 1 0 1 0⎥
⎢ ⎥
⎣0 1 0 0 1 0 1 0 0 1 0 1⎦
In [75]: M.nullspace()
Out[75]:
⎡⎡-3/4 ⎤ ⎡1/4 ⎤⎤
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ 5/2 ⎥ ⎢3/2 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ 7/4 ⎥ ⎢3/4 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢-11/4⎥ ⎢-7/4⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ -2 ⎥ ⎢ -1 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ 1/2 ⎥ ⎢-1/2⎥⎥
⎢⎢ ⎥, ⎢ ⎥⎥
⎢⎢ 5/4 ⎥ ⎢1/4 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ -1 ⎥ ⎢ 0 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢-5/4 ⎥ ⎢-1/4⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢-7/4 ⎥ ⎢-7/4⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎢⎢ 1 ⎥ ⎢ 0 ⎥⎥
⎢⎢ ⎥ ⎢ ⎥⎥
⎣⎣ 0 ⎦ ⎣ 1 ⎦⎦
In [76]: from sympy.polys.matrices import DomainMatrix
In [77]: dM = DomainMatrix.from_Matrix(M)
In [78]: dM.to_field().nullspace().transpose().to_Matrix()
Out[78]:
⎡-3/4 1/4 ⎤
⎢ ⎥
⎢ 5/2 3/2 ⎥
⎢ ⎥
⎢ 7/4 3/4 ⎥
⎢ ⎥
⎢-11/4 -7/4⎥
⎢ ⎥
⎢ -2 -1 ⎥
⎢ ⎥
⎢ 1/2 -1/2⎥
⎢ ⎥
⎢ 5/4 1/4 ⎥
⎢ ⎥
⎢ -1 0 ⎥
⎢ ⎥
⎢-5/4 -1/4⎥
⎢ ⎥
⎢-7/4 -7/4⎥
⎢ ⎥
⎢ 1 0 ⎥
⎢ ⎥
⎣ 0 1 ⎦
Timing comparisons:
In [89]: for m in [10, 20, 30, 40]:
...: M = sympy.randMatrix(m, int(m*1.2), min=0, max=1, percent=90)
...: dM = DomainMatrix.from_Matrix(M).to_field()
...: print()
...: print()
...: print('Size:', M.shape)
...: print()
...: print('DomainMatrix.nullspace():')
...: %time ok = dM.nullspace()
...: print()
...: print('Matrix.nullspace():')
...: %time ok = M.nullspace()
...:
Size: (10, 12)
DomainMatrix.nullspace():
CPU times: user 1.28 ms, sys: 37 µs, total: 1.32 ms
Wall time: 1.36 ms
Matrix.nullspace():
CPU times: user 30.7 ms, sys: 0 ns, total: 30.7 ms
Wall time: 30.3 ms
Size: (20, 24)
DomainMatrix.nullspace():
CPU times: user 1.88 ms, sys: 0 ns, total: 1.88 ms
Wall time: 1.89 ms
Matrix.nullspace():
CPU times: user 242 ms, sys: 3.94 ms, total: 246 ms
Wall time: 246 ms
Size: (30, 36)
DomainMatrix.nullspace():
CPU times: user 9.27 ms, sys: 0 ns, total: 9.27 ms
Wall time: 9.28 ms
Matrix.nullspace():
CPU times: user 1min 19s, sys: 133 ms, total: 1min 19s
Wall time: 1min 19s
Size: (40, 48)
DomainMatrix.nullspace():
CPU times: user 24.3 ms, sys: 4 µs, total: 24.3 ms
Wall time: 24.3 ms
Matrix.nullspace():
^C---------------------------------------------------------------------------
KeyboardInterrupt
I didn't wait for the last result here but you can see that for 30x36 the time difference is 9 milliseconds compared to 1 minute so it is much faster for larger matrices.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论