Determining left-eigenvectors and eigenvalues of 100 x 100 matrix

Question
I am trying to determine the (real) left-eigenvectors and eigenvalues of a 100 by 100 matrix. The elements of the matrix, T, are probabilities from 0 to 1. Each row of T sums to 1.0 (100% probability); the columns do not have to sum to anything in particular.
The SciPy and NumPy libraries return complex numbers - apparently because the characteristic polynomial cannot in general be factored over the reals, or because the algorithms are optimised to work over the complex field. I turned to the SymPy library, which also returned complex numbers as the solution.
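For illustration, the behaviour can be reproduced with a small hypothetical row-stochastic stand-in for T (not the actual data): `scipy.linalg.eig` returns a complex-typed eigenvalue array even when every eigenvalue is numerically real.

```python
import numpy as np
from scipy.linalg import eig

# Hypothetical 3x3 row-stochastic matrix standing in for the
# 100x100 matrix T from the question (each row sums to 1.0)
T = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])

# left=True, right=False returns eigenvalues and left eigenvectors
w, vl = eig(T, left=True, right=False)

# The result dtype is complex even though the eigenvalues of this
# matrix (1.0, 0.8, 0.5) are real - the imaginary parts are ~0
print(np.iscomplexobj(w))
print(np.max(np.abs(w.imag)))
```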
Using SymPy as described here seemed to make sense:
https://stackoverflow.com/questions/65170764/is-it-possible-somehow-to-find-complex-eigenvalues-using-sympy
My understanding is that this will force the answers to be real numbers.
Here is my code:
import numpy as np
from sympy import *

# Load the transition matrix and wrap it as a SymPy Matrix
T = np.loadtxt('rep10_T_ij.dat', delimiter=' ')
T = Matrix(T)

# Build the characteristic polynomial det(lam*I - T) and solve it symbolically
lam = symbols('lambda')
char_poly = det(lam * eye(T.rows) - T)
roots_char_poly = solve(char_poly, lam)
display(char_poly)
display(roots(poly(char_poly, domain=CC)))
display([N(r) for r in roots_char_poly])
However, this does not produce an answer within 12 hours, no matter how much hardware I throw at it.
I came across this issue:
https://github.com/sympy/sympy/issues/7200
Is a 100 by 100 matrix too expensive?
I can throw more hardware at it on a cluster, but have not been able to try this yet. Is there a solution?
Answer 1
Score: 2
The advantage of using SymPy rather than NumPy or SciPy in this context arises only if you want to perform the calculation exactly or symbolically. Determining that the eigenvalues are precisely real will, in general, require exact arithmetic. Exact or symbolic calculations are much slower than fixed-precision floating point, though, so you should expect things to slow down a lot compared with NumPy or SciPy. If your initial matrix is a matrix of floats, then it is already approximate, in which case there is a good chance that there will be no benefit in using SymPy.
In any case, here is some code showing how to get the exact real eigenvalues of a matrix more efficiently using SymPy. This is usually only worth doing if your original matrix contains exact rational numbers (not floats), but it will obtain precisely the real eigenvalues:
from sympy import *

def make_matrix(N):
    """Random matrix of rational numbers with all integer eigenvalues"""
    V = randMatrix(N)._rep.to_field()
    eigs = randMatrix(1, N)
    D = diag(*eigs)._rep
    Vinv = V.inv()
    M = V*D*Vinv
    return M.to_Matrix()

def eigenvals(M):
    x = symbols('x')
    p = M._rep.charpoly()
    return Poly(p, x).real_roots()
This uses the Berkowitz algorithm to compute the characteristic polynomial, which is the slowest part of the operation (the charpoly method). This is much more efficient than using determinants to find the characteristic polynomial, but it is still slow for large matrices. Finding the characteristic polynomial this way is roughly O(N**4) for an N x N dense matrix if the underlying coefficient field has O(1) arithmetic cost. Unfortunately, exact rational numbers do not have O(1) arithmetic cost, so in practice the complexity is significantly worse than O(N**4) as you scale up to larger matrices.
You can get a sense for the timings involved for different sizes:
In [80]: M = make_matrix(5)
In [81]: M
Out[81]:
⎡-23268248970 17481165584 6611878976 21688404260 -332363048 ⎤
⎢───────────── ─────────── ────────── ─────────── ───────────⎥
⎢ 6173287 6173287 6173287 6173287 6173287 ⎥
⎢ ⎥
⎢-11620321902 8914715596 3291699676 10654796030 -168333588 ⎥
⎢───────────── ────────── ────────── ─────────── ───────────⎥
⎢ 6173287 6173287 6173287 6173287 6173287 ⎥
⎢ ⎥
⎢-17978224844 13433529668 5243135950 16686586148 -354021332 ⎥
⎢───────────── ─────────── ────────── ─────────── ───────────⎥
⎢ 6173287 6173287 6173287 6173287 6173287 ⎥
⎢ ⎥
⎢-10824267464 8003053808 3040726208 10253515930 -120066032 ⎥
⎢───────────── ────────── ────────── ─────────── ───────────⎥
⎢ 6173287 6173287 6173287 6173287 6173287 ⎥
⎢ ⎥
⎢-2998114302 2232013094 828083788 2727588566 239697782 ⎥
⎢──────────── ────────── ───────── ────────── ───────── ⎥
⎣ 6173287 6173287 6173287 6173287 6173287 ⎦
In [82]: %time eigenvals(M)
CPU times: user 43.8 ms, sys: 4 ms, total: 47.8 ms
Wall time: 46.6 ms
Out[82]: [14, 26, 46, 56, 82]
In [83]: M = make_matrix(10)
In [84]: %time eigenvals(M)
CPU times: user 82.7 ms, sys: 10 µs, total: 82.7 ms
Wall time: 81.8 ms
Out[84]: [8, 18, 27, 39, 58, 67, 85, 95, 96, 98]
In [85]: M = make_matrix(20)
In [86]: %time eigenvals(M)
CPU times: user 652 ms, sys: 2 µs, total: 652 ms
Wall time: 651 ms
Out[86]: [9, 15, 21, 22, 26, 26, 36, 37, 38, 40, 52, 57, 63, 64, 65, 66, 69, 70, 80, 96]
In [87]: M = make_matrix(30)
In [88]: %time eigenvals(M)
CPU times: user 6.1 s, sys: 0 ns, total: 6.1 s
Wall time: 6.1 s
Out[88]:
[7, 8, 10, 13, 15, 18, 26, 33, 34, 35, 36, 42, 42, 43, 43, 47, 49, 50, 52, 54, 56, 62, 72, 79, 85,
87, 89, 92, 98, 98]
In [89]: M = make_matrix(40)
In [90]: %time eigenvals(M)
CPU times: user 37.8 s, sys: 12 µs, total: 37.8 s
Wall time: 37.9 s
Out[90]:
[1, 7, 13, 14, 15, 17, 17, 20, 26, 26, 27, 28, 29, 32, 33, 35, 37, 43, 43, 44, 44, 46, 50, 50, 52,
52, 53, 55, 58, 64, 67, 70, 71, 72, 79, 83, 83, 88, 88, 99]
So finding the exact integer eigenvalues of a 40x40 matrix already takes over 30 seconds. Scaling up to 100x100 will be slow, but it might be possible. I recommend having gmpy2 installed if you want to do this, because it makes exact rational arithmetic in SymPy a lot faster.
Most likely, though, computing exact eigenvalues like this is not what you want to do, and you should instead use NumPy/SciPy. If they give eigenvalues with small imaginary parts, then you can discard those, provided you know that the eigenvalues should be real. Otherwise, as long as the initial matrix is only approximate (i.e. floats), the kind of slow exact computation shown above is probably not worthwhile.
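The floating-point route could look like the following sketch, using a random row-stochastic matrix as a stand-in for the question's T. A row-stochastic matrix always has the real eigenvalue 1, and its left eigenvector (normalised to sum to 1) is the stationary distribution:

```python
import numpy as np
from scipy.linalg import eig

# Random 100x100 row-stochastic stand-in for the matrix T
rng = np.random.default_rng(0)
T = rng.random((100, 100))
T /= T.sum(axis=1, keepdims=True)      # each row now sums to 1.0

# Eigenvalues and left eigenvectors
w, vl = eig(T, left=True, right=False)

# Pick the eigenvalue closest to 1; since it is real, its left
# eigenvector is real up to floating-point noise
i = np.argmin(np.abs(w - 1.0))
pi = vl[:, i].real
pi /= pi.sum()                         # normalise to a probability vector

print(np.isclose(w[i].real, 1.0))      # the Perron eigenvalue
print(np.allclose(pi @ T, pi))         # pi is a left eigenvector of T
```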
Comments