I want to compute the spectrum (all eigenvalues) of a large sparse matrix (hundreds of thousands of rows). This is hard.
I am willing to settle for an approximation. Are there approximation methods for doing this?
While I hope for a general answer to this question, I would also be satisfied with an answer in the following special case. My matrix is the normalized Laplacian of a large graph. The eigenvalues will lie between 0 and 2, with many of them clustered around 1.
Answers:
If your graph is undirected (as I suspect), the matrix is symmetric, and you cannot do better than the Lanczos algorithm (with selective reorthogonalization, if needed, for stability). Since the full spectrum consists of 100000 numbers, you are presumably mainly interested in the spectral density.
To get an approximate spectral density, take the spectrum of the leading Krylov subspace of dimension around 100 and replace its discrete density with a smoothed version.
The leading Krylov spectrum will have nearly exact well-isolated eigenvalues (should these exist), approximations of the eigenvalues at the ends of the non-isolated spectrum, and somewhat random values in between, whose cumulative distribution function resembles that of the true spectrum. In exact arithmetic it would converge to the true spectrum as the dimension increases. (If your operator were infinite-dimensional, this would still be the case, and you would get integrals of the true spectral density function over the continuous spectrum.)
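A minimal Python sketch of this recipe (my own illustration, not code from the answer): run plain Lanczos for about 100 steps, take the Ritz values of the resulting tridiagonal matrix, and replace each discrete eigenvalue with a Gaussian bump. The smearing width `sigma` is a free parameter, and no reorthogonalization is done here, so for production use a library implementation would be safer.

```python
import numpy as np
import scipy.sparse as sp

def lanczos_ritz(A, m=100, seed=0):
    """Plain Lanczos (no reorthogonalization); returns the m Ritz values."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    v_prev = np.zeros(n)
    beta = 0.0
    alphas, betas = [], []
    for _ in range(m):
        w = A @ v
        alpha = v @ w
        w = w - alpha * v - beta * v_prev
        beta = np.linalg.norm(w)
        alphas.append(alpha)
        betas.append(beta)
        if beta < 1e-12:          # invariant subspace found early
            break
        v_prev, v = v, w / beta
    T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
    return np.linalg.eigvalsh(T)

def smoothed_density(ritz, grid, sigma=0.02):
    """Replace each Ritz value by a unit-mass Gaussian; total mass is 1."""
    bumps = np.exp(-((grid[:, None] - ritz[None, :]) ** 2) / (2 * sigma**2))
    return bumps.sum(axis=1) / (len(ritz) * sigma * np.sqrt(2 * np.pi))
```

For the normalized graph Laplacian case in the question, `grid = np.linspace(0, 2, 400)` would be a natural evaluation grid.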
Arnold Neumaier's answer is discussed in more detail in Section 3.2 of the paper "Approximating Spectral Densities of Large Matrices" by Lin Lin, Yousef Saad, and Chao Yang (2016).
Some other methods are also discussed, but the numerical analysis at the end of the paper shows that the Lanczos method outperforms them.
Here is another way to characterize the spectrum.
The above appears to weight parts of the spectrum more evenly than a similarly smeared Krylov spectral density --- try diag(linspace(0, 1, 150000)) --- although maybe there is a way to correct for this? This is somewhat similar to the pseudospectral approach, but the result indicates the (smeared) number of eigenvalues in the vicinity of a given point, rather than the inverse distance to the nearest eigenvalue.
EDIT: A better-performing alternative for computing the above quantity is to compute Chebyshev moments (via a similar stochastic evaluation as above) and then reconstruct the spectral density from them. This requires neither matrix inversions nor separate computations for each point. See http://theorie2.physik.uni-greifswald.de/downloads/publications/LNP_chapter19.pdf and references therein.
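A hedged sketch of this kernel polynomial method (KPM) idea, under the assumption that the spectrum lies in [0, 2] as for a normalized graph Laplacian (so the matrix is shifted to [-1, 1]): Chebyshev moments are estimated stochastically with random +/-1 vectors, damped with the standard Jackson kernel to suppress Gibbs oscillations, and summed into a smooth density. Parameter choices here are illustrative, not from the answer.

```python
import numpy as np
import scipy.sparse as sp

def kpm_density(A, grid, n_moments=100, n_vectors=10, seed=0):
    """Stochastic KPM estimate of the spectral density of A on `grid`.

    Assumes the eigenvalues of A lie in [0, 2]; A is shifted so that
    B = A - I has spectrum in [-1, 1].
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    B = A - sp.identity(n)                     # shift [0, 2] -> [-1, 1]
    mu = np.zeros(n_moments)
    for _ in range(n_vectors):
        v = rng.choice([-1.0, 1.0], size=n)    # random sign vector
        t0, t1 = v, B @ v                      # T_0(B) v, T_1(B) v
        mu[0] += v @ t0
        mu[1] += v @ t1
        for k in range(2, n_moments):
            t0, t1 = t1, 2 * (B @ t1) - t0     # Chebyshev recurrence
            mu[k] += v @ t1
    mu /= n_vectors * n                        # mu_k ~ tr T_k(B) / n
    # Jackson damping coefficients
    N = n_moments
    k = np.arange(N)
    g = ((N - k + 1) * np.cos(np.pi * k / (N + 1))
         + np.sin(np.pi * k / (N + 1)) / np.tan(np.pi / (N + 1))) / (N + 1)
    # Reconstruct density on the shifted variable x = lambda - 1
    x = np.clip(grid - 1.0, -1.0, 1.0)
    T = np.cos(np.arange(N)[:, None] * np.arccos(x)[None, :])
    series = g[0] * mu[0] * T[0] + 2 * (g[1:, None] * mu[1:, None] * T[1:]).sum(axis=0)
    return series / (np.pi * np.sqrt(np.clip(1 - x**2, 1e-12, None)))
```

Each of the `n_moments * n_vectors` steps costs one sparse matrix-vector product, which is what makes this attractive for matrices with hundreds of thousands of rows.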
See the paper "On Sampling-based Approximate Spectral Decomposition" by Sanjiv Kumar, Mehryar Mohri & Ameet Talwalkar (ICML 2009). It uses sampling of the columns of your matrix.
Since your matrix is symmetric you should do the following:
Let A be your n*n matrix. You want to reduce the computation of the eigenvalues of an n*n matrix to the computation of the eigenvalues of a k*k matrix. First choose your value of k. Let's say you choose k=500, since you can easily compute the eigenvalues of a 500*500 matrix. Then, randomly choose k columns of the matrix A. Construct the matrix B that keeps only these columns, and the corresponding rows.
B = A(x,x) for a random set of k indexes x
B is now a k*k matrix. Compute the eigenvalues of B, and multiply them by (n/k). You now have k values which are approximately distributed like the n eigenvalues of A. Note that you get only k values, not n, but their distribution will be correct (up to the fact that they are an approximation).
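The steps above can be sketched directly (my own illustration of the recipe as stated): sample a random principal submatrix, compute its eigenvalues, and scale by n/k. One sanity check on the scaling: in expectation the sum of the scaled eigenvalues matches trace(A), since a random k-subset of the diagonal carries a fraction k/n of the trace.

```python
import numpy as np
import scipy.sparse as sp

def sampled_spectrum(A, k=500, seed=0):
    """Eigenvalues of a random k-by-k principal submatrix, scaled by n/k."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = rng.choice(n, size=k, replace=False)   # random index set
    if sp.issparse(A):
        B = A[x][:, x].toarray()               # B = A(x, x)
    else:
        B = A[np.ix_(x, x)]
    return (n / k) * np.linalg.eigvalsh(B)     # k values, scaled by n/k
```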
You can always use the Gershgorin circle theorem bounds to approximate the eigenvalues.
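For a sparse symmetric matrix these bounds are cheap to compute. A small sketch: every eigenvalue lies in the union of the intervals [a_ii - r_i, a_ii + r_i], where r_i is the sum of absolute off-diagonal entries in row i.

```python
import numpy as np
import scipy.sparse as sp

def gershgorin_intervals(A):
    """Interval endpoints [d - r, d + r] for each Gershgorin disc of A."""
    A = sp.csr_matrix(A)
    d = A.diagonal()
    # row sums of absolute values, minus the diagonal contribution
    r = np.asarray(abs(A).sum(axis=1)).ravel() - np.abs(d)
    return d - r, d + r
```

Every eigenvalue of A lies in the union of these intervals, so `lo.min()` and `hi.max()` bound the whole spectrum; for a diagonally dominant matrix the intervals are narrow and quite informative.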
If the off-diagonal terms are small, the diagonal itself is a good approximation of the spectrum. Otherwise, if you end up with an approximation of the eigenspace (by other methods), you could try expressing the matrix in that basis. This leads to a matrix with smaller off-diagonal terms, whose diagonal is a better approximation of the spectrum.