Krylov processes
Krylov processes are the foundation of Krylov methods, they generate bases of Krylov subspaces. Depending on the Krylov subspaces generated, Krylov processes are more or less specialized for a subset of linear problems. The following table summarizes the most relevant processes for each linear problem.
Linear problems | Processes |
---|---|
Hermitian linear systems | Hermitian Lanczos |
Square Non-Hermitian linear systems | Non-Hermitian Lanczos – Arnoldi |
Least-squares problems | Golub-Kahan – Saunders-Simon-Yip |
Least-norm problems | Golub-Kahan – Saunders-Simon-Yip |
Saddle-point and Hermitian quasi-definite systems | Golub-Kahan – Saunders-Simon-Yip |
Generalized saddle-point and non-Hermitian partitioned systems | Montoison-Orban |
Notation
For a matrix $A$, $A^H$ denotes the conjugate transpose of $A$. It coincides with $A^T$, the transpose of $A$, for real matrices. Define $V_k := \begin{bmatrix} v_1 & \ldots & v_k \end{bmatrix} \enspace$ and $\enspace U_k := \begin{bmatrix} u_1 & \ldots & u_k \end{bmatrix}$.
For a matrix $C \in \mathbb{C}^{n \times n}$ and a vector $t \in \mathbb{C}^{n}$, the $k$-th Krylov subspace generated by $C$ and $t$ is
\[\mathcal{K}_k(C, t) := \left\{\sum_{i=0}^{k-1} \omega_i C^i t \, \middle \vert \, \omega_i \in \mathbb{C},~0 \le i \le k-1 \right\}.\]
For matrices $C \in \mathbb{C}^{n \times n} \enspace$ and $\enspace T \in \mathbb{C}^{n \times p}$, the $k$-th block Krylov subspace generated by $C$ and $T$ is
\[\mathcal{K}_k^{\square}(C, T) := \left\{\sum_{i=0}^{k-1} C^i T \, \Omega_i \, \middle \vert \, \Omega_i \in \mathbb{C}^{p \times p},~0 \le i \le k-1 \right\}.\]
Hermitian Lanczos
After $k$ iterations of the Hermitian Lanczos process, the situation may be summarized as
\[\begin{align*} A V_k &= V_k T_k + \beta_{k+1,k} v_{k+1} e_k^T = V_{k+1} T_{k+1,k}, \\ V_k^H V_k &= I_k, \end{align*}\]
where $V_k$ is an orthonormal basis of the Krylov subspace $\mathcal{K}_k(A,b)$,
\[T_k = \begin{bmatrix} \alpha_1 & \bar{\beta}_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \bar{\beta}_k \\ & & \beta_k & \alpha_k \end{bmatrix} , \qquad T_{k+1,k} = \begin{bmatrix} T_{k} \\ \beta_{k+1} e_{k}^T \end{bmatrix}.\]
Note that depending on how we normalize the vectors that compose $V_k$, $T_{k+1,k}$ can be a real tridiagonal matrix even if $A$ is a complex matrix.
The function hermitian_lanczos
returns $V_{k+1}$, $\beta_1$ and $T_{k+1,k}$.
Related methods: SYMMLQ
, CG
, CR
, CAR
, MINRES
, MINRES-QLP
, MINARES
, CGLS
, CRLS
, CGNE
, CRMR
, CG-LANCZOS
and CG-LANCZOS-SHIFT
.
Krylov.hermitian_lanczos
— MethodV, β, T = hermitian_lanczos(A, b, k)
Input arguments
A
: a linear operator that models an Hermitian matrix of dimension n;b
: a vector of length n;k
: the number of iterations of the Hermitian Lanczos process.
Output arguments
V
: a dense n × (k+1) matrix;β
: a coefficient such that βv₁ = b;T
: a sparse (k+1) × k tridiagonal matrix.
Reference
- C. Lanczos, An Iteration Method for the Solution of the Eigenvalue Problem of Linear Differential and Integral Operators, Journal of Research of the National Bureau of Standards, 45(4), pp. 225–280, 1950.
Non-Hermitian Lanczos
After $k$ iterations of the non-Hermitian Lanczos process (also named the Lanczos biorthogonalization process), the situation may be summarized as
\[\begin{align*} A V_k &= V_k T_k + \beta_{k+1} v_{k+1} e_k^T = V_{k+1} T_{k+1,k}, \\ A^H U_k &= U_k T_k^H + \bar{\gamma}_{k+1} u_{k+1} e_k^T = U_{k+1} T_{k,k+1}^H, \\ V_k^H U_k &= U_k^H V_k = I_k, \end{align*}\]
where $V_k$ and $U_k$ are bases of the Krylov subspaces $\mathcal{K}_k (A,b)$ and $\mathcal{K}_k (A^H,c)$, respectively,
\[T_k = \begin{bmatrix} \alpha_1 & \gamma_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \gamma_k \\ & & \beta_k & \alpha_k \end{bmatrix} , \qquad T_{k+1,k} = \begin{bmatrix} T_{k} \\ \beta_{k+1} e_{k}^T \end{bmatrix} , \qquad T_{k,k+1} = \begin{bmatrix} T_{k} & \gamma_{k+1} e_k \end{bmatrix}.\]
The function nonhermitian_lanczos
returns $V_{k+1}$, $\beta_1$, $T_{k+1,k}$, $U_{k+1}$, $\bar{\gamma}_1$ and $T_{k,k+1}^H$.
Related methods: BiLQ
, QMR
, BiLQR
, CGS
and BICGSTAB
.
The scaling factors used in our implementation are $\beta_k = |u_k^H v_k|^{\tfrac{1}{2}}$ and $\gamma_k = (u_k^H v_k) / \beta_k$. With these scaling factors, the non-Hermitian Lanczos process coincides with the Hermitian Lanczos process when $A = A^H$ and $b = c$.
Krylov.nonhermitian_lanczos
— MethodV, β, T, U, γᴴ, Tᴴ = nonhermitian_lanczos(A, b, c, k)
Input arguments
A
: a linear operator that models a square matrix of dimension n;b
: a vector of length n;c
: a vector of length n;k
: the number of iterations of the non-Hermitian Lanczos process.
Output arguments
V
: a dense n × (k+1) matrix;β
: a coefficient such that βv₁ = b;T
: a sparse (k+1) × k tridiagonal matrix;U
: a dense n × (k+1) matrix;γᴴ
: a coefficient such that γᴴu₁ = c;Tᴴ
: a sparse (k+1) × k tridiagonal matrix.
Reference
- C. Lanczos, An Iteration Method for the Solution of the Eigenvalue Problem of Linear Differential and Integral Operators, Journal of Research of the National Bureau of Standards, 45(4), pp. 225–280, 1950.
The non-Hermitian Lanczos process can be also implemented without $A^H$ (transpose-free variant). To derive it, we can observe that $\beta_{k+1} v_{k+1} = P_k(A) b~~$ and $\bar{\gamma}_{k+1} u_{k+1} = Q_k(A^H) c~~$ where $P_k$ and $Q_k$ are polynomials of degree $k$. The polynomials are defined from the recursions:
\[\begin{align*} P_0(A) &= I_n, \\ P_1(A) &= \left(\dfrac{A - \alpha_1 I_n}{\beta_1}\right) P_0(A), \\ P_k(A) &= \left(\dfrac{A - \alpha_k I_n}{\beta_k}\right) P_{k-1}(A) - \dfrac{\gamma_k}{\beta_{k-1}} P_{k-2}(A), \quad k \ge 2, \\ & \\ Q_0(A^H) &= I_n, \\ Q_1(A^H) &= \left(\dfrac{A^H - \bar{\alpha}_1 I_n}{\bar{\gamma}_1}\right) Q_0(A^H), \\ Q_k(A^H) &= \left(\dfrac{A^H - \bar{\alpha}_k I_n}{\bar{\gamma}_k}\right) Q_{k-1}(A^H) - \dfrac{\bar{\beta}_k}{\bar{\gamma}_{k-1}} Q_{k-2}(A^H), \quad k \ge 2. \end{align*}\]
Because $\alpha_k = u_k^H A v_k$ and $(\bar{\gamma}_{k+1} u_{k+1})^H (\beta_{k+1} v_{k+1}) = \gamma_{k+1} \beta_{k+1}$, we can determine the coefficients of $T_{k+1,k}$ and $T_{k,k+1}^H$ as follows:
\[\begin{align*} \alpha_k &= \dfrac{1}{\gamma_k \beta_k} \langle~Q_{k-1}(A^H) c \, , \, A P_{k-1}(A) b~\rangle \\ &= \dfrac{1}{\gamma_k \beta_k} \langle~c \, , \, \bar{Q}_{k-1}(A) A P_{k-1}(A) b~\rangle, \\ & \\ \beta_{k+1} \gamma_{k+1} &= \langle~Q_k(A^H) c \, , \, P_k(A) b~\rangle \\ &= \langle~c \, , \, \bar{Q}_k(A) P_k(A) b~\rangle. \end{align*}\]
Arnoldi
After $k$ iterations of the Arnoldi process, the situation may be summarized as
\[\begin{align*} A V_k &= V_k H_k + h_{k+1,k} v_{k+1} e_k^T = V_{k+1} H_{k+1,k}, \\ V_k^H V_k &= I_k, \end{align*}\]
where $V_k$ is an orthonormal basis of the Krylov subspace $\mathcal{K}_k (A,b)$,
\[H_k = \begin{bmatrix} h_{1,1}~ & h_{1,2}~ & \ldots & h_{1,k} \\ h_{2,1}~ & \ddots~ & \ddots & \vdots \\ & \ddots~ & \ddots & h_{k-1,k} \\ & & h_{k,k-1} & h_{k,k} \end{bmatrix} , \qquad H_{k+1,k} = \begin{bmatrix} H_{k} \\ h_{k+1,k} e_{k}^T \end{bmatrix}.\]
The function arnoldi
returns $V_{k+1}$, $\beta$ and $H_{k+1,k}$.
Related methods: DIOM
, FOM
, DQGMRES
, GMRES
and FGMRES
.
The Arnoldi process coincides with the Hermitian Lanczos process when $A$ is Hermitian.
Krylov.arnoldi
— MethodV, β, H = arnoldi(A, b, k; reorthogonalization=false)
Input arguments
A
: a linear operator that models a square matrix of dimension n;b
: a vector of length n;k
: the number of iterations of the Arnoldi process.
Keyword argument
reorthogonalization
: reorthogonalize the new vectors of the Krylov basis against all previous vectors.
Output arguments
V
: a dense n × (k+1) matrix;β
: a coefficient such that βv₁ = b;H
: a dense (k+1) × k upper Hessenberg matrix.
Reference
- W. E. Arnoldi, The principle of minimized iterations in the solution of the matrix eigenvalue problem, Quarterly of Applied Mathematics, 9, pp. 17–29, 1951.
Golub-Kahan
After $k$ iterations of the Golub-Kahan bidiagonalization process, the situation may be summarized as
\[\begin{align*} A V_k &= U_{k+1} B_k, \\ A^H U_{k+1} &= V_k B_k^H + \bar{\alpha}_{k+1} v_{k+1} e_{k+1}^T = V_{k+1} L_{k+1}^H, \\ V_k^H V_k &= U_k^H U_k = I_k, \end{align*}\]
where $V_k$ and $U_k$ are bases of the Krylov subspaces $\mathcal{K}_k (A^HA,A^Hb)$ and $\mathcal{K}_k (AA^H,b)$, respectively,
\[L_k = \begin{bmatrix} \alpha_1 & & & \\ \beta_2 & \alpha_2 & & \\ & \ddots & \ddots & \\ & & \beta_k & \alpha_k \end{bmatrix} , \qquad B_k = \begin{bmatrix} \alpha_1 & & & \\ \beta_2 & \alpha_2 & & \\ & \ddots & \ddots & \\ & & \beta_k & \alpha_k \\ & & & \beta_{k+1} \\ \end{bmatrix} = \begin{bmatrix} L_{k} \\ \beta_{k+1} e_{k}^T \end{bmatrix}.\]
Note that depending on how we normalize the vectors that compose $V_k$ and $U_k$, $L_k$ can be a real bidiagonal matrix even if $A$ is a complex matrix.
The function golub_kahan
returns $V_{k+1}$, $U_{k+1}$, $\beta_1$ and $L_{k+1}$.
Related methods: LNLQ
, CRAIG
, CRAIGMR
, LSLQ
, LSQR
and LSMR
.
The Golub-Kahan process coincides with the Hermitian Lanczos process applied to the normal equations $A^HA x = A^Hb$ and $AA^H x = b$. It is also related to the Hermitian Lanczos process applied to $\begin{bmatrix} 0 & A \\ A^H & 0 \end{bmatrix}$ with initial vector $\begin{bmatrix} b \\ 0 \end{bmatrix}$.
Krylov.golub_kahan
— MethodV, U, β, L = golub_kahan(A, b, k)
Input arguments
A
: a linear operator that models a matrix of dimension m × n;b
: a vector of length m;k
: the number of iterations of the Golub-Kahan process.
Output arguments
V
: a dense n × (k+1) matrix;U
: a dense m × (k+1) matrix;β
: a coefficient such that βu₁ = b;L
: a sparse (k+1) × (k+1) lower bidiagonal matrix.
References
- G. H. Golub and W. Kahan, Calculating the Singular Values and Pseudo-Inverse of a Matrix, SIAM Journal on Numerical Analysis, 2(2), pp. 225–224, 1965.
- C. C. Paige, Bidiagonalization of Matrices and Solution of Linear Equations, SIAM Journal on Numerical Analysis, 11(1), pp. 197–209, 1974.
Saunders-Simon-Yip
After $k$ iterations of the Saunders-Simon-Yip process (also named the orthogonal tridiagonalization process), the situation may be summarized as
\[\begin{align*} A U_k &= V_k T_k + \beta_{k+1} v_{k+1} e_k^T = V_{k+1} T_{k+1,k}, \\ A^H V_k &= U_k T_k^H + \bar{\gamma}_{k+1} u_{k+1} e_k^T = U_{k+1} T_{k,k+1}^H, \\ V_k^H V_k &= U_k^H U_k = I_k, \end{align*}\]
where $\begin{bmatrix} V_k & 0 \\ 0 & U_k \end{bmatrix}$ is an orthonormal basis of the block Krylov subspace $\mathcal{K}^{\square}_k \left(\begin{bmatrix} 0 & A \\ A^H & 0 \end{bmatrix}, \begin{bmatrix} b & 0 \\ 0 & c \end{bmatrix}\right)$,
\[T_k = \begin{bmatrix} \alpha_1 & \gamma_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \gamma_k \\ & & \beta_k & \alpha_k \end{bmatrix} , \qquad T_{k+1,k} = \begin{bmatrix} T_{k} \\ \beta_{k+1} e_{k}^T \end{bmatrix} , \qquad T_{k,k+1} = \begin{bmatrix} T_{k} & \gamma_{k+1} e_{k} \end{bmatrix}.\]
The function saunders_simon_yip
returns $V_{k+1}$, $\beta_1$, $T_{k+1,k}$, $U_{k+1}$, $\bar{\gamma}_1$ and $T_{k,k+1}^H$.
Related methods: USYMLQ
, USYMQR
, TriLQR
, TriCG
and TriMR
.
The Saunders-Simon-Yip is equivalent to the block-Lanczos process applied to $\begin{bmatrix} 0 & A \\ A^H & 0 \end{bmatrix}$ with initial matrix $\begin{bmatrix} b & 0 \\ 0 & c \end{bmatrix}$.
Krylov.saunders_simon_yip
— MethodV, β, T, U, γᴴ, Tᴴ = saunders_simon_yip(A, b, c, k)
Input arguments
A
: a linear operator that models a matrix of dimension m × n;b
: a vector of length m;c
: a vector of length n;k
: the number of iterations of the Saunders-Simon-Yip process.
Output arguments
V
: a dense m × (k+1) matrix;β
: a coefficient such that βv₁ = b;T
: a sparse (k+1) × k tridiagonal matrix;U
: a dense n × (k+1) matrix;γᴴ
: a coefficient such that γᴴu₁ = c;Tᴴ
: a sparse (k+1) × k tridiagonal matrix.
Reference
- M. A. Saunders, H. D. Simon, and E. L. Yip, Two Conjugate-Gradient-Type Methods for Unsymmetric Linear Equations, SIAM Journal on Numerical Analysis, 25(4), pp. 927–940, 1988.
Montoison-Orban
After $k$ iterations of the Montoison-Orban process (also named the orthogonal Hessenberg reduction process), the situation may be summarized as
\[\begin{align*} A U_k &= V_k H_k + h_{k+1,k} v_{k+1} e_k^T = V_{k+1} H_{k+1,k}, \\ B V_k &= U_k F_k + f_{k+1,k} u_{k+1} e_k^T = U_{k+1} F_{k+1,k}, \\ V_k^H V_k &= U_k^H U_k = I_k, \end{align*}\]
where $\begin{bmatrix} V_k & 0 \\ 0 & U_k \end{bmatrix}$ is an orthonormal basis of the block Krylov subspace $\mathcal{K}^{\square}_k \left(\begin{bmatrix} 0 & A \\ B & 0 \end{bmatrix}, \begin{bmatrix} b & 0 \\ 0 & c \end{bmatrix}\right)$,
\[H_k = \begin{bmatrix} h_{1,1}~ & h_{1,2}~ & \ldots & h_{1,k} \\ h_{2,1}~ & \ddots~ & \ddots & \vdots \\ & \ddots~ & \ddots & h_{k-1,k} \\ & & h_{k,k-1} & h_{k,k} \end{bmatrix} , \qquad F_k = \begin{bmatrix} f_{1,1}~ & f_{1,2}~ & \ldots & f_{1,k} \\ f_{2,1}~ & \ddots~ & \ddots & \vdots \\ & \ddots~ & \ddots & f_{k-1,k} \\ & & f_{k,k-1} & f_{k,k} \end{bmatrix},\]
\[H_{k+1,k} = \begin{bmatrix} H_{k} \\ h_{k+1,k} e_{k}^T \end{bmatrix} , \qquad F_{k+1,k} = \begin{bmatrix} F_{k} \\ f_{k+1,k} e_{k}^T \end{bmatrix}.\]
The function montoison_orban
returns $V_{k+1}$, $\beta$, $H_{k+1,k}$, $U_{k+1}$, $\gamma$ and $F_{k+1,k}$.
Related methods: GPMR
.
The Montoison-Orban is equivalent to the block-Arnoldi process applied to $\begin{bmatrix} 0 & A \\ B & 0 \end{bmatrix}$ with initial matrix $\begin{bmatrix} b & 0 \\ 0 & c \end{bmatrix}$. It also coincides with the Saunders-Simon-Yip process when $B = A^H$.
Krylov.montoison_orban
— MethodV, β, H, U, γ, F = montoison_orban(A, B, b, c, k; reorthogonalization=false)
Input arguments
A
: a linear operator that models a matrix of dimension m × n;B
: a linear operator that models a matrix of dimension n × m;b
: a vector of length m;c
: a vector of length n;k
: the number of iterations of the Montoison-Orban process.
Keyword argument
reorthogonalization
: reorthogonalize the new vectors of the Krylov basis against all previous vectors.
Output arguments
V
: a dense m × (k+1) matrix;β
: a coefficient such that βv₁ = b;H
: a dense (k+1) × k upper Hessenberg matrix;U
: a dense n × (k+1) matrix;γ
: a coefficient such that γu₁ = c;F
: a dense (k+1) × k upper Hessenberg matrix.
Reference
- A. Montoison and D. Orban, GPMR: An Iterative Method for Unsymmetric Partitioned Linear Systems, SIAM Journal on Matrix Analysis and Applications, 44(1), pp. 293–311, 2023.