Detailed examples may be found here.
Krylov methods
Krylov.cg — Function(x, stats) = cg(A, b; M, atol, rtol, itmax, radius, linesearch, verbose)The conjugate gradient method to solve the symmetric linear system Ax=b.
The method does not abort if A is not definite.
A preconditioner M may be provided in the form of a linear operator and is assumed to be symmetric and positive definite. M also indicates the weighted norm in which residuals are measured.
Krylov.cr — Function(x, stats) = cr(A, b; M, atol, rtol, γ, itmax, radius, verbose, linesearch)A truncated version of Stiefel’s Conjugate Residual method to solve the symmetric linear system Ax=b. The matrix A must be positive semi-definite.
A preconditioner M may be provided in the form of a linear operator and is assumed to be symmetric and positive definite. M also indicates the weighted norm in which residuals are measured.
In a linesearch context, 'linesearch' must be set to 'true'.
Krylov.symmlq — Function(x, stats) = symmlq(A, b; M, λ, transfer_to_cg, λest, atol, rtol, etol, window, itmax, conlim, verbose)Solve the shifted linear system
(A + λ I) x = busing the SYMMLQ method, where λ is a shift parameter, and A is square and symmetric.
SYMMLQ produces monotonic errors ‖x*-x‖₂.
A preconditioner M may be provided in the form of a linear operator and is assumed to be symmetric and positive definite.
Krylov.cg_lanczos — Function(x, stats) = cg_lanczos(A, b; M, atol, rtol, itmax, check_curvature, verbose)The Lanczos version of the conjugate gradient method to solve the symmetric linear system
Ax = bThe method does not abort if A is not definite.
A preconditioner M may be provided in the form of a linear operator and is assumed to be symmetric and positive definite.
Krylov.cg_lanczos_shift_seq — Function(x, stats) = cg_lanczos_shift_seq(A, b, shifts; M, atol, rtol, itmax, check_curvature, verbose)The Lanczos version of the conjugate gradient method to solve a family of shifted systems
(A + αI) x = b (α = α₁, ..., αₙ)The method does not abort if A + αI is not definite.
A preconditioner M may be provided in the form of a linear operator and is assumed to be symmetric and positive definite.
Krylov.minres — Function(x, stats) = minres(A, b; M, λ, atol, rtol, etol, window, itmax, conlim, verbose)Solve the shifted linear least-squares problem
minimize ‖b - (A + λ I)x‖₂²or the shifted linear system
(A + λ I) x = busing the MINRES method, where λ ≥ 0 is a shift parameter, where A is square and symmetric.
MINRES is formally equivalent to applying CR to Ax=b when A is positive definite, but is typically more stable and also applies to the case where A is indefinite.
MINRES produces monotonic residuals ‖r‖₂ and optimality residuals ‖Aᵀr‖₂.
A preconditioner M may be provided in the form of a linear operator and is assumed to be symmetric and positive definite.
Krylov.minres_qlp — Function(x, stats) = minrres_qlp(A, b; M, atol, rtol, λ, itmax, verbose)MINRES-QLP is the only method based on the Lanczos process that returns the minimum-norm solution on singular inconsistent systems (A + λI)x = b, where λ is a shift parameter. It is significantly more complex but can be more reliable than MINRES when A is ill-conditioned.
A preconditioner M may be provided in the form of a linear operator and is assumed to be symmetric and positive definite. M also indicates the weighted norm in which residuals are measured.
Krylov.diom — Function(x, stats) = diom(A, b; M, N, atol, rtol, itmax, memory, pivoting, verbose)Solve the consistent linear system Ax = b using direct incomplete orthogonalization method.
DIOM is similar to CG with partial reorthogonalization.
An advantage of DIOM is that nonsymmetric or symmetric indefinite or both nonsymmetric and indefinite systems of linear equations can be handled by this single algorithm.
This implementation allows a left preconditioner M and a right preconditioner N.
- Left preconditioning : M⁻¹Ax = M⁻¹b
- Right preconditioning : AN⁻¹u = b with x = N⁻¹u
- Split preconditioning : M⁻¹AN⁻¹u = M⁻¹b with x = N⁻¹u
Krylov.dqgmres — Function(x, stats) = dqgmres(A, b; M, N, atol, rtol, itmax, memory, verbose)Solve the consistent linear system Ax = b using DQGMRES method.
DQGMRES algorithm is based on the incomplete Arnoldi orthogonalization process and computes a sequence of approximate solutions with the quasi-minimal residual property.
This implementation allows a left preconditioner M and a right preconditioner N.
- Left preconditioning : M⁻¹Ax = M⁻¹b
- Right preconditioning : AN⁻¹u = b with x = N⁻¹u
- Split preconditioning : M⁻¹AN⁻¹u = M⁻¹b with x = N⁻¹u
Krylov.usymlq — Function(x, stats) = usymlq(A, b, c; atol, rtol, transfer_to_usymcg, itmax, verbose)Solve the linear system Ax = b using the USYMLQ method.
USYMLQ is based on a tridiagonalization process for unsymmetric matrices. The error norm ‖x - x*‖ monotonously decreases in USYMLQ. It's considered as a generalization of SYMMLQ.
It can also be applied to under-determined and over-determined problems. In all cases, problems must be consistent.
An option gives the possibility of transferring to the USYMCG point, when it exists. The transfer is based on the residual norm.
Krylov.usymqr — Function(x, stats) = usymqr(A, b, c; atol, rtol, itmax, verbose)Solve the linear system Ax = b using the USYMQR method.
USYMQR is based on a tridiagonalization process for unsymmetric matrices. The residual norm ‖b - Ax‖ monotonously decreases in USYMQR. It's considered as a generalization of MINRES.
It can also be applied to under-determined and over-determined problems. USYMQR finds the minimum-norm solution if problems are inconsistent.
Krylov.tricg — Function(x, y, stats) = tricg(A, b, c; M, N, atol, rtol, spd, snd, flip, τ, ν, itmax, verbose)TriCG solves the symmetric linear system
[ τE A ] [ x ] = [ b ]
[ Aᵀ νF ] [ y ] [ c ],where τ and ν are real numbers, E = M⁻¹ ≻ 0 and F = N⁻¹ ≻ 0. TriCG could breakdown if τ = 0 or ν = 0. It's recommended to use TriMR in these cases.
By default, TriCG solves symmetric and quasi-definite linear systems with τ = 1 and ν = -1. If flip = true, TriCG solves another known variant of SQD systems where τ = -1 and ν = 1. If spd = true, τ = ν = 1 and the associated symmetric and positive definite linear system is solved. If snd = true, τ = ν = -1 and the associated symmetric and negative definite linear system is solved. τ and ν are also keyword arguments that can be directly modified for more specific problems.
TriCG is based on the preconditioned orthogonal tridiagonalization process and its relation with the preconditioned block-Lanczos process.
[ M O ]
[ 0 N ]indicates the weighted norm in which residuals are measured. It's the Euclidean norm when M and N are identity operators.
TriCG stops when itmax iterations are reached or when ‖rₖ‖ ≤ atol + ‖r₀‖ * rtol. atol is an absolute tolerance and rtol is a relative tolerance.
Additional details can be displayed if the verbose mode is enabled.
Krylov.trimr — Function(x, y, stats) = trimr(A, b, c; M, N, atol, rtol, spd, snd, flip, sp, τ, ν, itmax, verbose)TriMR solves the symmetric linear system
[ τE A ] [ x ] = [ b ]
[ Aᵀ νF ] [ y ] [ c ],where τ and ν are real numbers, E = M⁻¹ ≻ 0 and F = N⁻¹ ≻ 0. TriMR handles saddle-point systems (τ = 0 or ν = 0) and adjoint systems (τ = 0 and ν = 0) without any risk of breakdown.
By default, TriMR solves symmetric and quasi-definite linear systems with τ = 1 and ν = -1. If flip = true, TriMR solves another known variant of SQD systems where τ = -1 and ν = 1. If spd = true, τ = ν = 1 and the associated symmetric and positive definite linear system is solved. If snd = true, τ = ν = -1 and the associated symmetric and negative definite linear system is solved. If sp = true, τ = 1, ν = 0 and the associated saddle-point linear system is solved. τ and ν are also keyword arguments that can be directly modified for more specific problems.
TriMR is based on the preconditioned orthogonal tridiagonalization process and its relation with the preconditioned block-Lanczos process.
[ M O ]
[ 0 N ]indicates the weighted norm in which residuals are measured. It's the Euclidean norm when M and N are identity operators.
TriMR stops when itmax iterations are reached or when ‖rₖ‖ ≤ atol + ‖r₀‖ * rtol. atol is an absolute tolerance and rtol is a relative tolerance.
Additional details can be displayed if the verbose mode is enabled.
Krylov.trilqr — Function(x, t, stats) = trilqr(A, b, c; atol, rtol, transfer_to_usymcg, itmax, verbose)Combine USYMLQ and USYMQR to solve adjoint systems.
[0 A] [t] = [b]
[Aᵀ 0] [x] [c]USYMLQ is used for solving primal system Ax = b. USYMQR is used for solving dual system Aᵀt = c.
An option gives the possibility of transferring from the USYMLQ point to the USYMCG point, when it exists. The transfer is based on the residual norm.
Krylov.bilq — Function(x, stats) = bilq(A, b; c, atol, rtol, transfer_to_bicg, itmax, verbose)Solve the square linear system Ax = b using the BiLQ method.
BiLQ is based on the Lanczos biorthogonalization process. When A is symmetric and b = c, BiLQ is equivalent to SYMMLQ.
An option gives the possibility of transferring to the BiCG point, when it exists. The transfer is based on the residual norm.
Krylov.cgs — Function(x, stats) = cgs(A, b; c, M, N, atol, rtol, itmax, verbose)Solve the consistent linear system Ax = b using conjugate gradient squared algorithm.
From "Iterative Methods for Sparse Linear Systems (Y. Saad)" :
«The method is based on a polynomial variant of the conjugate gradients algorithm. Although related to the so-called bi-conjugate gradients (BCG) algorithm, it does not involve adjoint matrix-vector multiplications, and the expected convergence rate is about twice that of the BCG algorithm.
The Conjugate Gradient Squared algorithm works quite well in many cases. However, one difficulty is that, since the polynomials are squared, rounding errors tend to be more damaging than in the standard BCG algorithm. In particular, very high variations of the residual vectors often cause the residual norms computed to become inaccurate.
TFQMR and BICGSTAB were developed to remedy this difficulty.»
This implementation allows a left preconditioner M and a right preconditioner N.
Krylov.bicgstab — Function(x, stats) = bicgstab(A, b; c, M, N, atol, rtol, itmax, verbose)Solve the square linear system Ax = b using the BICGSTAB method.
The Biconjugate Gradient Stabilized method is a variant of BiCG, like CGS, but using different updates for the Aᵀ-sequence in order to obtain smoother convergence than CGS.
If BICGSTAB stagnates, we recommend DQGMRES and BiLQ as alternative methods for unsymmetric square systems.
BICGSTAB stops when itmax iterations are reached or when ‖rₖ‖ ≤ atol + ‖b‖ * rtol. atol is an absolute tolerance and rtol is a relative tolerance.
Additional details can be displayed if the verbose mode is enabled.
This implementation allows a left preconditioner M and a right preconditioner N.
Krylov.qmr — Function(x, stats) = qmr(A, b; c, atol, rtol, itmax, verbose)Solve the square linear system Ax = b using the QMR method.
QMR is based on the Lanczos biorthogonalization process. When A is symmetric and b = c, QMR is equivalent to MINRES.
Krylov.bilqr — Function(x, t, stats) = bilqr(A, b, c; atol, rtol, transfer_to_bicg, itmax, verbose)Combine BiLQ and QMR to solve adjoint systems.
[0 A] [t] = [b]
[Aᵀ 0] [x] [c]BiLQ is used for solving primal system Ax = b. QMR is used for solving dual system Aᵀt = c.
An option gives the possibility of transferring from the BiLQ point to the BiCG point, when it exists. The transfer is based on the residual norm.
Krylov.cgls — Function(x, stats) = cgls(A, b; M, λ, atol, rtol, radius, itmax, verbose)Solve the regularized linear least-squares problem
minimize ‖b - Ax‖₂² + λ ‖x‖₂²using the Conjugate Gradient (CG) method, where λ ≥ 0 is a regularization parameter. This method is equivalent to applying CG to the normal equations
(AᵀA + λI) x = Aᵀbbut is more stable.
CGLS produces monotonic residuals ‖r‖₂ but not optimality residuals ‖Aᵀr‖₂. It is formally equivalent to LSQR, though can be slightly less accurate, but simpler to implement.
Krylov.crls — Function(x, stats) = crls(A, b; M, λ, atol, rtol, radius, itmax, verbose)Solve the linear least-squares problem
minimize ‖b - Ax‖₂² + λ ‖x‖₂²using the Conjugate Residuals (CR) method. This method is equivalent to applying MINRES to the normal equations
(AᵀA + λI) x = Aᵀb.This implementation recurs the residual r := b - Ax.
CRLS produces monotonic residuals ‖r‖₂ and optimality residuals ‖Aᵀr‖₂. It is formally equivalent to LSMR, though can be substantially less accurate, but simpler to implement.
Krylov.cgne — Function(x, stats) = cgne(A, b; M, λ, atol, rtol, itmax, verbose)Solve the consistent linear system
Ax + √λs = busing the Conjugate Gradient (CG) method, where λ ≥ 0 is a regularization parameter. This method is equivalent to applying CG to the normal equations of the second kind
(AAᵀ + λI) y = bbut is more stable. When λ = 0, this method solves the minimum-norm problem
min ‖x‖₂ s.t. Ax = b.
When λ > 0, it solves the problem
min ‖(x,s)‖₂ s.t. Ax + √λs = b.CGNE produces monotonic errors ‖x-x*‖₂ but not residuals ‖r‖₂. It is formally equivalent to CRAIG, though can be slightly less accurate, but simpler to implement. Only the x-part of the solution is returned.
A preconditioner M may be provided in the form of a linear operator.
Krylov.crmr — Function(x, stats) = crmr(A, b; M, λ, atol, rtol, itmax, verbose)Solve the consistent linear system
Ax + √λs = busing the Conjugate Residual (CR) method, where λ ≥ 0 is a regularization parameter. This method is equivalent to applying CR to the normal equations of the second kind
(AAᵀ + λI) y = bbut is more stable. When λ = 0, this method solves the minimum-norm problem
min ‖x‖₂ s.t. x ∈ argmin ‖Ax - b‖₂.When λ > 0, this method solves the problem
min ‖(x,s)‖₂ s.t. Ax + √λs = b.CGMR produces monotonic residuals ‖r‖₂. It is formally equivalent to CRAIG-MR, though can be slightly less accurate, but simpler to implement. Only the x-part of the solution is returned.
A preconditioner M may be provided in the form of a linear operator.
Krylov.lslq — Function(x_lq, x_cg, err_lbnds, err_ubnds_lq, err_ubnds_cg, stats) = lslq(A, b; M, N, sqd, λ, atol, btol, etol, window, utol, itmax, σ, conlim, verbose)Solve the regularized linear least-squares problem
minimize ‖b - Ax‖₂² + λ² ‖x‖₂²using the LSLQ method, where λ ≥ 0 is a regularization parameter. LSLQ is formally equivalent to applying SYMMLQ to the normal equations
(AᵀA + λ² I) x = Aᵀbbut is more stable.
Main features
- the solution estimate is updated along orthogonal directions
- the norm of the solution estimate ‖xᴸₖ‖₂ is increasing
- the error ‖eₖ‖₂ := ‖xᴸₖ - x*‖₂ is decreasing
- it is possible to transition cheaply from the LSLQ iterate to the LSQR iterate if there is an advantage (there always is in terms of error)
- if
Ais rank deficient, identify the minimum least-squares solution
Input arguments
A::AbstractLinearOperatorb::Vector{Float64}
Optional arguments
M::AbstractLinearOperator=opEye(): a symmetric and positive definite dual preconditionerN::AbstractLinearOperator=opEye(): a symmetric and positive definite primal preconditionersqd::Bool=falseindicates whether or not we are solving a symmetric and quasi-definite augmented system Ifsqd = true, we solve the symmetric and quasi-definite system[ E A ] [ r ] [ b ] [ Aᵀ -F ] [ x ] = [ 0 ],where E = M⁻¹ and F = N⁻¹.
If
sqd = false, we solve the symmetric and indefinite system[ E A ] [ r ] [ b ] [ Aᵀ 0 ] [ x ] = [ 0 ].In this case,
Ncan still be specified and indicates the norm in whichxand the forward error should be measured.λ::Float64=0.0is a regularization parameter (see the problem statement above)σ::Float64=0.0is an underestimate of the smallest nonzero singular value ofA–-settingσtoo large will result in an error in the course of the iterationsatol::Float64=1.0e-8is a stopping tolerance based on the residualbtol::Float64=1.0e-8is a stopping tolerance used to detect zero-residual problemsetol::Float64=1.0e-8is a stopping tolerance based on the lower bound on the errorwindow::Int=5is the number of iterations used to accumulate a lower bound on the errorutol::Float64=1.0e-8is a stopping tolerance based on the upper bound on the erroritmax::Int=0is the maximum number of iterations (0 means no imposed limit)conlim::Float64=1.0e+8is the limit on the estimated condition number ofAbeyond which the solution will be abandonedverbose::Bool=falsedetermines verbosity.
Return values
lslq() returns the tuple (x_lq, x_cg, err_lbnds, err_ubnds_lq, err_ubnds_cg, stats) where
x_lq::Vector{Float64}is the LQ solution estimatex_cg::Vector{Float64}is the CG solution estimate (i.e., the LSQR point)err_lbnds::Vector{Float64}is a vector of lower bounds on the LQ error–-the vector is empty ifwindowis set to zeroerr_ubnds_lq::Vector{Float64}is a vector of upper bounds on the LQ error–-the vector is empty ifσ == 0is left at zeroerr_ubnds_cg::Vector{Float64}is a vector of upper bounds on the CG error–-the vector is empty ifσ == 0is left at zerostats::SimpleStatscollects other statistics on the run.
Stopping conditions
The iterations stop as soon as one of the following conditions holds true:
- the optimality residual is sufficiently small (
stats.status = "found approximate minimum least-squares solution") in the sense that either- ‖Aᵀr‖ / (‖A‖ ‖r‖) ≤ atol, or
- 1 + ‖Aᵀr‖ / (‖A‖ ‖r‖) ≤ 1
- an approximate zero-residual solution has been found (
stats.status = "found approximate zero-residual solution") in the sense that either- ‖r‖ / ‖b‖ ≤ btol + atol ‖A‖ * ‖xᴸ‖ / ‖b‖, or
- 1 + ‖r‖ / ‖b‖ ≤ 1
- the estimated condition number of
Ais too large in the sense that either- 1/cond(A) ≤ 1/conlim (
stats.status = "condition number exceeds tolerance"), or - 1 + 1/cond(A) ≤ 1 (
stats.status = "condition number seems too large for this machine")
- 1/cond(A) ≤ 1/conlim (
- the lower bound on the LQ forward error is less than etol * ‖xᴸ‖
- the upper bound on the CG forward error is less than utol * ‖xᶜ‖
References
- R. Estrin, D. Orban and M. A. Saunders, Estimates of the 2-Norm Forward Error for SYMMLQ and CG, Cahier du GERAD G-2016-70, GERAD, Montreal, 2016. DOI http://dx.doi.org/10.13140/RG.2.2.19581.77288.
- R. Estrin, D. Orban and M. A. Saunders, LSLQ: An Iterative Method for Linear Least-Squares with an Error Minimization Property, Cahier du GERAD G-2017-xx, GERAD, Montreal, 2017.
Krylov.lsqr — Function(x, stats) = lsqr(A, b; M, N, sqd, λ, axtol, btol, atol, rtol, etol, window, itmax, conlim, radius, verbose)Solve the regularized linear least-squares problem
minimize ‖b - Ax‖₂² + λ² ‖x‖₂²using the LSQR method, where λ ≥ 0 is a regularization parameter. LSQR is formally equivalent to applying CG to the normal equations
(AᵀA + λ² I) x = Aᵀb(and therefore to CGLS) but is more stable.
LSQR produces monotonic residuals ‖r‖₂ but not optimality residuals ‖Aᵀr‖₂. It is formally equivalent to CGLS, though can be slightly more accurate.
Preconditioners M and N may be provided in the form of linear operators and are assumed to be symmetric and positive definite. If sqd is set to true, we solve the symmetric and quasi-definite system
[ E A ] [ r ] [ b ]
[ Aᵀ -F ] [ x ] = [ 0 ],where E = M⁻¹ and F = N⁻¹.
If sqd is set to false (the default), we solve the symmetric and indefinite system
[ E A ] [ r ] [ b ]
[ Aᵀ 0 ] [ x ] = [ 0 ].In this case, N can still be specified and indicates the norm in which x should be measured.
Krylov.lsmr — Function(x, stats) = lsmr(A, b; M, N, sqd, λ, axtol, btol, atol, rtol, etol, window, itmax, conlim, radius, verbose)Solve the regularized linear least-squares problem
minimize ‖b - Ax‖₂² + λ² ‖x‖₂²using the LSMR method, where λ ≥ 0 is a regularization parameter. LSQR is formally equivalent to applying MINRES to the normal equations
(AᵀA + λ² I) x = Aᵀb(and therefore to CRLS) but is more stable.
LSMR produces monotonic residuals ‖r‖₂ and optimality residuals ‖Aᵀr‖₂. It is formally equivalent to CRLS, though can be substantially more accurate.
Preconditioners M and N may be provided in the form of linear operators and are assumed to be symmetric and positive definite. If sqd is set to true, we solve the symmetric and quasi-definite system
[ E A ] [ r ] [ b ]
[ Aᵀ -F ] [ x ] = [ 0 ],where E = M⁻¹ and F = N⁻¹.
If sqd is set to false (the default), we solve the symmetric and indefinite system
[ E A ] [ r ] [ b ]
[ Aᵀ 0 ] [ x ] = [ 0 ].In this case, N can still be specified and indicates the norm in which x should be measured.
Krylov.craig — Function(x, y, stats) = craig(A, b; M, N, sqd, λ, atol, btol, rtol, conlim, itmax, verbose, transfer_to_lsqr)Find the least-norm solution of the consistent linear system
Ax + λs = busing the Golub-Kahan implementation of Craig's method, where λ ≥ 0 is a regularization parameter. This method is equivalent to CGNE but is more stable.
For a system in the form Ax = b, Craig's method is equivalent to applying CG to AAᵀy = b and recovering x = Aᵀy. Note that y are the Lagrange multipliers of the least-norm problem
minimize ‖x‖ s.t. Ax = b.Preconditioners M⁻¹ and N⁻¹ may be provided in the form of linear operators and are assumed to be symmetric and positive definite. If sqd = true, CRAIG solves the symmetric and quasi-definite system
[ -N Aᵀ ] [ x ] [ 0 ]
[ A M ] [ y ] = [ b ],which is equivalent to applying CG to (AN⁻¹Aᵀ + M)y = b with Nx = Aᵀy.
If sqd = false, CRAIG solves the symmetric and indefinite system
[ -N Aᵀ ] [ x ] [ 0 ]
[ A 0 ] [ y ] = [ b ].In this case, M⁻¹ can still be specified and indicates the weighted norm in which residuals are measured.
In this implementation, both the x and y-parts of the solution are returned.
Krylov.craigmr — Function(x, y, stats) = craigmr(A, b; M, N, λ, atol, rtol, itmax, verbose)Solve the consistent linear system
Ax + √λs = busing the CRAIG-MR method, where λ ≥ 0 is a regularization parameter. This method is equivalent to applying the Conjugate Residuals method to the normal equations of the second kind
(AAᵀ + λI) y = bbut is more stable. When λ = 0, this method solves the minimum-norm problem
min ‖x‖₂ s.t. x ∈ argmin ‖Ax - b‖₂.When λ > 0, this method solves the problem
min ‖(x,s)‖₂ s.t. Ax + √λs = b.Preconditioners M⁻¹ and N⁻¹ may be provided in the form of linear operators and are assumed to be symmetric and positive definite. Afterward CRAIGMR solves the symmetric and quasi-definite system
[ -N Aᵀ ] [ x ] [ 0 ]
[ A M ] [ y ] = [ b ],which is equivalent to applying MINRES to (M + AN⁻¹Aᵀ)y = b.
CRAIGMR produces monotonic residuals ‖r‖₂. It is formally equivalent to CRMR, though can be slightly more accurate, and intricate to implement. Both the x- and y-parts of the solution are returned.