In-place methods
All solvers in Krylov.jl have an in-place variant implemented in a method whose name ends with !
. A workspace (KrylovWorkspace
or BlockKrylovWorkspace
), which contains the storage needed by a Krylov method, can be used to solve multiple linear systems with the same dimensions and the same floating-point precision. It should also be the same number of right-hand sides in block Krylov methods. The section storage requirements specifies the memory needed for each Krylov method.
Each KrylovWorkspace
has three constructors with consistent argument patterns:
XyzWorkspace(A, b)
XyzWorkspace(m, n, S)
XyzWorkspace(kc::KrylovConstructor)
The only exceptions are CgLanczosShiftWorkspace
and CglsLanczosShiftWorkspace
, which require an additional argument nshifts
. Additionally, some constructors accept keyword arguments.
Xyz
represents the name of the Krylov method, written in lowercase except for its first letter (such as Cg
, Minres
, Lsmr
or Bicgstab
). If the name of the Krylov method contains an underscore (e.g., minres_qlp
or cgls_lanczos_shift
), the workspace constructor transforms it by capitalizing each word and removing underscores, resulting in names like MinresQlpWorkspace
or CglsLanczosShiftWorkspace
.
Given an operator A
and a right-hand side b
, you can create a KrylovWorkspace
based on the size of A
and the type of b
, or explicitly provide the dimensions (m, n)
and the storage type S
.
We assume that S(undef, 0)
, S(undef, n)
, and S(undef, m)
are well-defined for the storage type S
. For more advanced vector types, workspaces can also be created with the help of a KrylovConstructor
.
Krylov.KrylovConstructor
— TypeKrylovConstructor(vm; vm_empty=vm)
KrylovConstructor(vm, vn; vm_empty=vm, vn_empty=vn)
Krylov methods require a workspace containing vectors of length m
and n
to solve linear problems of size m × n
. The KrylovConstructor
facilitates the allocation of these vectors using similar
.
For square problems (m == n
), use the first constructor with a single vector vm
. For rectangular problems (m ≠ n
), use the second constructor with vm
and vn
.
Input arguments
vm
: a vector of lengthm
;vn
: a vector of lengthn
.
Keyword arguments
vm_empty
: an empty vector that may be replaced with a vector of lengthm
;vn_empty
: an empty vector that may be replaced with a vector of lengthn
.
Note
Empty vectors vm_empty
and vn_empty
reduce storage requirements when features such as warm-start or preconditioners are unused. These empty vectors will be replaced within a KrylovWorkspace
only if required, such as when preconditioners are provided.
See the section custom workspaces for an example where this constructor is the only applicable option.
For example, use S = Vector{Float64}
if you want to solve linear systems in double precision on the CPU and S = CuVector{Float32}
if you want to solve linear systems in single precision on an Nvidia GPU.
The workspace is always the first argument of the in-place methods:
minres_workspace = MinresWorkspace(m, n, Vector{Float64})
minres!(minres_workspace, A1, b1)
bicgstab_workspace = BicgstabWorkspace(m, n, Vector{ComplexF64})
bicgstab!(bicgstab_workspace, A2, b2)
gmres_workspace = GmresWorkspace(m, n, Vector{BigFloat})
gmres!(gmres_workspace, A3, b3)
lsqr_workspace = LsqrWorkspace(m, n, CuVector{Float32})
lsqr!(lsqr_workspace, A4, b4)
Workspace accessors
In-place solvers update the workspace, from which solutions and statistics can be retrieved. The following functions are available for post-solve analysis.
These functions are not exported and must be accessed using the prefix Krylov.
, e.g. Krylov.solution(workspace)
.
Krylov.results
— Functionresults(workspace)
Return a tuple containing the solution(s) and the statistics associated with workspace
. This allows retrieving the output arguments of an out-of-place method from an in-place method.
For example, instead of x, stats = cg(A, b)
, you can use:
workspace = CgWorkspace(A, b)
cg!(workspace, A, b)
x, stats = results(workspace)
Krylov.solution
— Functionsolution(workspace)
Return the solution(s) stored in the workspace
.
Optionally you can specify which solution you want to recover, solution(workspace, 1)
returns x
and solution(workspace, 2)
returns y
.
Krylov.statistics
— Functionstatistics(workspace)
Return the statistics stored in workspace
.
Krylov.elapsed_time
— Functionelapsed_time(workspace)
Return the elapsed time (in seconds) during the last call to the Krylov method associated with workspace
.
Krylov.issolved
— Functionissolved(workspace)
Return a boolean indicating whether the Krylov method associated with workspace
has succeeded.
For the Krylov methods bilqr
and trilqr
, you can use issolved_primal(workspace)
and issolved_dual(workspace)
to separately check whether the solver has converged on the primal or dual system.
Krylov.solution_count
— Functionsolution_count(workspace)
Return the number of outputs of solution(workspace)
.
Krylov.iteration_count
— Functioniteration_count(workspace)
Return the number of iterations performed by the Krylov method associated with workspace
.
The number of iterations alone is not a reliable basis for comparing different Krylov methods, since the work performed in each iteration can vary significantly. For a fairer performance comparison, use the total number of operator-vector products with A
and A'
(see Aprod_count and Atprod_count).
Krylov.Aprod_count
— FunctionAprod_count(workspace)
Return the number of operator-vector products with A
performed by the Krylov method associated with workspace
.
This function can also be used to determine the number of operator-vector products with B
in gpmr
, since it is the same as for A
.
Krylov.Atprod_count
— FunctionAtprod_count(workspace)
Return the number of operator-vector products with A'
performed by the Krylov method associated with workspace
.
Examples
We illustrate the use of in-place Krylov solvers with two well-known optimization methods. The details of the optimization methods are described in the section about Factorization-free operators.
Example 1: Newton's method for convex optimization without linesearch
using Krylov
import Krylov: solution
function newton(∇f, ∇²f, x₀; itmax = 200, tol = 1e-8)
n = length(x₀)
x = copy(x₀)
gx = ∇f(x)
iter = 0
S = typeof(x)
workspace = CgWorkspace(n, n, S)
solved = false
tired = false
while !(solved || tired)
Hx = ∇²f(x) # Compute ∇²f(xₖ)
cg!(workspace, Hx, -gx) # Solve ∇²f(xₖ)Δx = -∇f(xₖ)
Δx = solution(workspace) # Recover Δx from the workspace
x = x + Δx # Update xₖ₊₁ = xₖ + Δx
gx = ∇f(x) # ∇f(xₖ₊₁)
iter += 1
solved = norm(gx) ≤ tol
tired = iter ≥ itmax
end
return x
end
Example 2: The Gauss-Newton method for nonlinear least squares without linesearch
using Krylov
import Krylov: solution
function gauss_newton(F, JF, x₀; itmax = 200, tol = 1e-8)
n = length(x₀)
x = copy(x₀)
Fx = F(x)
m = length(Fx)
iter = 0
S = typeof(x)
workspace = LsmrWorkspace(m, n, S)
solved = false
tired = false
while !(solved || tired)
Jx = JF(x) # Compute J(xₖ)
lsmr!(workspace, Jx, -Fx) # Minimize ‖J(xₖ)Δx + F(xₖ)‖
Δx = solution(workspace) # Recover Δx from the workspace
x = x + Δx # Update xₖ₊₁ = xₖ + Δx
Fx_old = Fx # F(xₖ)
Fx = F(x) # F(xₖ₊₁)
iter += 1
solved = norm(Fx - Fx_old) / norm(Fx) ≤ tol
tired = iter ≥ itmax
end
return x
end