Reference
SolverTools.ARTrustRegion — Type

ARTrustRegion(α₀::T; kwargs...)
Select the main parameters used in the TRARC algorithm with α₀ as the initial TR/ARC parameter. The keyword arguments are:
- max_α::T: maximum value for α. Default T(1) / sqrt(eps(T)).
- acceptance_threshold::T: ratio over which the step is successful. Default T(0.1).
- increase_threshold::T: ratio over which we increase α. Default T(0.75).
- reduce_threshold::T: ratio under which we decrease α. Default T(0.1).
- increase_factor::T: factor of increase of α. Default T(5.0).
- decrease_factor::T: factor of decrease of α. Default T(0.1).
- max_unsuccinarow::Int: limit on the number of consecutive unsuccessful iterations. Default 30.
Returns an ARTrustRegion structure.
This can be compared to TrustRegion or TRONTrustRegion.
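For illustration, a minimal construction sketch; the keyword values are arbitrary overrides of the defaults listed above.

using SolverTools
tr = ARTrustRegion(1.0; acceptance_threshold = 0.2, increase_factor = 10.0)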
SolverTools.AbstractTrustRegion — Type

AbstractTrustRegion
An abstract trust region type so that specific trust regions define update rules differently. Child types must have at least the following real fields:
- acceptance_threshold
- initial_radius
- radius
- ratio
and the following function:
update!(tr, step_norm)
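For illustration, a minimal sketch of a conforming subtype; the name SimpleTrustRegion and its doubling/halving update rule are hypothetical, not part of SolverTools.

using SolverTools

mutable struct SimpleTrustRegion{T, V} <: SolverTools.AbstractTrustRegion{T, V}
  initial_radius::T
  radius::T
  acceptance_threshold::T
  ratio::T
  gt::V  # gradient storage, as in the concrete types below
end

function SolverTools.update!(tr::SimpleTrustRegion, step_norm)
  # hypothetical rule: enlarge after an acceptable step, otherwise shrink
  tr.radius = tr.ratio ≥ tr.acceptance_threshold ? 2 * tr.radius : step_norm / 2
  return tr
end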
SolverTools.LineModel — Type

A type to represent the restriction of a function to a direction. Given f : Rⁿ → R, x ∈ Rⁿ and a nonzero direction d ∈ Rⁿ,
ϕ = LineModel(nlp, x, d)

represents the function ϕ : R → R defined by
ϕ(t) := f(x + td).

SolverTools.TRONTrustRegion — Type

TRONTrustRegion{T, V} <: AbstractTrustRegion{T, V}

Trust region used by TRON that contains the following fields:
- initial_radius::T: initial radius;
- radius::T: current radius;
- max_radius::T: upper bound on the radius (default 1 / sqrt(eps(T)));
- acceptance_threshold::T: decrease radius if ratio is below this threshold between 0 and 1 (default 1e-4);
- decrease_threshold::T: ... between 0 and 1 (default 0.25);
- increase_threshold::T: increase radius if ratio is beyond this threshold between 0 and 1 (default 0.75);
- large_decrease_factor::T: decrease factor between 0 and 1 (default 0.25);
- small_decrease_factor::T: decrease factor between 0 and 1 (default 0.5);
- increase_factor::T: increase factor greater than one (default 4);
- ratio::T: current ratio ared / pred;
- quad_min::T: ...;
- gt::V: pre-allocated memory vector to store the gradient of the objective function;
- good_grad::Bool: true if gt is the gradient of the objective function at the trial point.
The following constructors are available:
TRONTrustRegion(gt::V, initial_radius::T; kwargs...)

If gt is not known, it is possible to use the following constructors:
TRONTrustRegion(::Type{V}, n::Int, Δ₀::T; kwargs...)
TRONTrustRegion(n::Int, Δ₀::T; kwargs...)

that will allocate a vector of size n and type V or Vector{T}.
SolverTools.TrustRegion — Type

TrustRegion{T, V} <: AbstractTrustRegion{T, V}

Basic trust region type that contains the following fields:
- initial_radius::T: initial radius;
- radius::T: current radius;
- max_radius::T: upper bound on the radius (default 1 / sqrt(eps(T)));
- acceptance_threshold::T: decrease radius if ratio is below this threshold between 0 and 1 (default 1e-4);
- increase_threshold::T: increase radius if ratio is beyond this threshold between 0 and 1 (default 0.95);
- decrease_factor::T: decrease factor between 0 and 1 (default 1 / 3);
- increase_factor::T: increase factor greater than one (default 3 / 2);
- ratio::T: current ratio ared / pred;
- gt::V: pre-allocated memory vector to store the gradient of the objective function;
- good_grad::Bool: true if gt is the gradient of the objective function at the trial point.
The following constructors are available:
TrustRegion(gt::V, initial_radius::T; kwargs...)

If gt is not known, it is possible to use the following constructors:
TrustRegion(::Type{V}, n::Int, Δ₀::T; kwargs...)
TrustRegion(n::Int, Δ₀::T; kwargs...)

that will allocate a vector of size n and type V or Vector{T}.
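A sketch of the expected workflow with the allocating constructor, assuming these methods are in scope via using SolverTools; in practice tr.ratio is set by the solver (e.g., through aredpred!).

using SolverTools
tr = TrustRegion(2, 1.0)  # allocates gt::Vector{Float64}, initial radius 1.0
tr.ratio = 0.99           # normally set by the solver from ared / pred
acceptable(tr)            # true: the ratio is at least tr.acceptance_threshold
update!(tr, 0.5)          # enlarge or shrink tr.radius based on tr.ratio
reset!(tr)                # restore tr.radius to its initial value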
SolverTools.TrustRegionException — Type

Exception type raised in case of error.
LinearOperators.reset! — Method

reset!(tr)
Reset the trust-region radius to its initial value.
NLPModels.grad! — Method

grad!(f, t, g) evaluates the first derivative of the LineModel
ϕ(t) := f(x + td),

i.e.,
ϕ'(t) = ∇f(x + td)ᵀd.

The gradient ∇f(x + td) is stored in g.
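A minimal sketch mirroring the armijo_goldstein example below: with f(x) = x₁² + 4x₂², x = (1, 1) and d = -(1, 1), we have ϕ'(0) = ∇f(x)ᵀd = -(2 + 8) = -10.

using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
ϕ = LineModel(nlp, nlp.meta.x0, -ones(2))
g = similar(nlp.meta.x0)
grad!(ϕ, 0.0, g)  # returns ϕ'(0) = -10.0; g now holds ∇f(x) = [2.0, 8.0]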
NLPModels.grad — Method

grad(f, t) evaluates the first derivative of the LineModel
ϕ(t) := f(x + td),

i.e.,
ϕ'(t) = ∇f(x + td)ᵀd.

NLPModels.hess — Method

Evaluate the second derivative of the LineModel
ϕ(t) := f(x + td),

i.e.,
ϕ"(t) = dᵀ∇²f(x + td)d.

NLPModels.obj — Method

obj(f, t) evaluates the objective of the LineModel
ϕ(t) := f(x + td).

NLPModels.objgrad! — Method

objgrad!(f, t, g) evaluates the objective and first derivative of the LineModel
ϕ(t) := f(x + td),

and
ϕ'(t) = ∇f(x + td)ᵀd.

The gradient ∇f(x + td) is stored in g.
NLPModels.objgrad — Method

objgrad(f, t) evaluates the objective and first derivative of the LineModel
ϕ(t) := f(x + td),

and
ϕ'(t) = ∇f(x + td)ᵀd.

SolverTools.acceptable — Method

acceptable(tr)
Return true if a step is acceptable, i.e., if the ratio is greater than or equal to tr.acceptance_threshold.
SolverTools.active! — Method

active!(indices, x, ℓ, u; rtol = 1e-8, atol = 1e-8)

Update a BitVector of the active bounds at x, using tolerance min(rtol * (uᵢ - ℓᵢ), atol). If ℓᵢ or uᵢ is not finite, only atol is used.
SolverTools.active — Method

active(x, ℓ, u; rtol = 1e-8, atol = 1e-8)

Computes the active bounds at x, using tolerance min(rtol * (uᵢ - ℓᵢ), atol). If ℓᵢ or uᵢ is not finite, only atol is used.
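A small sketch, assuming active returns the indices at which a bound is active; here x₁ sits at its lower bound and x₃ at its upper bound.

using SolverTools
x = [0.0, 0.5, 1.0]
ℓ = zeros(3)
u = ones(3)
active(x, ℓ, u)  # expected: indices 1 and 3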
SolverTools.aredpred! — Method

ared, pred = aredpred!(tr, nlp, f, f_trial, Δm, x_trial, step, slope)
ared, pred = aredpred!(tr, nls, Fx, f, f_trial, Δm, x_trial, step, slope)

Compute the actual and predicted reductions Δf and Δm, where ared = Δf = f_trial - f is the actual reduction in an objective/merit/penalty function, and pred = Δm = m_trial - m is the reduction predicted by the model m of f. We assume that m is being minimized, and therefore that Δm < 0.
If tr.good_grad is true, then the gradient of nlp at x_trial is stored in tr.gt. For AbstractNLSModel, the argument Fx stores the residual if the gradient is updated.
SolverTools.aredpred_common — Method

ared, pred, good_grad = aredpred_common(nlp, f, f_trial, Δm, x_trial, step, g_trial, slope)
ared, pred, good_grad = aredpred_common(nlp, Fx, f, f_trial, Δm, x_trial, step, g_trial, slope)

Compute the actual and predicted reductions Δf and Δm, where ared = Δf = f_trial - f is the actual reduction in an objective/merit/penalty function, and pred = Δm = m_trial - m is the reduction predicted by the model m of f. We assume that m is being minimized, and therefore that Δm < 0.
good_grad is true if the gradient of nlp at x_trial has been updated and stored in g_trial. For AbstractNLSModel, the argument Fx stores the residual if the gradient is updated.
SolverTools.armijo_condition — Method

armijo_condition(h₀::T, ht::T, τ₀::T, t::T, slope::T)
Returns true if the Armijo condition is satisfied for τ₀.
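Presumably this is the plain Armijo test stated under armijo_goldstein below, up to possible floating-point safeguards in the implementation: ht ≤ h₀ + τ₀ * t * slope.

using SolverTools
h₀, ht, τ₀, t, slope = 1.0, 0.9, 1e-4, 1.0, -10.0
armijo_condition(h₀, ht, τ₀, t, slope)  # true, since 0.9 ≤ 1.0 - 1e-3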
SolverTools.armijo_goldstein — Method

t, ht, nbk, nbG = armijo_goldstein(h, h₀, slope)

Perform a line search from x along the direction d as defined by the LineModel h(t) = f(x + t d), where h₀ = h(0) = f(x) and slope = h'(0) = ∇f(x)ᵀd. The steplength is chosen to satisfy the Armijo-Goldstein conditions. The Armijo condition is
\[h(t) ≤ h₀ + τ₀ t h'(0)\]
and the Goldstein condition is
\[h(t) ≥ h₀ + τ₁ t h'(0),\]

where 0 < τ₀ < τ₁ < 1.
Arguments
- h::LineModel{T, S, M}: 1-D model along the search direction d, $h(t) = f(x + t d)$;
- h₀::T: value of h at t = 0;
- slope: dot product of the gradient and the search direction, $∇f(x)ᵀd$.
Keyword arguments
- t::T = one(T): initial steplength (default: 1);
- τ₀::T = T(eps(T)^(1/4)): slope factor in the Armijo condition. It should satisfy 0 < τ₀ < τ₁ < 1;
- τ₁::T = min(T(1) - eps(T), T(0.9999)): slope factor in the Goldstein condition. It should satisfy 0 < τ₀ < τ₁ < 1;
- γ₀::T = T(1 / 2): backtracking steplength multiplicative factor (0 < γ₀ < 1);
- γ₁::T = T(2): look-ahead steplength multiplicative factor (γ₁ > 1);
- bk_max: maximum number of backtracks (default: 10);
- bG_max: maximum number of increases (default: 10);
- verbose: whether to print information (default: false).
Outputs
- t::T: the step length;
- ht::T: the model value at t, i.e., f(x + t * d);
- nbk::Int: the number of times the steplength was decreased to satisfy the Armijo condition, i.e., the number of backtracks;
- nbG::Int: the number of times the steplength was increased to satisfy the Goldstein condition.
References
This implementation follows the description given in
C. Cartis, P. R. Sampaio, Ph. L. Toint,
Worst-case evaluation complexity of non-monotone gradient-related algorithms for unconstrained optimization.
Optimization 64(5), 1349–1361 (2015).
DOI: 10.1080/02331934.2013.869809

The method initializes an interval [t_low, t_up] guaranteed to contain a point satisfying both the Armijo and Goldstein conditions, and then uses a bisection algorithm to find such a point. The method is implemented with M = 0 (see reference), i.e., the Armijo and Goldstein conditions are satisfied only for the current value of the objective h₀.
Examples
using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
lm = LineModel(nlp, nlp.meta.x0, -ones(2))
t, ft, nbk, nbG = armijo_goldstein(lm, obj(lm, 0.0), grad(lm, 0.0))

SolverTools.armijo_wolfe — Method

t, good_grad, ht, nbk, nbW = armijo_wolfe(h, h₀, slope, g)

Performs a line search from x along the direction d as defined by the LineModel $h(t) = f(x + t d)$, where h₀ = h(0) = f(x), slope = h'(0) = ∇f(x)ᵀd and g is a vector that will be overwritten with the gradient at various points. On exit, if good_grad = true, g contains the gradient at the final step length. The steplength is chosen trying to satisfy the Armijo and Wolfe conditions. The Armijo condition is
\[h(t) ≤ h₀ + τ₀ t h'(0)\]
and the Wolfe condition is
\[h'(t) ≤ τ₁ h'(0).\]
Initially the step is increased trying to satisfy the Wolfe condition. Afterwards, only backtracking is performed in order to try to satisfy the Armijo condition. The final steplength may only satisfy Armijo's condition.
The output is the following:
- t: the step length;
- good_grad: whether g is the gradient at x + t * d;
- ht: the model value at t, i.e., f(x + t * d);
- nbk: the number of times the steplength was decreased to satisfy the Armijo condition, i.e., the number of backtracks;
- nbW: the number of times the steplength was increased to satisfy the Wolfe condition.
The following keyword arguments can be provided:
- t: starting steplength (default 1);
- τ₀: slope factor in the Armijo condition (default max(1e-4, √ϵₘ));
- τ₁: slope factor in the Wolfe condition. It should satisfy τ₁ > τ₀ (default 0.9999);
- bk_max: maximum number of backtracks (default 10);
- bW_max: maximum number of increases (default 5);
- verbose: whether to print information (default false).
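A usage sketch analogous to the armijo_goldstein example above; g is overwritten with gradients evaluated during the search.

using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
lm = LineModel(nlp, nlp.meta.x0, -ones(2))
g = similar(nlp.meta.x0)
t, good_grad, ht, nbk, nbW = armijo_wolfe(lm, obj(lm, 0.0), grad(lm, 0.0), g)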
SolverTools.breakpoints — Method

nbrk, brkmin, brkmax = breakpoints(x, d, ℓ, u)

Find the smallest and largest values of α such that x + αd lies on the boundary. x is assumed to be feasible. nbrk is the number of breakpoints from x in the direction d.
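A small sketch: moving from x along d, coordinate 2 reaches its upper bound at α = 0.5 and coordinate 1 at α = 1.0.

using SolverTools
x = [0.0, 0.5]
d = [1.0, 1.0]
ℓ = zeros(2)
u = ones(2)
nbrk, brkmin, brkmax = breakpoints(x, d, ℓ, u)  # expected: 2, 0.5, 1.0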
SolverTools.compute_As_slope_qs! — Method

slope, qs = compute_As_slope_qs!(As, A, s, Fx)

Compute slope = dot(As, Fx) and qs = dot(As, As) / 2 + slope. Use As to store A * s.
SolverTools.compute_Hs_slope_qs! — Method

slope, qs = compute_Hs_slope_qs!(Hs, H, s, g)

Computes
Hs = H * s
slope = dot(g,s)
qs = ½ sᵀHs + slope
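A worked sketch with a diagonal H: slope = g⋅s = 2 - 8 = -6 and qs = sᵀHs/2 + slope = 5 - 6 = -1.

using SolverTools
H = [2.0 0.0; 0.0 8.0]
s = [1.0, -1.0]
g = [2.0, 8.0]
Hs = similar(s)
slope, qs = compute_Hs_slope_qs!(Hs, H, s, g)  # expected: -6.0, -1.0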
SolverTools.compute_r — Method

compute_r(nlp, f, Δf, Δq, slope, d, xnext, gnext, robust)

Compute the actual vs predicted reduction ratio Δf/Δq.

Arguments:
- nlp: current model we are trying to solve;
- f: current objective value;
- Δf := f - f_trial is the actual reduction in an objective/merit/penalty function;
- Δq := q - q_trial is the reduction predicted by the model q of f;
- slope: current slope;
- d: potential next direction;
- xnext: potential next iterate;
- gnext: current gradient value; if good_grad is true, then this value has been updated;
- robust: if true, try to trap potential cancellation errors.
Output:
- r: the reduction ratio Δf/Δq;
- good_grad: true if gnext has been recomputed;
- gnext: the gradient.
We assume that q is being minimized, and therefore that Δq > 0.
SolverTools.goldstein_condition — Method

goldstein_condition(h₀::T, ht::T, τ₁::T, t::T, slope::T)
Returns true if the Goldstein condition is satisfied for τ₁.
SolverTools.hess! — Method

Evaluate the second derivative of the LineModel
ϕ(t) := f(x + td),

i.e.,
ϕ"(t) = dᵀ∇²f(x + td)d.

SolverTools.project! — Method

project!(y, x, ℓ, u)

Projects x into bounds ℓ and u, in the sense of yᵢ = max(ℓᵢ, min(xᵢ, uᵢ)).
SolverTools.project_step! — Method

project_step!(y, x, d, ℓ, u, α = 1.0)

Computes the projected direction y = P(x + α * d) - x.
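A small sketch of both operations on the box [ℓ, u] = [0, 1]³, following the formulas above.

using SolverTools
ℓ = zeros(3)
u = ones(3)
y = similar(ℓ)
project!(y, [-1.0, 0.5, 2.0], ℓ, u)          # y == [0.0, 0.5, 1.0]
x = fill(0.5, 3)
project_step!(y, x, [1.0, -2.0, 0.0], ℓ, u)  # y == P(x + d) - x == [0.5, -0.5, 0.0]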
SolverTools.redirect! — Method

redirect!(ϕ, x, d)
Change the values of x and d of the LineModel ϕ, but retain the counters.
SolverTools.update! — Function

update!(tr, step_norm)
Update the trust-region radius based on the ratio tr.ratio of actual vs. predicted reduction and the step norm.