Reference
Contents
Index
SolverTools.ARTrustRegion
SolverTools.AbstractTrustRegion
SolverTools.LineModel
SolverTools.TRONTrustRegion
SolverTools.TrustRegion
SolverTools.TrustRegionException
LinearOperators.reset!
NLPModels.grad
NLPModels.grad!
NLPModels.hess
NLPModels.obj
NLPModels.objgrad
NLPModels.objgrad!
SolverTools.acceptable
SolverTools.active
SolverTools.active!
SolverTools.aredpred!
SolverTools.aredpred_common
SolverTools.armijo_condition
SolverTools.armijo_goldstein
SolverTools.armijo_wolfe
SolverTools.breakpoints
SolverTools.compute_As_slope_qs!
SolverTools.compute_Hs_slope_qs!
SolverTools.compute_r
SolverTools.goldstein_condition
SolverTools.hess!
SolverTools.project!
SolverTools.project_step!
SolverTools.redirect!
SolverTools.update!
SolverTools.ARTrustRegion
— Type
ARTrustRegion(α₀::T; kwargs...)
Selects the main parameters used in the TRARC algorithm, with α₀ as the initial TR/ARC parameter. The keyword arguments are:
- max_α::T: maximum value for α. Default: T(1) / sqrt(eps(T)).
- acceptance_threshold::T: ratio over which the step is successful. Default: T(0.1).
- increase_threshold::T: ratio over which we increase α. Default: T(0.75).
- reduce_threshold::T: ratio under which we decrease α. Default: T(0.1).
- increase_factor::T: factor of increase of α. Default: T(5.0).
- decrease_factor::T: factor of decrease of α. Default: T(0.1).
- max_unsuccinarow::Int: limit on the number of successive unsuccessful iterations. Default: 30.
Returns an ARTrustRegion structure. Compare with TrustRegion and TRONTrustRegion.
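As an illustration, a minimal usage sketch (assuming Float64 data; the keyword value is arbitrary):
using SolverTools
α₀ = 1.0
tr = ARTrustRegion(α₀; acceptance_threshold = 0.2)  # overrides the T(0.1) default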
SolverTools.AbstractTrustRegion
— Type
AbstractTrustRegion
An abstract trust region type so that specific trust regions define update rules differently. Child types must have at least the following real fields:
- acceptance_threshold
- initial_radius
- radius
- ratio
and the following function:
update!(tr, step_norm)
SolverTools.LineModel
— Type
A type to represent the restriction of a function to a direction. Given f : Rⁿ → R, x ∈ Rⁿ and a nonzero direction d ∈ Rⁿ,
ϕ = LineModel(nlp, x, d)
represents the function ϕ : R → R defined by
ϕ(t) := f(x + td).
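For instance, mirroring the setup of the armijo_goldstein example below, a LineModel can be built and evaluated as follows (a sketch assuming ADNLPModels is available):
using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
ϕ = LineModel(nlp, nlp.meta.x0, -ones(2))  # restriction of f to the line x + t*d
obj(ϕ, 0.5)   # ϕ(0.5) = f(x + 0.5d)
grad(ϕ, 0.5)  # ϕ'(0.5) = ∇f(x + 0.5d)ᵀd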
SolverTools.TRONTrustRegion
— Type
TRONTrustRegion{T, V} <: AbstractTrustRegion{T, V}
Trust region used by TRON that contains the following fields:
- initial_radius::T: initial radius;
- radius::T: current radius;
- max_radius::T: upper bound on the radius (default 1 / sqrt(eps(T)));
- acceptance_threshold::T: decrease radius if ratio is below this threshold, between 0 and 1 (default 1e-4);
- decrease_threshold::T: ... between 0 and 1 (default 0.25);
- increase_threshold::T: increase radius if ratio is beyond this threshold, between 0 and 1 (default 0.75);
- large_decrease_factor::T: decrease factor, between 0 and 1 (default 0.25);
- small_decrease_factor::T: decrease factor, between 0 and 1 (default 0.5);
- increase_factor::T: increase factor, greater than one (default 4);
- ratio::T: current ratio ared / pred;
- quad_min::T: ...;
- gt::V: pre-allocated memory vector to store the gradient of the objective function;
- good_grad::Bool: true if gt is the gradient of the objective function at the trial point.
The following constructors are available:
TRONTrustRegion(gt::V, initial_radius::T; kwargs...)
If gt is not known, it is possible to use the following constructors:
TRONTrustRegion(::Type{V}, n::Int, Δ₀::T; kwargs...)
TRONTrustRegion(n::Int, Δ₀::T; kwargs...)
which allocate a vector of size n and type V or Vector{T}.
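As a sketch, the allocating form might be used as follows (values are illustrative):
using SolverTools
tr = TRONTrustRegion(2, 1.0)  # allocates gt as a Vector{Float64} of size 2
tr.radius                     # current radius, initially 1.0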
SolverTools.TrustRegion
— Type
TrustRegion{T, V} <: AbstractTrustRegion{T, V}
Basic trust region type that contains the following fields:
- initial_radius::T: initial radius;
- radius::T: current radius;
- max_radius::T: upper bound on the radius (default 1 / sqrt(eps(T)));
- acceptance_threshold::T: decrease radius if ratio is below this threshold, between 0 and 1 (default 1e-4);
- increase_threshold::T: increase radius if ratio is beyond this threshold, between 0 and 1 (default 0.95);
- decrease_factor::T: decrease factor, between 0 and 1 (default 1 / 3);
- increase_factor::T: increase factor, greater than one (default 3 / 2);
- ratio::T: current ratio ared / pred;
- gt::V: pre-allocated memory vector to store the gradient of the objective function;
- good_grad::Bool: true if gt is the gradient of the objective function at the trial point.
The following constructors are available:
TrustRegion(gt::V, initial_radius::T; kwargs...)
If gt is not known, it is possible to use the following constructors:
TrustRegion(::Type{V}, n::Int, Δ₀::T; kwargs...)
TrustRegion(n::Int, Δ₀::T; kwargs...)
which allocate a vector of size n and type V or Vector{T}.
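A minimal sketch of the allocating constructors (assuming Float64 and Float32 data, respectively):
using SolverTools
tr = TrustRegion(2, 1.0)                       # gt is a Vector{Float64} of size 2
tr32 = TrustRegion(Vector{Float32}, 2, 1.0f0)  # explicit storage type V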
SolverTools.TrustRegionException
— Type
Exception type raised in case of error.
LinearOperators.reset!
— Method
reset!(tr)
Reset the trust-region radius to its initial value.
NLPModels.grad!
— Method
grad!(f, t, g)
evaluates the first derivative of the LineModel
ϕ(t) := f(x + td),
i.e.,
ϕ'(t) = ∇f(x + td)ᵀd.
The gradient ∇f(x + td) is stored in g.
NLPModels.grad
— Method
grad(f, t)
evaluates the first derivative of the LineModel
ϕ(t) := f(x + td),
i.e.,
ϕ'(t) = ∇f(x + td)ᵀd.
NLPModels.hess
— Method
Evaluate the second derivative of the LineModel
ϕ(t) := f(x + td),
i.e.,
ϕ"(t) = dᵀ∇²f(x + td)d.
NLPModels.obj
— Method
obj(f, t)
evaluates the objective of the LineModel
ϕ(t) := f(x + td).
NLPModels.objgrad!
— Method
objgrad!(f, t, g)
evaluates the objective and first derivative of the LineModel
ϕ(t) := f(x + td),
and
ϕ'(t) = ∇f(x + td)ᵀd.
The gradient ∇f(x + td) is stored in g.
NLPModels.objgrad
— Method
objgrad(f, t)
evaluates the objective and first derivative of the LineModel
ϕ(t) := f(x + td),
and
ϕ'(t) = ∇f(x + td)ᵀd.
SolverTools.acceptable
— Method
acceptable(tr)
Return true if a step is acceptable, i.e., if the ratio is larger than or equal to tr.acceptance_threshold.
SolverTools.active!
— Method
active!(indices, x, ℓ, u; rtol = 1e-8, atol = 1e-8)
Update a BitVector of the active bounds at x, using tolerance min(rtol * (uᵢ - ℓᵢ), atol). If ℓᵢ or uᵢ is not finite, only atol is used.
SolverTools.active
— Method
active(x, ℓ, u; rtol = 1e-8, atol = 1e-8)
Computes the active bounds at x, using tolerance min(rtol * (uᵢ - ℓᵢ), atol). If ℓᵢ or uᵢ is not finite, only atol is used.
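For example, a sketch with simple bounds (the call follows the signature above; the form of the returned collection of active indices is an assumption here):
using SolverTools
x = [0.0, 0.5, 1.0]
ℓ = zeros(3)
u = ones(3)
active(x, ℓ, u)  # bounds are active at indices 1 and 3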
SolverTools.aredpred!
— Method
ared, pred = aredpred!(tr, nlp, f, f_trial, Δm, x_trial, step, slope)
ared, pred = aredpred!(tr, nls, Fx, f, f_trial, Δm, x_trial, step, slope)
Compute the actual and predicted reductions ∆f and Δm, where ared = Δf = f_trial - f is the actual reduction in an objective/merit/penalty function and pred = Δm = m_trial - m is the reduction predicted by the model m of f. We assume that m is being minimized, and therefore that Δm < 0.
If tr.good_grad is true, then the gradient of nlp at x_trial is stored in tr.gt. For AbstractNLSModel, the argument Fx stores the residual if the gradient is updated.
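A hedged usage sketch following the first signature above; the scaled steepest-descent step and the first-order model choice Δm = slope are illustrative, not prescribed by this API:
using SolverTools, ADNLPModels, NLPModels, LinearAlgebra
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
x = nlp.meta.x0
g = grad(nlp, x)
step = -0.1 * g                # scaled steepest-descent trial step (illustrative)
slope = dot(g, step)           # negative for a descent direction
Δm = slope                     # predicted reduction of a first-order model, Δm < 0
x_trial = x + step
tr = TrustRegion(2, 1.0)
ared, pred = aredpred!(tr, nlp, obj(nlp, x), obj(nlp, x_trial), Δm, x_trial, step, slope)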
SolverTools.aredpred_common
— Method
ared, pred, good_grad = aredpred_common(nlp, f, f_trial, Δm, x_trial, step, g_trial, slope)
ared, pred, good_grad = aredpred_common(nlp, Fx, f, f_trial, Δm, x_trial, step, g_trial, slope)
Compute the actual and predicted reductions ∆f and Δm, where ared = Δf = f_trial - f is the actual reduction in an objective/merit/penalty function and pred = Δm = m_trial - m is the reduction predicted by the model m of f. We assume that m is being minimized, and therefore that Δm < 0.
good_grad is true if the gradient of nlp at x_trial has been updated and stored in g_trial. For AbstractNLSModel, the argument Fx stores the residual if the gradient is updated.
SolverTools.armijo_condition
— Method
armijo_condition(h₀::T, ht::T, τ₀::T, t::T, slope::T)
Returns true if the Armijo condition is satisfied for τ₀.
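For instance, a sketch assuming the predicate evaluates the inequality h(t) ≤ h₀ + τ₀ t h'(0) documented under armijo_goldstein below:
using SolverTools
armijo_condition(1.0, 0.9, 1e-4, 1.0, -1.0)  # expected true: 0.9 ≤ 1.0 + 1e-4 * 1.0 * (-1.0)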
SolverTools.armijo_goldstein
— Method
t, ht, nbk, nbG = armijo_goldstein(h, h₀, slope)
Perform a line search from x along the direction d as defined by the LineModel h(t) = f(x + t d), where h₀ = h(0) = f(x) and slope = h'(0) = ∇f(x)ᵀd. The steplength is chosen to satisfy the Armijo-Goldstein conditions. The Armijo condition is
\[h(t) ≤ h₀ + τ₀ t h'(0)\]
and the Goldstein condition is
\[h(t) ≥ h₀ + τ₁ t h'(0),\]
with 0 < τ₀ < τ₁ < 1.
Arguments
- h::LineModel{T, S, M}: 1-D model along the search direction d, $h(t) = f(x + t d)$;
- h₀::T: value of h at t = 0;
- slope: dot product of the gradient and the search direction, $∇f(x)ᵀd$.
Keyword arguments
- t::T = one(T): initial steplength (default: 1);
- τ₀::T = T(eps(T)^(1/4)): slope factor in the Armijo condition. It should satisfy 0 < τ₀ < τ₁ < 1;
- τ₁::T = min(T(1) - eps(T), T(0.9999)): slope factor in the Goldstein condition. It should satisfy 0 < τ₀ < τ₁ < 1;
- γ₀::T = T(1 / 2): backtracking step length multiplicative factor (0 < γ₀ < 1);
- γ₁::T = T(2): look-ahead step length multiplicative factor (γ₁ > 1);
- bk_max: maximum number of backtracks (default: 10);
- bG_max: maximum number of increases (default: 10);
- verbose: whether to print information (default: false).
Outputs
- t::T: the step length;
- ht::T: the model value at t, i.e., f(x + t * d);
- nbk::Int: the number of times the steplength was decreased to satisfy the Armijo condition, i.e., the number of backtracks;
- nbG::Int: the number of times the steplength was increased to satisfy the Goldstein condition.
References
This implementation follows the description given in
C. Cartis, P. R. Sampaio, Ph. L. Toint,
Worst-case evaluation complexity of non-monotone gradient-related algorithms for unconstrained optimization.
Optimization 64(5), 1349–1361 (2015).
DOI: 10.1080/02331934.2013.869809
The method initializes an interval [t_low, t_up] guaranteed to contain a point satisfying both the Armijo and Goldstein conditions, then uses a bisection algorithm to find such a point. The method is implemented with M = 0 (see reference), i.e., the Armijo and Goldstein conditions are satisfied only for the current value of the objective h₀.
Examples
using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
lm = LineModel(nlp, nlp.meta.x0, -ones(2))
t, ft, nbk, nbG = armijo_goldstein(lm, obj(lm, 0.0), grad(lm, 0.0))
SolverTools.armijo_wolfe
— Method
t, good_grad, ht, nbk, nbW = armijo_wolfe(h, h₀, slope, g)
Perform a line search from x along the direction d as defined by the LineModel $h(t) = f(x + t d)$, where h₀ = h(0) = f(x), slope = h'(0) = ∇f(x)ᵀd, and g is a vector that will be overwritten with the gradient at various points. On exit, if good_grad = true, g contains the gradient at the final step length. The steplength is chosen to try to satisfy the Armijo and Wolfe conditions. The Armijo condition is
\[h(t) ≤ h₀ + τ₀ t h'(0)\]
and the Wolfe condition is
\[h'(t) ≤ τ₁ h'(0).\]
Initially the step is increased trying to satisfy the Wolfe condition. Afterwards, only backtracking is performed in order to try to satisfy the Armijo condition. The final steplength may only satisfy Armijo's condition.
The output is the following:
- t: the step length;
- good_grad: whether g is the gradient at x + t * d;
- ht: the model value at t, i.e., f(x + t * d);
- nbk: the number of times the steplength was decreased to satisfy the Armijo condition, i.e., the number of backtracks;
- nbW: the number of times the steplength was increased to satisfy the Wolfe condition.
The following keyword arguments can be provided:
- t: starting steplength (default 1);
- τ₀: slope factor in the Armijo condition (default max(1e-4, √ϵₘ));
- τ₁: slope factor in the Wolfe condition. It should satisfy τ₁ > τ₀ (default 0.9999);
- bk_max: maximum number of backtracks (default 10);
- bW_max: maximum number of increases (default 5);
- verbose: whether to print information (default false).
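Analogous to the armijo_goldstein example above, a usage sketch:
using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
lm = LineModel(nlp, nlp.meta.x0, -ones(2))
g = zeros(2)  # overwritten with gradients during the search
t, good_grad, ht, nbk, nbW = armijo_wolfe(lm, obj(lm, 0.0), grad(lm, 0.0), g)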
SolverTools.breakpoints
— Method
nbrk, brkmin, brkmax = breakpoints(x, d, ℓ, u)
Find the smallest and largest values of α such that x + αd lies on the boundary. x is assumed to be feasible. nbrk is the number of breakpoints from x in the direction d.
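A small sketch with box constraints:
using SolverTools
x = [0.5, 0.5]   # feasible point
d = [1.0, -1.0]
ℓ = zeros(2)
u = ones(2)
nbrk, brkmin, brkmax = breakpoints(x, d, ℓ, u)  # both coordinates reach a bound at α = 0.5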
SolverTools.compute_As_slope_qs!
— Method
slope, qs = compute_As_slope_qs!(As, A, s, Fx)
Compute slope = dot(As, Fx) and qs = dot(As, As) / 2 + slope. As is used to store A * s.
SolverTools.compute_Hs_slope_qs!
— Method
slope, qs = compute_Hs_slope_qs!(Hs, H, s, g)
Computes
Hs = H * s
slope = dot(g, s)
qs = ¹/₂sᵀHs + slope
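A minimal sketch with a small dense Hessian, chosen so the results are easy to check by hand:
using SolverTools
H = [2.0 0.0; 0.0 8.0]
s = [1.0, 1.0]
g = [-1.0, -2.0]
Hs = similar(s)  # overwritten with H * s
slope, qs = compute_Hs_slope_qs!(Hs, H, s, g)  # slope = -3.0, qs = 10/2 - 3 = 2.0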
SolverTools.compute_r
— Method
compute_r(nlp, f, Δf, Δq, slope, d, xnext, gnext, robust)
Compute the actual vs. predicted reduction ratio ∆f/Δq.
Arguments:
- nlp: current model we are trying to solve;
- f: current objective value;
- Δf := f - f_trial is the actual reduction in an objective/merit/penalty function;
- Δq := q - q_trial is the reduction predicted by the model q of f;
- slope: current slope;
- d: potential next direction;
- xnext: potential next iterate;
- gnext: current gradient value; if good_grad is true, then this value has been updated;
- robust: if true, try to trap potential cancellation errors.
Output:
- r: reduction ratio ∆f/Δq;
- good_grad: true if gnext has been recomputed;
- gnext: gradient.
We assume that q is being minimized, and therefore that Δq > 0.
SolverTools.goldstein_condition
— Method
goldstein_condition(h₀::T, ht::T, τ₁::T, t::T, slope::T)
Returns true if the Goldstein condition is satisfied for τ₁.
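As with armijo_condition, a sketch assuming the predicate evaluates the inequality h(t) ≥ h₀ + τ₁ t h'(0) documented under armijo_goldstein above:
using SolverTools
goldstein_condition(1.0, 0.95, 0.9999, 1.0, -1.0)  # expected true: 0.95 ≥ 1.0 + 0.9999 * 1.0 * (-1.0)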
SolverTools.hess!
— Method
Evaluate the second derivative of the LineModel
ϕ(t) := f(x + td),
i.e.,
ϕ"(t) = dᵀ∇²f(x + td)d.
SolverTools.project!
— Method
project!(y, x, ℓ, u)
Projects x into bounds ℓ and u, in the sense of yᵢ = max(ℓᵢ, min(xᵢ, uᵢ)).
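For instance:
using SolverTools
y = zeros(2)
project!(y, [-0.5, 1.5], zeros(2), ones(2))  # y becomes [0.0, 1.0]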
SolverTools.project_step!
— Method
project_step!(y, x, d, ℓ, u, α = 1.0)
Computes the projected direction y = P(x + α * d) - x.
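Continuing the previous sketch:
using SolverTools
y = zeros(2)
project_step!(y, [0.5, 0.5], [1.0, -1.0], zeros(2), ones(2))  # y = P(x + d) - x = [0.5, -0.5]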
SolverTools.redirect!
— Method
redirect!(ϕ, x, d)
Change the values of x and d of the LineModel ϕ, but retain the counters.
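A sketch reusing the LineModel from the earlier examples:
using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
ϕ = LineModel(nlp, nlp.meta.x0, -ones(2))
redirect!(ϕ, zeros(2), [1.0, 0.0])  # new point and direction, counters preserved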
SolverTools.update!
— Function
update!(tr, step_norm)
Update the trust-region radius based on the ratio tr.ratio of actual vs. predicted reduction and the step norm.
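As a sketch with the basic TrustRegion; setting the ratio by hand is for illustration only (in a solver it is updated by aredpred!) and assumes the field is mutable:
using SolverTools
tr = TrustRegion(2, 1.0)
tr.ratio = 0.99          # above increase_threshold = 0.95 (manual, for illustration)
update!(tr, tr.radius)   # the radius is expected to increase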