Reference
Contents
Index
SolverTools.ARTrustRegion
SolverTools.AbstractTrustRegion
SolverTools.LineModel
SolverTools.TRONTrustRegion
SolverTools.TrustRegion
SolverTools.TrustRegionException
LinearOperators.reset!
NLPModels.grad
NLPModels.grad!
NLPModels.hess
NLPModels.obj
NLPModels.objgrad
NLPModels.objgrad!
SolverTools.acceptable
SolverTools.active
SolverTools.active!
SolverTools.aredpred!
SolverTools.aredpred_common
SolverTools.armijo_condition
SolverTools.armijo_goldstein
SolverTools.armijo_wolfe
SolverTools.breakpoints
SolverTools.compute_As_slope_qs!
SolverTools.compute_Hs_slope_qs!
SolverTools.compute_r
SolverTools.goldstein_condition
SolverTools.hess!
SolverTools.project!
SolverTools.project_step!
SolverTools.redirect!
SolverTools.update!
SolverTools.ARTrustRegion
— Type
ARTrustRegion(α₀::T; kwargs...)
Selects the main parameters used in the TRARC algorithm, with α₀ as the initial TR/ARC parameter. The keyword arguments are:
- max_α::T: maximum value for α. Default: T(1) / sqrt(eps(T)).
- acceptance_threshold::T: ratio over which the step is successful. Default: T(0.1).
- increase_threshold::T: ratio over which we increase α. Default: T(0.75).
- reduce_threshold::T: ratio under which we decrease α. Default: T(0.1).
- increase_factor::T: factor of increase of α. Default: T(5.0).
- decrease_factor::T: factor of decrease of α. Default: T(0.1).
- max_unsuccinarow::Int: limit on the number of successive unsuccessful iterations. Default: 30.
Returns an ARTrustRegion structure. Compare with TrustRegion and TRONTrustRegion.
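As an illustration, a minimal usage sketch (assuming Float64 data; the keyword value is arbitrary):
using SolverTools
α₀ = 1.0
tr = ARTrustRegion(α₀; acceptance_threshold = 0.2)  # overrides the T(0.1) default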
SolverTools.AbstractTrustRegion
— Type
AbstractTrustRegion
An abstract trust region type so that specific trust regions define update rules differently. Child types must have at least the following real fields:
- acceptance_threshold
- initial_radius
- radius
- ratio
and the following function:
update!(tr, step_norm)
SolverTools.LineModel
— Type
A type to represent the restriction of a function to a direction. Given f : Rⁿ → R, x ∈ Rⁿ and a nonzero direction d ∈ Rⁿ,
ϕ = LineModel(nlp, x, d)
represents the function ϕ : R → R defined by
ϕ(t) := f(x + td).
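For instance, mirroring the setup of the armijo_goldstein example below, a LineModel can be built and evaluated as follows (a sketch assuming ADNLPModels is available):
using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
ϕ = LineModel(nlp, nlp.meta.x0, -ones(2))  # restriction of f to the line x + t*d
obj(ϕ, 0.5)   # ϕ(0.5) = f(x + 0.5d)
grad(ϕ, 0.5)  # ϕ'(0.5) = ∇f(x + 0.5d)ᵀd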
SolverTools.TRONTrustRegion
— Type
TRONTrustRegion{T, V} <: AbstractTrustRegion{T, V}
Trust region used by TRON that contains the following fields:
- initial_radius::T: initial radius;
- radius::T: current radius;
- max_radius::T: upper bound on the radius (default 1 / sqrt(eps(T)));
- acceptance_threshold::T: decrease radius if ratio is below this threshold, between 0 and 1 (default 1e-4);
- decrease_threshold::T: ... between 0 and 1 (default 0.25);
- increase_threshold::T: increase radius if ratio is beyond this threshold, between 0 and 1 (default 0.75);
- large_decrease_factor::T: decrease factor, between 0 and 1 (default 0.25);
- small_decrease_factor::T: decrease factor, between 0 and 1 (default 0.5);
- increase_factor::T: increase factor, greater than one (default 4);
- ratio::T: current ratio ared / pred;
- quad_min::T: ...;
- gt::V: pre-allocated memory vector to store the gradient of the objective function;
- good_grad::Bool: true if gt is the gradient of the objective function at the trial point.
The following constructors are available:
TRONTrustRegion(gt::V, initial_radius::T; kwargs...)
If gt is not known, it is possible to use the following constructors:
TRONTrustRegion(::Type{V}, n::Int, Δ₀::T; kwargs...)
TRONTrustRegion(n::Int, Δ₀::T; kwargs...)
which allocate a vector of size n and type V or Vector{T}.
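As a sketch, the allocating form might be used as follows (values are illustrative):
using SolverTools
tr = TRONTrustRegion(2, 1.0)  # allocates gt as a Vector{Float64} of size 2
tr.radius                     # current radius, initially 1.0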
SolverTools.TrustRegion
— Type
TrustRegion{T, V} <: AbstractTrustRegion{T, V}
Basic trust region type that contains the following fields:
- initial_radius::T: initial radius;
- radius::T: current radius;
- max_radius::T: upper bound on the radius (default 1 / sqrt(eps(T)));
- acceptance_threshold::T: decrease radius if ratio is below this threshold, between 0 and 1 (default 1e-4);
- increase_threshold::T: increase radius if ratio is beyond this threshold, between 0 and 1 (default 0.95);
- decrease_factor::T: decrease factor, between 0 and 1 (default 1 / 3);
- increase_factor::T: increase factor, greater than one (default 3 / 2);
- ratio::T: current ratio ared / pred;
- gt::V: pre-allocated memory vector to store the gradient of the objective function;
- good_grad::Bool: true if gt is the gradient of the objective function at the trial point.
The following constructors are available:
TrustRegion(gt::V, initial_radius::T; kwargs...)
If gt is not known, it is possible to use the following constructors:
TrustRegion(::Type{V}, n::Int, Δ₀::T; kwargs...)
TrustRegion(n::Int, Δ₀::T; kwargs...)
which allocate a vector of size n and type V or Vector{T}.
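A minimal sketch of the allocating constructors (assuming Float64 and Float32 data, respectively):
using SolverTools
tr = TrustRegion(2, 1.0)                       # gt is a Vector{Float64} of size 2
tr32 = TrustRegion(Vector{Float32}, 2, 1.0f0)  # explicit storage type V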
SolverTools.TrustRegionException
— Type
Exception type raised in case of error.
LinearOperators.reset!
— Method
reset!(tr)
Reset the trust-region radius to its initial value.
NLPModels.grad!
— Method
grad!(f, t, g)
evaluates the first derivative of the LineModel
ϕ(t) := f(x + td),
i.e.,
ϕ'(t) = ∇f(x + td)ᵀd.
The gradient ∇f(x + td) is stored in g.
NLPModels.grad
— Method
grad(f, t)
evaluates the first derivative of the LineModel
ϕ(t) := f(x + td),
i.e.,
ϕ'(t) = ∇f(x + td)ᵀd.
NLPModels.hess
— Method
Evaluate the second derivative of the LineModel
ϕ(t) := f(x + td),
i.e.,
ϕ"(t) = dᵀ∇²f(x + td)d.
NLPModels.obj
— Method
obj(f, t)
evaluates the objective of the LineModel
ϕ(t) := f(x + td).
NLPModels.objgrad!
— Method
objgrad!(f, t, g)
evaluates the objective and first derivative of the LineModel
ϕ(t) := f(x + td),
and
ϕ'(t) = ∇f(x + td)ᵀd.
The gradient ∇f(x + td) is stored in g.
NLPModels.objgrad
— Method
objgrad(f, t)
evaluates the objective and first derivative of the LineModel
ϕ(t) := f(x + td),
and
ϕ'(t) = ∇f(x + td)ᵀd.
SolverTools.acceptable
— Method
acceptable(tr)
Return true if a step is acceptable, i.e., if the ratio is larger than or equal to tr.acceptance_threshold.
SolverTools.active!
— Method
active!(indices, x, ℓ, u; rtol = 1e-8, atol = 1e-8)
Update a BitVector of the active bounds at x, using tolerance min(rtol * (uᵢ - ℓᵢ), atol). If ℓᵢ or uᵢ is not finite, only atol is used.
SolverTools.active
— Method
active(x, ℓ, u; rtol = 1e-8, atol = 1e-8)
Computes the active bounds at x, using tolerance min(rtol * (uᵢ - ℓᵢ), atol). If ℓᵢ or uᵢ is not finite, only atol is used.
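For example, a sketch with simple bounds (the call follows the signature above; the form of the returned collection of active indices is an assumption here):
using SolverTools
x = [0.0, 0.5, 1.0]
ℓ = zeros(3)
u = ones(3)
active(x, ℓ, u)  # bounds are active at indices 1 and 3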
SolverTools.aredpred!
— Method
ared, pred = aredpred!(tr, nlp, f, f_trial, Δm, x_trial, step, slope)
ared, pred = aredpred!(tr, nls, Fx, f, f_trial, Δm, x_trial, step, slope)
Compute the actual and predicted reductions ∆f and Δm, where ared = Δf = f_trial - f is the actual reduction in an objective/merit/penalty function and pred = Δm = m_trial - m is the reduction predicted by the model m of f. We assume that m is being minimized, and therefore that Δm < 0.
If tr.good_grad is true, then the gradient of nlp at x_trial is stored in tr.gt. For AbstractNLSModel, the argument Fx stores the residual if the gradient is updated.
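A hedged usage sketch following the first signature above; the scaled steepest-descent step and the first-order model choice Δm = slope are illustrative, not prescribed by this API:
using SolverTools, ADNLPModels, NLPModels, LinearAlgebra
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
x = nlp.meta.x0
g = grad(nlp, x)
step = -0.1 * g                # scaled steepest-descent trial step (illustrative)
slope = dot(g, step)           # negative for a descent direction
Δm = slope                     # predicted reduction of a first-order model, Δm < 0
x_trial = x + step
tr = TrustRegion(2, 1.0)
ared, pred = aredpred!(tr, nlp, obj(nlp, x), obj(nlp, x_trial), Δm, x_trial, step, slope)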
SolverTools.aredpred_common
— Method
ared, pred, good_grad = aredpred_common(nlp, f, f_trial, Δm, x_trial, step, g_trial, slope)
ared, pred, good_grad = aredpred_common(nlp, Fx, f, f_trial, Δm, x_trial, step, g_trial, slope)
Compute the actual and predicted reductions ∆f and Δm, where ared = Δf = f_trial - f is the actual reduction in an objective/merit/penalty function and pred = Δm = m_trial - m is the reduction predicted by the model m of f. We assume that m is being minimized, and therefore that Δm < 0.
good_grad is true if the gradient of nlp at x_trial has been updated and stored in g_trial. For AbstractNLSModel, the argument Fx stores the residual if the gradient is updated.
SolverTools.armijo_condition
— Method
armijo_condition(h₀::T, ht::T, τ₀::T, t::T, slope::T)
Returns true if the Armijo condition is satisfied for τ₀.
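For instance, a sketch assuming the predicate evaluates the inequality h(t) ≤ h₀ + τ₀ t h'(0) documented under armijo_goldstein below:
using SolverTools
armijo_condition(1.0, 0.9, 1e-4, 1.0, -1.0)  # expected true: 0.9 ≤ 1.0 + 1e-4 * 1.0 * (-1.0)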
SolverTools.armijo_goldstein
— Method
t, ht, nbk, nbG = armijo_goldstein(h, h₀, slope)
Perform a line search from x along the direction d as defined by the LineModel h(t) = f(x + t d), where h₀ = h(0) = f(x) and slope = h'(0) = ∇f(x)ᵀd. The steplength is chosen to satisfy the Armijo-Goldstein conditions. The Armijo condition is
\[h(t) ≤ h₀ + τ₀ t h'(0)\]
and the Goldstein condition is
\[h(t) ≥ h₀ + τ₁ t h'(0),\]
with 0 < τ₀ < τ₁ < 1.
Arguments
- h::LineModel{T, S, M}: 1-D model along the search direction d, $h(t) = f(x + t d)$;
- h₀::T: value of h at t = 0;
- slope: dot product of the gradient and the search direction, $∇f(x)ᵀd$.
Keyword arguments
- t::T = one(T): initial steplength (default: 1);
- τ₀::T = T(eps(T)^(1/4)): slope factor in the Armijo condition. It should satisfy 0 < τ₀ < τ₁ < 1;
- τ₁::T = min(T(1) - eps(T), T(0.9999)): slope factor in the Goldstein condition. It should satisfy 0 < τ₀ < τ₁ < 1;
- γ₀::T = T(1 / 2): backtracking step length multiplicative factor (0 < γ₀ < 1);
- γ₁::T = T(2): look-ahead step length multiplicative factor (γ₁ > 1);
- bk_max: maximum number of backtracks (default: 10);
- bG_max: maximum number of increases (default: 10);
- verbose: whether to print information (default: false).
Outputs
- t::T: the step length;
- ht::T: the model value at t, i.e., f(x + t * d);
- nbk::Int: the number of times the steplength was decreased to satisfy the Armijo condition, i.e., the number of backtracks;
- nbG::Int: the number of times the steplength was increased to satisfy the Goldstein condition.
References
This implementation follows the description given in
C. Cartis, P. R. Sampaio, Ph. L. Toint,
Worst-case evaluation complexity of non-monotone gradient-related algorithms for unconstrained optimization.
Optimization 64(5), 1349–1361 (2015).
DOI: 10.1080/02331934.2013.869809
The method initializes an interval [t_low, t_up] guaranteed to contain a point satisfying both the Armijo and Goldstein conditions, then uses a bisection algorithm to find such a point. The method is implemented with M = 0 (see reference), i.e., the Armijo and Goldstein conditions are satisfied only for the current value of the objective h₀.
Examples
using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
lm = LineModel(nlp, nlp.meta.x0, -ones(2))
t, ft, nbk, nbG = armijo_goldstein(lm, obj(lm, 0.0), grad(lm, 0.0))
SolverTools.armijo_wolfe
— Method
t, good_grad, ht, nbk, nbW = armijo_wolfe(h, h₀, slope, g)
Perform a line search from x along the direction d as defined by the LineModel $h(t) = f(x + t d)$, where h₀ = h(0) = f(x), slope = h'(0) = ∇f(x)ᵀd, and g is a vector that will be overwritten with the gradient at various points. On exit, if good_grad = true, g contains the gradient at the final step length. The steplength is chosen to try to satisfy the Armijo and Wolfe conditions. The Armijo condition is
\[h(t) ≤ h₀ + τ₀ t h'(0)\]
and the Wolfe condition is
\[h'(t) ≤ τ₁ h'(0).\]
Initially the step is increased trying to satisfy the Wolfe condition. Afterwards, only backtracking is performed in order to try to satisfy the Armijo condition. The final steplength may only satisfy Armijo's condition.
The output is the following:
- t: the step length;
- good_grad: whether g is the gradient at x + t * d;
- ht: the model value at t, i.e., f(x + t * d);
- nbk: the number of times the steplength was decreased to satisfy the Armijo condition, i.e., the number of backtracks;
- nbW: the number of times the steplength was increased to satisfy the Wolfe condition.
The following keyword arguments can be provided:
- t: starting steplength (default 1);
- τ₀: slope factor in the Armijo condition (default max(1e-4, √ϵₘ));
- τ₁: slope factor in the Wolfe condition. It should satisfy τ₁ > τ₀ (default 0.9999);
- bk_max: maximum number of backtracks (default 10);
- bW_max: maximum number of increases (default 5);
- verbose: whether to print information (default false).
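Analogous to the armijo_goldstein example above, a usage sketch:
using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
lm = LineModel(nlp, nlp.meta.x0, -ones(2))
g = zeros(2)  # overwritten with gradients during the search
t, good_grad, ht, nbk, nbW = armijo_wolfe(lm, obj(lm, 0.0), grad(lm, 0.0), g)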
SolverTools.breakpoints
— Method
nbrk, brkmin, brkmax = breakpoints(x, d, ℓ, u)
Find the smallest and largest values of α such that x + αd lies on the boundary. x is assumed to be feasible. nbrk is the number of breakpoints from x in the direction d.
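A small sketch with box constraints:
using SolverTools
x = [0.5, 0.5]   # feasible point
d = [1.0, -1.0]
ℓ = zeros(2)
u = ones(2)
nbrk, brkmin, brkmax = breakpoints(x, d, ℓ, u)  # both coordinates reach a bound at α = 0.5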
SolverTools.compute_As_slope_qs!
— Method
slope, qs = compute_As_slope_qs!(As, A, s, Fx)
Compute slope = dot(As, Fx) and qs = dot(As, As) / 2 + slope. As is used to store A * s.
SolverTools.compute_Hs_slope_qs!
— Method
slope, qs = compute_Hs_slope_qs!(Hs, H, s, g)
Computes
Hs = H * s
slope = dot(g, s)
qs = ¹/₂sᵀHs + slope
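A minimal sketch with a small dense Hessian, chosen so the results are easy to check by hand:
using SolverTools
H = [2.0 0.0; 0.0 8.0]
s = [1.0, 1.0]
g = [-1.0, -2.0]
Hs = similar(s)  # overwritten with H * s
slope, qs = compute_Hs_slope_qs!(Hs, H, s, g)  # slope = -3.0, qs = 10/2 - 3 = 2.0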
SolverTools.compute_r
— Method
compute_r(nlp, f, Δf, Δq, slope, d, xnext, gnext, robust)
Compute the actual vs. predicted reduction ratio ∆f/Δq.
Arguments:
- nlp: current model we are trying to solve;
- f: current objective value;
- Δf := f - f_trial is the actual reduction in an objective/merit/penalty function;
- Δq := q - q_trial is the reduction predicted by the model q of f;
- slope: current slope;
- d: potential next direction;
- xnext: potential next iterate;
- gnext: current gradient value; if good_grad is true, then this value has been updated;
- robust: if true, try to trap potential cancellation errors.
Output:
- r: reduction ratio ∆f/Δq;
- good_grad: true if gnext has been recomputed;
- gnext: gradient.
We assume that q is being minimized, and therefore that Δq > 0.
SolverTools.goldstein_condition
— Method
goldstein_condition(h₀::T, ht::T, τ₁::T, t::T, slope::T)
Returns true if the Goldstein condition is satisfied for τ₁.
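As with armijo_condition, a sketch assuming the predicate evaluates the inequality h(t) ≥ h₀ + τ₁ t h'(0) documented under armijo_goldstein above:
using SolverTools
goldstein_condition(1.0, 0.95, 0.9999, 1.0, -1.0)  # expected true: 0.95 ≥ 1.0 + 0.9999 * 1.0 * (-1.0)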
SolverTools.hess!
— Method
Evaluate the second derivative of the LineModel
ϕ(t) := f(x + td),
i.e.,
ϕ"(t) = dᵀ∇²f(x + td)d.
SolverTools.project!
— Method
project!(y, x, ℓ, u)
Projects x into bounds ℓ and u, in the sense of yᵢ = max(ℓᵢ, min(xᵢ, uᵢ)).
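For instance:
using SolverTools
y = zeros(2)
project!(y, [-0.5, 1.5], zeros(2), ones(2))  # y becomes [0.0, 1.0]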
SolverTools.project_step!
— Method
project_step!(y, x, d, ℓ, u, α = 1.0)
Computes the projected direction y = P(x + α * d) - x.
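Continuing the previous sketch:
using SolverTools
y = zeros(2)
project_step!(y, [0.5, 0.5], [1.0, -1.0], zeros(2), ones(2))  # y = P(x + d) - x = [0.5, -0.5]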
SolverTools.redirect!
— Method
redirect!(ϕ, x, d)
Change the values of x and d of the LineModel ϕ, but retain the counters.
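A sketch reusing the LineModel from the earlier examples:
using SolverTools, ADNLPModels
nlp = ADNLPModel(x -> x[1]^2 + 4 * x[2]^2, ones(2))
ϕ = LineModel(nlp, nlp.meta.x0, -ones(2))
redirect!(ϕ, zeros(2), [1.0, 0.0])  # new point and direction, counters preserved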
SolverTools.update!
— Function
update!(tr, step_norm)
Update the trust-region radius based on the ratio tr.ratio of actual vs. predicted reduction and the step norm.
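As a sketch with the basic TrustRegion; setting the ratio by hand is for illustration only (in a solver it is updated by aredpred!) and assumes the field is mutable:
using SolverTools
tr = TrustRegion(2, 1.0)
tr.ratio = 0.99          # above increase_threshold = 0.95 (manual, for illustration)
update!(tr, tr.radius)   # the radius is expected to increase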