Tutorial

NLPModels.jl was created for two purposes:

  1. Allow users to access existing problem collections easily. Mainly, this means CUTEst.jl, but it also gives access to AMPL problems, as well as JuMP defined problems (e.g. as in OptimizationProblems.jl).
  2. Allow users to create their own problems with the same API. As a consequence, optimization methods designed according to the NLPModels API will accept NLPModels of any provenance. See, for instance, Optimize.jl.

The main interfaces for user defined problems are ADNLPModel, which defines a model easily using automatic differentiation, and SimpleNLPModel, which lets the user provide every function explicitly. Both are covered in the tutorials below.

ADNLPModel Tutorial

ADNLPModel is simple to use and is useful for classrooms. It only needs the objective function $f$ and a starting point $x^0$ to be well-defined. For constrained problems, you'll also need the constraints function $c$, and the constraints vectors $c_L$ and $c_U$, such that $c_L \leq c(x) \leq c_U$. Equality constraints will be automatically identified as those indices $i$ for which $c_{L_i} = c_{U_i}$.

Let's define the famous Rosenbrock function

\[f(x) = (x_1 - 1)^2 + 100(x_2 - x_1^2)^2,\]

with starting point $x^0 = (-1.2,1.0)$.

using NLPModels

nlp = ADNLPModel(x->(x[1] - 1.0)^2 + 100*(x[2] - x[1]^2)^2 , [-1.2; 1.0])
ADNLPModel(Minimization problem Generic
nvar = 2, ncon = 0 (0 linear)
, Counters(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), getfield(Main.ex-adnlp, Symbol("##1#2"))(), getfield(NLPModels, Symbol("##117#120"))())

This is enough to define the model. Let's get the objective function value at $x^0$, using only nlp.

fx = obj(nlp, nlp.meta.x0)
println("fx = $fx")
fx = 24.199999999999996

Done. Let's try the gradient and Hessian.

gx = grad(nlp, nlp.meta.x0)
Hx = hess(nlp, nlp.meta.x0)
println("gx = $gx")
println("Hx = $Hx")
gx = [-215.6, -88.0]
Hx = [1330.0 0.0; 480.0 200.0]

Notice that only the lower triangle of the Hessian is stored, and that the matrix is dense. This is a current limitation of this model: it doesn't return sparse matrices, so use it with care.
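
Since only the lower triangle is returned, you can wrap the result in a Symmetric view whenever a full matrix is needed. A minimal sketch (full_hess is just an illustrative helper, not part of the API):

using LinearAlgebra

# Illustrative helper: view the lower triangle returned by hess as a full symmetric matrix
full_hess(nlp, x) = Symmetric(hess(nlp, x), :L)

full_hess(nlp, nlp.meta.x0)  # expected to match [1330.0 480.0; 480.0 200.0]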

Let's do something a little more complex here: define a function that tries to solve this problem using the steepest descent method with an Armijo line search. Namely, the method

  1. Given $x^0$, $\varepsilon > 0$, and $\eta \in (0,1)$. Set $k = 0$;
  2. If $\Vert \nabla f(x^k) \Vert < \varepsilon$ STOP with $x^* = x^k$;
  3. Compute $d^k = -\nabla f(x^k)$;
  4. Compute $\alpha_k \in (0,1]$ such that $f(x^k + \alpha_kd^k) < f(x^k) + \alpha_k\eta \nabla f(x^k)^Td^k$;
  5. Define $x^{k+1} = x^k + \alpha_kd^k$;
  6. Update $k = k + 1$ and go to step 2.

using LinearAlgebra

function steepest(nlp; itmax=100000, eta=1e-4, eps=1e-6, sigma=0.66)
  x = nlp.meta.x0
  fx = obj(nlp, x)
  ∇fx = grad(nlp, x)
  slope = dot(∇fx, ∇fx)
  ∇f_norm = sqrt(slope)
  iter = 0
  while ∇f_norm > eps && iter < itmax
    t = 1.0
    x_trial = x - t * ∇fx
    f_trial = obj(nlp, x_trial)
    while f_trial > fx - eta * t * slope
      t *= sigma
      x_trial = x - t * ∇fx
      f_trial = obj(nlp, x_trial)
    end
    x = x_trial
    fx = f_trial
    ∇fx = grad(nlp, x)
    slope = dot(∇fx, ∇fx)
    ∇f_norm = sqrt(slope)
    iter += 1
  end
  optimal = ∇f_norm <= eps
  return x, fx, ∇f_norm, optimal, iter
end

x, fx, ngx, optimal, iter = steepest(nlp)
println("x = $x")
println("fx = $fx")
println("ngx = $ngx")
println("optimal = $optimal")
println("iter = $iter")
x = [1.0, 1.0]
fx = 4.2438440239813445e-13
ngx = 9.984661274466946e-7
optimal = true
iter = 17962

Maybe this code is too complicated? If you're in a class, you may just want to show a Newton step.

g(x) = grad(nlp, x)
H(x) = Symmetric(hess(nlp, x), :L)
x = nlp.meta.x0
d = -H(x)\g(x)
2-element Array{Float64,1}:
 0.02471910112359557
 0.3806741573033705

or a few

for i = 1:5
  global x
  x = x - H(x)\g(x)
  println("x = $x")
end
x = [-1.17528, 1.38067]
x = [0.763115, -3.17503]
x = [0.76343, 0.582825]
x = [0.999995, 0.944027]
x = [0.999996, 0.999991]

Also, notice how we can reuse the method.

f(x) = (x[1]^2 + x[2]^2 - 5)^2 + (x[1]*x[2] - 2)^2
x0 = [3.0; 2.0]
nlp = ADNLPModel(f, x0)

x, fx, ngx, optimal, iter = steepest(nlp)
([2.0, 1.0], 3.911490500207018e-14, 8.979870927068038e-7, true, 153)

We can even use a different model. In this case, a MathProgNLPModel from NLPModelsJuMP, built from a JuMP problem defined in OptimizationProblems.

using NLPModelsJuMP, OptimizationProblems

nlp = MathProgNLPModel(woods())
x, fx, ngx, optimal, iter = steepest(nlp)
println("fx = $fx")
println("ngx = $ngx")
println("optimal = $optimal")
println("iter = $iter")
fx = 1.0000000000002167
ngx = 9.893253859340887e-7
optimal = true
iter = 12016

For constrained minimization, you also need the constraints function and the vectors of bounds on the constraints. Bounds on the variables can be passed through additional vectors, as below.

f(x) = (x[1] - 1.0)^2 + 100*(x[2] - x[1]^2)^2
x0 = [-1.2; 1.0]
lvar = [-Inf; 0.1]
uvar = [0.5; 0.5]
c(x) = [x[1] + x[2] - 2; x[1]^2 + x[2]^2]
lcon = [0.0; -Inf]
ucon = [Inf; 1.0]
nlp = ADNLPModel(f, x0, c=c, lvar=lvar, uvar=uvar, lcon=lcon, ucon=ucon)

println("cx = $(cons(nlp, nlp.meta.x0))")
println("Jx = $(jac(nlp, nlp.meta.x0))")
cx = [-2.2, 2.44]
Jx = [1.0 1.0; -2.4 2.0]
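
As mentioned above, equality and inequality constraints are identified automatically from lcon and ucon. Assuming your NLPModels version exposes the meta fields jfix, jlow and jupp (an assumption in this quick check), they hold the indices of the equality, lower-bounded and upper-bounded constraints:

println("equality constraints:      $(nlp.meta.jfix)")
println("lower-bounded constraints: $(nlp.meta.jlow)")
println("upper-bounded constraints: $(nlp.meta.jupp)")

For this problem, the first constraint is only bounded below and the second only above, so jfix should be empty.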

SimpleNLPModel Tutorial

SimpleNLPModel allows you to pass every function of the model yourself. On the other hand, it doesn't compute anything for you: calling a function you didn't provide throws a NotImplementedError. Only the objective function is mandatory (if you don't need it, pass x->0).

using NLPModels

f(x) = (x[1] - 1.0)^2 + 4*(x[2] - 1.0)^2
x0 = zeros(2)
nlp = SimpleNLPModel(f, x0)

fx = obj(nlp, nlp.meta.x0)
println("fx = $fx")

# grad(nlp, nlp.meta.x0) # This is undefined
fx = 5.0
g(x) = [2*(x[1] - 1.0); 8*(x[2] - 1.0)]
nlp = SimpleNLPModel(f, x0, g=g)

grad(nlp, nlp.meta.x0)
2-element Array{Float64,1}:
 -2.0
 -8.0

"But what's to stop me from defining g however I want?" Nothing, so you have to be careful about how you define it. You should probably check your derivatives. If the function is simple enough, you can try automatic differentiation instead. Alternatively, you can use the Derivative Checker.

gradient_check(nlp)
Dict{Int64,Float64} with 0 entries
gwrong(x) = [2*(x[1] - 1.0); 8*x[2] - 1.0] # Find the error
nlp = SimpleNLPModel(f, x0, g=gwrong)
gradient_check(nlp)
Dict{Int64,Float64} with 1 entry:
  2 => 7.0

For constrained problems, we still need the constraints function and the vectors lcon and ucon. Also, let's pass the Jacobian-vector product.

c(x) = [x[1]^2 + x[2]^2; x[1]*x[2] - 1]
lcon = [1.0; 0.0]
ucon = [4.0; 0.0]
Jacprod(x, v) = [2*x[1]*v[1] + 2*x[2]*v[2]; x[2]*v[1] + x[1]*v[2]]
nlp = SimpleNLPModel(f, x0, c=c, lcon=lcon, ucon=ucon, g=g, Jp=Jacprod)
jprod(nlp, ones(2), ones(2))
2-element Array{Float64,1}:
 4.0
 2.0

Furthermore, NLPModels also supports in-place operations. Since some models, like ADNLPModel, do not take full advantage of this, a user might want to define their own functions that do.

f(x) = (x[1] - 1.0)^2 + 4*(x[2] - 1.0)^2
x0 = zeros(2)
g!(x, gx) = begin
  gx[1] = 2*(x[1] - 1.0)
  gx[2] = 8*(x[2] - 1.0)
  return gx
end
nlp = SimpleNLPModel(f, x0, g! =g!) # Watch out: g!=g! is interpreted as g != g!
gx = zeros(2)
grad!(nlp, nlp.meta.x0, gx)
2-element Array{Float64,1}:
 -2.0
 -8.0
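
In-place evaluations pay off most inside iterative methods, where the same work vector is reused across iterations. A minimal sketch reusing the nlp just defined (the fixed step length 0.1 is only for illustration):

gx = zeros(2)
x = copy(nlp.meta.x0)
for k = 1:3
  grad!(nlp, x, gx)   # overwrites gx; no new vector is allocated
  x .-= 0.1 .* gx     # in-place gradient step with a fixed step length
end
println("x = $x")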

Nonlinear least squares models

In addition to the general nonlinear model, we can define the residual function for a nonlinear least-squares problem. In other words, the objective function of the problem is of the form $f(x) = \tfrac{1}{2}\|F(x)\|^2$, and we can define the function $F$ and its derivatives.

A simple way to define an NLS problem is with ADNLSModel, which uses automatic differentiation.

F(x) = [x[1] - 1.0; 10 * (x[2] - x[1]^2)]
x0 = [-1.2; 1.0]
nls = ADNLSModel(F, x0, 2) # 2 nonlinear equations
residual(nls, x0)
2-element Array{Float64,1}:
 -2.2
 -4.3999999999999995
jac_residual(nls, x0)
2×2 Array{Float64,2}:
  1.0   0.0
 24.0  10.0
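
The objective of an NLS model is exactly the $\tfrac{1}{2}\|F(x)\|^2$ above. A quick sanity check, assuming obj is available for this model as for any other NLPModel:

Fx = residual(nls, x0)
println(obj(nls, x0) ≈ 0.5 * dot(Fx, Fx))   # expected: true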

We can also define a linear least-squares problem by passing the matrices that define it,

\[\begin{align*} \min \quad & \tfrac{1}{2}\|Ax - b\|^2 \\ & c_L \leq Cx \leq c_U \\ & \ell \leq x \leq u. \end{align*}\]

A = rand(10, 3)
b = rand(10)
C = rand(2, 3)
nls = LLSModel(A, b, C=C, lcon=zeros(2), ucon=zeros(2), lvar=-ones(3), uvar=ones(3))

@info norm(jac_residual(nls, zeros(3)) - A)
@info norm(jac(nls, zeros(3)) - C)
[ Info: 0.0
[ Info: 0.0
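
For an LLSModel, the residual itself is the affine map $Ax - b$. A small check using the model just built (any point x will do):

x = rand(3)
@info norm(residual(nls, x) - (A * x - b))   # expected: ≈ 0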

Another way to define a nonlinear least-squares problem is to use FeasibilityResidual, which takes the constraints of a general nonlinear problem as the residual of the NLS.

nlp = ADNLPModel(x->0, # objective doesn't matter,
                 ones(2), c=x->[x[1] + x[2] - 1; x[1] * x[2] - 2],
                 lcon=zeros(2), ucon=zeros(2))
nls = FeasibilityResidual(nlp)
s = 0.0
for t = 1:100
  global s
  x = rand(2)
  s += norm(residual(nls, x) - cons(nlp, x))
end
@info "s = $s"
[ Info: s = 0.0