Tutorial
NLPModels.jl was created for two purposes:
Allow users to access problem databases in a unified way.
Mainly, this means CUTEst.jl, but it also gives access to AMPL problems, as well as JuMP-defined problems (e.g. as in OptimizationProblems.jl).
Allow users to create their own problems in the same way.
As a consequence, optimization methods designed according to the NLPModels API will accept NLPModels of any provenance. See, for instance, Optimize.jl.
The main interfaces for user defined problems are
ADNLPModel, which defines a model easily, using automatic differentiation.
SimpleNLPModel, which allows users to handle all functions themselves, giving full control over the model's functions.
ADNLPModel Tutorial
ADNLPModel is simple to use and is useful for classrooms. It only needs the objective function $f$ and a starting point $x^0$ to be well-defined. For constrained problems, you'll also need the constraints function $c$, and the constraints vectors $c_L$ and $c_U$, such that $c_L \leq c(x) \leq c_U$. Equality constraints will be automatically identified as those indices $i$ for which $c_{L_i} = c_{U_i}$.
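For example, to impose a single equality constraint $x_1 + x_2 = 1$, set the corresponding entries of $c_L$ and $c_U$ both to 1. A minimal sketch, where the objective and starting point are placeholders chosen only for illustration:

using NLPModels
# 1.0 ≤ x[1] + x[2] ≤ 1.0, so this constraint is identified as an equality
nlp_eq = ADNLPModel(x -> x[1]^2 + x[2]^2, [0.5; 0.5],
                    c=x -> [x[1] + x[2]], lcon=[1.0], ucon=[1.0])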
Let's define the famous Rosenbrock function

$f(x) = (x_1 - 1)^2 + 100(x_2 - x_1^2)^2,$

with starting point $x^0 = (-1.2,1.0)$.
using NLPModels
nlp = ADNLPModel(x->(x[1] - 1.0)^2 + 100*(x[2] - x[1]^2)^2 , [-1.2; 1.0])
NLPModels.ADNLPModel(Minimization problem Generic
nvar = 2, ncon = 0 (0 linear)
,NLPModels.Counters(0,0,0,0,0,0,0,0,0,0,0),ex-adnlp.#1,NLPModels.#32)
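Much of that summary reflects the model's meta field, which you can also query directly, for instance:

println("nvar = $(nlp.meta.nvar), ncon = $(nlp.meta.ncon)")
println("x0 = $(nlp.meta.x0)")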
This is enough to define the model. Let's get the objective function value at $x^0$, using only nlp.
fx = obj(nlp, nlp.meta.x0)
println("fx = $fx")
fx = 24.199999999999996
Done. Let's try the gradient and Hessian.
gx = grad(nlp, nlp.meta.x0)
Hx = hess(nlp, nlp.meta.x0)
println("gx = $gx")
println("Hx = $Hx")
gx = [-215.6,-88.0]
Hx = [1330.0 0.0; 480.0 200.0]
Notice that only the lower triangle of the Hessian is stored, and that it is dense. This is a current limitation of this model: it doesn't return sparse matrices, so use it with care.
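If you only need Hessian-vector products rather than the full matrix, the API also exposes hprod. A brief sketch, assuming hprod(nlp, x, v) is implemented for ADNLPModel in your version, that checks it against the dense Hessian rebuilt from the stored lower triangle:

x = nlp.meta.x0
v = ones(2)
Hx = hess(nlp, x)
Hfull = Hx + triu(Hx', 1)                    # full symmetric Hessian from the lower triangle
println(norm(hprod(nlp, x, v) - Hfull * v))  # expected to be essentially zero (hprod availability assumed)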
Let's do something a little more complex here: define a function that tries to solve this problem with the steepest descent method and an Armijo line search. Namely, the method
1. Given $x^0$, $\varepsilon > 0$, and $\eta \in (0,1)$, set $k = 0$;
2. If $\Vert \nabla f(x^k) \Vert < \varepsilon$, STOP with $x^* = x^k$;
3. Compute $d^k = -\nabla f(x^k)$;
4. Compute $\alpha_k \in (0,1]$ such that $f(x^k + \alpha_k d^k) < f(x^k) + \alpha_k\eta \nabla f(x^k)^T d^k$;
5. Define $x^{k+1} = x^k + \alpha_k d^k$;
6. Update $k = k + 1$ and go to step 2.
function steepest(nlp; itmax=100000, eta=1e-4, eps=1e-6, sigma=0.66)
  x = nlp.meta.x0
  fx = obj(nlp, x)
  ∇fx = grad(nlp, x)
  slope = dot(∇fx, ∇fx)   # = ‖∇f(x)‖², and also -∇f(x)ᵀd for d = -∇f(x)
  ∇f_norm = sqrt(slope)
  iter = 0
  while ∇f_norm > eps && iter < itmax
    t = 1.0
    x_trial = x - t * ∇fx
    f_trial = obj(nlp, x_trial)
    # Armijo backtracking: shrink t until the sufficient decrease condition holds
    while f_trial > fx - eta * t * slope
      t *= sigma
      x_trial = x - t * ∇fx
      f_trial = obj(nlp, x_trial)
    end
    x = x_trial
    fx = f_trial
    ∇fx = grad(nlp, x)
    slope = dot(∇fx, ∇fx)
    ∇f_norm = sqrt(slope)
    iter += 1
  end
  optimal = ∇f_norm <= eps
  return x, fx, ∇f_norm, optimal, iter
end
x, fx, ngx, optimal, iter = steepest(nlp)
println("x = $x")
println("fx = $fx")
println("ngx = $ngx")
println("optimal = $optimal")
println("iter = $iter")
x = [1.0,1.0]
fx = 4.2438440239813445e-13
ngx = 9.984661274466946e-7
optimal = true
Maybe this code is too complicated? If you're in a classroom, you may just want to show a Newton step.
g(x) = grad(nlp, x)
H(x) = hess(nlp, x) + triu(hess(nlp, x)', 1)  # full symmetric Hessian from the stored lower triangle
x = nlp.meta.x0
d = -H(x)\g(x)
2-element Array{Float64,1}:
0.0247191
0.380674
or a few steps
for i = 1:5
x = x - H(x)\g(x)
println("x = $x")
end
x = [-1.17528,1.38067]
x = [0.763115,-3.17503]
x = [0.76343,0.582825]
x = [0.999995,0.944027]
x = [0.999996,0.999991]
Also, notice how we can reuse the method.
f(x) = (x[1]^2 + x[2]^2 - 4)^2 + (x[1]*x[2] - 1)^2
x0 = [2.0; 1.0]
nlp = ADNLPModel(f, x0)
x, fx, ngx, optimal, iter = steepest(nlp)
([1.93185,0.517638],1.2842480268876117e-14,9.402834701778815e-7,true,120)
Even using a different model.
using OptimizationProblems # Defines a lot of JuMP models
nlp = MathProgNLPModel(woods())
x, fx, ngx, optimal, iter = steepest(nlp)
println("fx = $fx")
println("ngx = $ngx")
println("optimal = $optimal")
println("iter = $iter")
fx = 1.0000000000002167
ngx = 9.893253859340887e-7
optimal = true
iter = 12016
For constrained minimization, you also need the constraints function and its bounds. Bounds on the variables can be passed through additional vectors.
f(x) = (x[1] - 1.0)^2 + 100*(x[2] - x[1]^2)^2
x0 = [-1.2; 1.0]
lvar = [-Inf; 0.1]
uvar = [0.5; 0.5]
c(x) = [x[1] + x[2] - 2; x[1]^2 + x[2]^2]
lcon = [0.0; -Inf]
ucon = [Inf; 1.0]
nlp = ADNLPModel(f, x0, c=c, lvar=lvar, uvar=uvar, lcon=lcon, ucon=ucon)
println("cx = $(cons(nlp, nlp.meta.x0))")
println("Jx = $(jac(nlp, nlp.meta.x0))")
cx = [-2.2,2.44]
Jx = [1.0 1.0; -2.4 2.0]
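Since the bounds are stored in nlp.meta (lvar, uvar, lcon, ucon), you can, for instance, measure by hand how much a point violates them. A small sketch, assuming those standard meta fields; note that the starting point need not be feasible:

x = nlp.meta.x0
cx = cons(nlp, x)
# elementwise violations: positive entries indicate a violated bound or constraint
bviol = max.(nlp.meta.lvar - x, x - nlp.meta.uvar, 0.0)
cviol = max.(nlp.meta.lcon - cx, cx - nlp.meta.ucon, 0.0)
println("max bound violation      = $(maximum(bviol))")
println("max constraint violation = $(maximum(cviol))")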
SimpleNLPModel Tutorial
SimpleNLPModel allows you to pass every single function of the model. On the other hand, it doesn't handle anything else. Calling an undefined function will throw a NotImplementedError. Only the objective function is mandatory (if you don't need it, pass x->0).
using NLPModels
f(x) = (x[1] - 1.0)^2 + 4*(x[2] - 1.0)^2
x0 = zeros(2)
nlp = SimpleNLPModel(f, x0)
fx = obj(nlp, nlp.meta.x0)
println("fx = $fx")
# grad(nlp, nlp.meta.x0) # This is undefined
fx = 5.0
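You can see this for yourself by wrapping the call in a try block; a small sketch (the exact exception shown may depend on the NLPModels version):

try
  grad(nlp, nlp.meta.x0)
catch e
  println(typeof(e))  # expected to be the NotImplementedError mentioned above
end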
g(x) = [2*(x[1] - 1.0); 8*(x[2] - 1.0)]
nlp = SimpleNLPModel(f, x0, g=g)
grad(nlp, nlp.meta.x0)
2-element Array{Float64,1}:
-2.0
-8.0
"But what's to stop me from defining g
however I want?" Nothing. So you have to be careful on how you're defining it. You should probably check your derivatives. If the function is simply defined, you can try using automatic differentiation. Alternatively, you can use the derivative checker.
gradient_check(nlp)
Dict{Int64,Float64} with 0 entries
gwrong(x) = [2*(x[1] - 1.0); 8*x[2] - 1.0] # Find the error
nlp = SimpleNLPModel(f, x0, g=gwrong)
gradient_check(nlp)
Dict{Int64,Float64} with 1 entry:
2 => 7.0
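As mentioned above, automatic differentiation is another way to check a hand-coded gradient. A small sketch that simply builds an ADNLPModel from the same f and compares the gradients at x0:

ad_nlp = ADNLPModel(f, x0)
println(norm(g(x0) - grad(ad_nlp, x0)))       # should be essentially zero
println(norm(gwrong(x0) - grad(ad_nlp, x0)))  # exposes the mistake in gwrong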
For constrained problems, we still need the constraints function, lcon and ucon. Also, let's pass the Jacobian-vector product.
c(x) = [x[1]^2 + x[2]^2; x[1]*x[2] - 1]
lcon = [1.0; 0.0]
ucon = [4.0; 0.0]
Jacprod(x, v) = [2*x[1]*v[1] + 2*x[2]*v[2]; x[2]*v[1] + x[1]*v[2]]
nlp = SimpleNLPModel(f, x0, c=c, lcon=lcon, ucon=ucon, g=g, Jp=Jacprod)
jprod(nlp, ones(2), ones(2))
2-element Array{Float64,1}:
4.0
2.0
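Hand-coded products are just as easy to get wrong as gradients, so a quick sanity check is worthwhile. A sketch using a crude finite difference of c, where the point, direction, and step size are arbitrary choices for illustration:

xt = [1.0; 2.0]
v  = [0.5; -1.0]
h  = 1e-8
fd = (c(xt + h*v) - c(xt)) / h      # finite-difference approximation of J(xt) * v
println(norm(Jacprod(xt, v) - fd))  # should be small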
Furthermore, NLPModels also works with in-place operations. Since some models, like ADNLPModel, do not take full advantage of this, you might want to define your own functions that do.
f(x) = (x[1] - 1.0)^2 + 4*(x[2] - 1.0)^2
x0 = zeros(2)
g!(x, gx) = begin
gx[1] = 2*(x[1] - 1.0)
gx[2] = 8*(x[2] - 1.0)
return gx
end
nlp = SimpleNLPModel(f, x0, g! =g!) # Watch out: g!=g! is interpreted as g != g!
gx = zeros(2)
grad!(nlp, nlp.meta.x0, gx)
2-element Array{Float64,1}:
-2.0
-8.0
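To illustrate why the in-place API is worth having, here is a sketch of a gradient loop that reuses a preallocated buffer with grad! instead of allocating a new gradient at every iteration. It uses a fixed step length just to keep the sketch short, and the names are illustrative:

function fixed_step_descent(nlp; itmax=100000, eps=1e-6, step=1e-3)
  x  = copy(nlp.meta.x0)
  gx = zeros(nlp.meta.nvar)   # gradient buffer, allocated once
  grad!(nlp, x, gx)
  iter = 0
  while norm(gx) > eps && iter < itmax
    x -= step * gx            # fixed step length (no line search in this sketch)
    grad!(nlp, x, gx)         # overwrites the buffer in place
    iter += 1
  end
  return x, obj(nlp, x), norm(gx), iter
end

x, fx, ngx, iter = fixed_step_descent(nlp)
println("fx = $fx, ngx = $ngx, iter = $iter")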