A regularized nonlinear least-squares problem
In this tutorial, we show how to model and solve the nonconvex, nonsmooth least-squares problem
\[ \min_{x \in \mathbb{R}^2} \tfrac{1}{2} \sum_{i=1}^m \big(y_i - x_1 e^{x_2 t_i}\big)^2 + \lambda \|x\|_0.\]
This problem models the fitting of an exponential curve $t \mapsto x_1 e^{x_2 t}$ to noisy observations $y_i$ taken at times $t_i$; the $\ell_0$ penalty promotes sparsity in the parameters $x$.
Modelling the problem
We first formulate the objective function as the sum of a smooth function $f$ and a nonsmooth regularizer $h$:
\[ \tfrac{1}{2} \sum_{i=1}^m \big(y_i - x_1 e^{x_2 t_i}\big)^2 + \lambda \|x\|_0 = f(x) + h(x),\]
where
\[\begin{align*} f(x) &:= \tfrac{1}{2} \sum_{i=1}^m \big(y_i - x_1 e^{x_2 t_i}\big)^2,\\ h(x) &:= \lambda\|x\|_0. \end{align*}\]
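The solvers we will use never differentiate $h$; they only require its proximal mapping. For the $\ell_0$ norm with step length $\nu > 0$, this mapping is componentwise hard thresholding,
\[ \operatorname{prox}_{\nu\lambda\|\cdot\|_0}(x)_i = \begin{cases} x_i & \text{if } x_i^2 > 2\lambda\nu,\\ 0 & \text{otherwise}, \end{cases}\]
which is the mapping ProximalOperators.jl implements for NormL0.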
To model $f$, we use ADNLPModels.jl; since $f$ is a sum of squared residuals, we build an ADNLSModel. For the nonsmooth regularizer $h$, we use ProximalOperators.jl. We then wrap the smooth model and the regularizer in a RegularizedNLPModel.
using ADNLPModels
using ProximalOperators
using Random
using RegularizedProblems
Random.seed!(0)
# Generate synthetic nonlinear least-squares data
m = 100
t = range(0, 1, length=m)
a_true, b_true = 2.0, -1.0
y = [a_true * exp(b_true * ti) + 0.1*randn() for ti in t]
# Starting point
x0 = [1.0, 0.0] # [a, b]
# Define nonlinear residuals
function F(x)
a, b = x
return [yi - a*exp(b*ti) for (ti, yi) in zip(t, y)]
end
# Build ADNLSModel
f_model = ADNLSModel(F, x0, m, name = "nonlinear LS model of f")
# Get the regularizer from ProximalOperators
λ = 0.01
h = NormL0(λ)
# Wrap into a RegularizedNLPModel
regularized_pb = RegularizedNLPModel(f_model, h)
Smooth model: ADNLSModel - Nonlinear least-squares model with automatic differentiation backend
Problem name: nonlinear LS model of f (2 free variables, 0 constraints, 100 nonlinear residuals)
Regularizer: ProximalOperators.NormL0{Float64}(0.01)
Selected variables: 1:2
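Before calling a solver, it can be useful to evaluate the two pieces of the objective at the starting point. The sketch below assumes the standard NLPModels obj function and the ProximalOperators prox interface:
using NLPModels
# Smooth part: f(x0) = ½‖F(x0)‖²
println("f(x0) = ", obj(f_model, x0))
# Nonsmooth part: h(x0) = λ × (number of nonzeros in x0)
println("h(x0) = ", h(x0))
# Proximal step on h with unit step length: componentwise hard thresholding
y, hy = prox(h, x0, 1.0)
println("prox(h, x0) = ", y)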
Solving the problem
We can now choose one of the solvers presented here to solve the problem defined above. For least-squares problems, LM or LMTR is usually the most appropriate choice, since both exploit the residual structure through Gauss-Newton-type models of $f$.
using RegularizedOptimization
# LM is a quadratic regularization method.
out = LM(regularized_pb, verbose = 1, atol = 1e-4)
println("LM converged after $(out.iter) iterations.")[ Info: outer inner f(x) h(x) √(ξ1/ν) ρ σ ‖x‖ ‖s‖ ‖J‖² LM
[ Info: 0 25 1.0e+01 1.0e-02 2.6e+01 8.4e-01 7.4e-04 1.0e+00 1.5e+00 1.3e+02 =
[ Info: 1 14 2.0e+00 2.0e-02 1.3e+01 9.9e-01 7.4e-04 2.2e+00 3.0e-01 5.2e+01 ↘
[ Info: 2 15 3.8e-01 2.0e-02 1.5e+00 1.0e+00 2.5e-04 2.2e+00 3.9e-02 7.1e+01 ↘
[ Info: 3 17 3.6e-01 2.0e-02 1.7e-02 1.0e+00 8.2e-05 2.2e+00 7.9e-04 6.8e+01 ↘
[ Info: 4 0 3.6e-01 2.0e-02 2.5e-05 1.0e+00 2.7e-05 2.2e+00 3.7e-07 6.8e+01
[ Info: LM: terminating with √(ξ1/ν) = 2.5352037067312497e-5
LM converged after 4 iterations.
We can visualize the solution with Plots.jl.
using Plots
# Extract estimated parameters
a_est, b_est = out.solution
# True curve
y_true = [a_true * exp(b_true * ti) for ti in t]
# Estimated curve
y_est = [a_est * exp(b_est * ti) for ti in t]
# Plot
scatter(t, y, label="Noisy data", legend=:bottomleft)
plot!(t, y_true, label="True model", lw=2)
plot!(t, y_est, label="Fitted model", lw=2, ls=:dash)
We can choose LMTR instead, which is a trust-region method.
out = LMTR(regularized_pb, verbose = 1, atol = 1e-4)
println("LMTR converged after $(out.iter) iterations.")[ Info: outer inner f(x) h(x) √(ξ1/ν) ρ Δ ‖x‖ ‖s‖ ν LMTR
[ Info: 0 16 1.0e+01 1.0e-02 2.6e+01 9.1e-01 1.0e+00 1.0e+00 1.0e+00 7.9e-03 ↗
[ Info: 1 6 1.5e+00 2.0e-02 1.2e+01 1.0e+00 3.0e+00 1.8e+00 2.2e-01 1.6e-02 ↗
[ Info: 2 18 3.6e-01 2.0e-02 1.7e-01 1.0e+00 3.0e+00 2.0e+00 1.3e-02 1.5e-02 ↗
[ Info: 3 4 3.6e-01 2.0e-02 3.9e-04 1.0e+00 3.0e+00 2.0e+00 2.9e-05 1.5e-02 ↗
[ Info: 4 0 3.6e-01 2.0e-02 8.6e-05 1.0e+00 3.0e+00 2.0e+00 9.8e-07 1.5e-02
[ Info: LMTR: terminating with √(ξ1/ν) = 8.638767447185872e-5
LMTR converged after 4 iterations.
# Extract estimated parameters
a_est, b_est = out.solution
# Estimated curve
y_est = [a_est * exp(b_est * ti) for ti in t]
# Plot
scatter(t, y, label="Noisy data", legend=:bottomleft)
plot!(t, y_true, label="True model", lw=2)
plot!(t, y_est, label="Fitted model", lw=2, ls=:dash)
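Finally, we can compare the recovered parameters with the ground truth and check the size of the residuals. This quick check only uses quantities defined above, plus norm from LinearAlgebra:
using LinearAlgebra
# Compare estimated and true parameters
println("a: true = $a_true, estimated = $a_est")
println("b: true = $b_true, estimated = $b_est")
# Norm of the nonlinear residuals at the LMTR solution
println("‖F(x)‖ = ", norm(F(out.solution)))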