Bi7740: Scientific computing
Optimization: a brief summary

Vlad Popovici, popovici@iba.muni.cz
Institute of Biostatistics and Analyses, Masaryk University, Brno

Book: Venkataraman P., Applied Optimization Using MATLAB, Wiley & Sons, 2002

Outline
  1 Problem setting
  2 Optimization in R
  3 Optimization in Rn
      Unconstrained optimization in Rn
  4 Important classes of optimization problems
      Linear programming
      Quadratic programming
      Constrained nonlinear optimization

Problem setting

- minimization problem: given f : Rn → R and S ⊆ Rn, find x* ∈ S such that f(x*) ≤ f(y), ∀y ∈ S \ {x*}
- x* is called a minimizer (minimum, extremum) of f
- maximization is equivalent to minimizing −f
- f is called the objective function and is assumed, here, to be differentiable with continuous second derivative
- the constraint set S (or feasible region) is defined by a system of equations and/or inequalities
- y ∈ S is called a feasible point
- if S = Rn, the optimization is unconstrained

Optimization problem

    min_x f(x)
    subject to g(x) = 0
               h_k(x) ≤ 0

where f : Rn → R, g : Rn → Rm, h_k : Rn → R. If f, g and the h_k are all linear functions: linear programming.

Some theory

- Rolle's thm: f continuous on [a, b] and differentiable on (a, b) with f(a) = f(b), then ∃c ∈ (a, b) : f′(c) = 0
- Weierstrass' thm: f continuous on a compact set, with values in a subset of R, attains its extrema
- Fermat's thm: for f : (a, b) → R, at a local extremum x0 ∈ (a, b) we have f′(x0) = 0 (x0 is a stationary point). Generalization: ∇f(x0) = 0
- convex function: f″(x) > 0; concave function: f″(x) < 0
- if f′(x0) = 0 and f″(x0) > 0, then x0 is a minimizer
- if f′(x0) = 0 and f″(x0) < 0, then x0 is a maximizer
- if f′(x0) = f″(x0) = 0, the test is inconclusive: x0 may be an inflection point (f(x) = x³ at 0), but it may also be an extremum (f(x) = x⁴ has a minimum at 0)

Set convexity

Formally: a set S is convex if αx1 + (1 − α)x2 ∈ S for all x1, x2 ∈ S and α ∈ [0, 1].

Function convexity

Formally: f is said to be convex on a convex set S if f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2) for all x1, x2 ∈ S and α ∈ [0, 1].
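A quick numerical illustration of this definition (my own addition, not from the slides; the test function and tolerance are arbitrary choices): for a convex f, the chord always lies above the graph.

    % Check the convexity inequality at random points for f(x) = exp(x),
    % which is convex on R since f''(x) = exp(x) > 0.
    f = @(x) exp(x);
    x1 = randn;  x2 = randn;  a = rand;      % a plays the role of alpha
    lhs = f(a*x1 + (1 - a)*x2);              % function value at the mixture
    rhs = a*f(x1) + (1 - a)*f(x2);           % chord value at the mixture
    assert(lhs <= rhs + 1e-12)               % must hold for any x1, x2, a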
Uniqueness of the solution

- any local minimum of a convex function f on a convex set S ⊆ Rn is a global minimum of f on S
- any local minimum of a strictly convex function f on a convex set S ⊆ Rn is the unique global minimum of f on S

Optimality criteria

For x* ∈ S to be an extremum of f : S ⊆ Rn → R:
- first order condition: x* must be a critical point: ∇f(x*) = 0
- second order condition: the Hessian matrix Hf(x*) must be positive or negative definite, where [Hf(x)]_ij = ∂²f(x)/∂x_i∂x_j

If the Hessian is
- positive definite, then x* is a minimum of f
- negative definite, then x* is a maximum of f
- indefinite, then x* is a saddle point of f
- singular, then different degenerate cases are possible...

[Figure: a saddle point; source: Wikipedia]

Optimization in R

Unimodality

Unimodality allows safely discarding parts of the interval without losing the solution (as in the case of interval bisection).

Golden section search

- evaluate the function at 3 points and decide which part of the interval to discard
- ensure that the subdivision proportions stay the same across iterations: c/a = a/b ⇒ b/a = (1 + √5)/2 = 1.618... (the golden ratio)
- convergence is linear, with C ≈ 0.618 (a minimal implementation sketch is given at the end of this section)

Successive parabolic interpolations

- fit a parabola through three points and move to its minimum
- convergence is superlinear, with r ≈ 1.32

Newton's method

From the Taylor series:

    f(x + h) ≈ f(x) + f′(x)h + (f″(x)/2)h²

whose minimum over h is at h = −f′(x)/f″(x). HOMEWORK: prove it!

Iteration scheme:

    x_{k+1} = x_k − f′(x_k)/f″(x_k)

(That is Newton's method for finding a zero of f′(x) = 0.) Quadratic convergence, but it needs to start close to the solution.

Hybrid methods

- idea: combine "slow-but-sure" methods with "fast-but-risky" ones
- most library routines use such an approach
- popular combination: golden section search with successive parabolic interpolation (e.g. Brent's method)

Matlab functions for optimization in R

- first, check optimset for managing the optimization options
- fminbnd: bounded scalar function minimization
- the same options mechanism is used by the multivariate functions as well

Try in Matlab:

    >> opts = optimset('display','iter');   % what's this for?
    >> f = @(x)(1./((x-0.3).^2+0.01) + 1./((x-0.9).^2+0.04) - 6);
    >> [x, fx] = fminbnd(f, 0.2, 1, opts)
    >> g = @(x) (cos(x) - 2*log(x));
    >> [x, gx] = fminbnd(g, 2, 4, opts);    % explain
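As promised above, here is a minimal golden section search sketch (my own illustration; the function name and signature are not from the slides). It repeatedly discards the part of [a, b] that cannot contain the minimum of a unimodal f:

    function xmin = golden_sketch(f, a, b, tol)
    % Minimal golden section search for a unimodal f on [a, b].
    tau = (sqrt(5) - 1)/2;                   % 0.618..., so that tau^2 = 1 - tau
    x1 = b - tau*(b - a);  f1 = f(x1);
    x2 = a + tau*(b - a);  f2 = f(x2);
    while (b - a) > tol
        if f1 < f2                           % minimum lies in [a, x2]
            b = x2;  x2 = x1;  f2 = f1;      % old x1 is reused as new x2
            x1 = b - tau*(b - a);  f1 = f(x1);
        else                                 % minimum lies in [x1, b]
            a = x1;  x1 = x2;  f1 = f2;      % old x2 is reused as new x1
            x2 = a + tau*(b - a);  f2 = f(x2);
        end
    end
    xmin = (a + b)/2;
    end

Only one new function evaluation is needed per iteration, precisely because the proportions of the subdivision are kept constant. For example, golden_sketch(@(x) (x - 2).^2, 0, 5, 1e-8) returns approximately 2.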
Optimization in Rn: unconstrained optimization

Nelder-Mead (simplex) method

- direct search methods simply compare the function values at different points of S
- Nelder-Mead keeps n + 1 points (in Rn) forming a simplex (i.e. a segment in R, a triangle in R², a tetrahedron in R³, etc.)
- along the line from the point with the highest function value through the centroid of the remaining points, a new vertex is selected
- the new vertex replaces the worst previous point
- repeat until convergence
- a useful procedure for non-smooth functions, but expensive for large n (a bare-bones sketch follows the Matlab example below)

Nelder-Mead in Matlab

Use the function fminsearch. Example:

    >> f = @(x) (sin(norm(x,2)^2));
    >> x = fminsearch(f, [.5, .5], opts)
    >> x = fminsearch(f, [.25, .25], opts)
    >> x = fminsearch(f, [1, 1], opts)

[Figure: surface of f(x) = sin(‖x‖²) over [−3, 3] × [−3, 3]]
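For intuition, here is a deliberately simplified direct-search sketch in the spirit of Nelder-Mead (my own illustration: it only reflects the worst vertex through the centroid and shrinks on failure; the real algorithm, and fminsearch, also use expansion and contraction steps):

    function x = simplex_sketch(f, x0, tol, maxit)
    % Simplified simplex search: reflect the worst vertex, shrink on failure.
    n = numel(x0);
    V = repmat(x0(:)', n + 1, 1) + 0.1*[zeros(1, n); eye(n)];  % initial simplex
    fv = zeros(n + 1, 1);
    for i = 1:n+1, fv(i) = f(V(i, :)); end
    for it = 1:maxit
        [fv, idx] = sort(fv);  V = V(idx, :);    % best vertex first, worst last
        if abs(fv(end) - fv(1)) < tol, break; end
        c = mean(V(1:n, :), 1);                  % centroid of the n best vertices
        xr = c + (c - V(end, :));                % reflect worst through centroid
        if f(xr) < fv(end)
            V(end, :) = xr;  fv(end) = f(xr);    % accept the reflected point
        else
            for i = 2:n+1                        % shrink towards the best vertex
                V(i, :) = V(1, :) + 0.5*(V(i, :) - V(1, :));
                fv(i) = f(V(i, :));
            end
        end
    end
    x = V(1, :);
    end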
Steepest descent (gradient descent)

- for f : Rn → R, the negative gradient −∇f(x) is locally the direction of steepest descent towards a (local) minimum
- iteration: x_{k+1} = x_k − α_k ∇f(x_k), where α_k is a line search parameter

[Figure: iterates x0, x1, x2, x3, x4 zig-zagging towards the minimizers (*)]

- α_k = arg min_α f(x_k − α ∇f(x_k))
- the method always progresses towards a minimum, as long as the gradient is non-zero
- the convergence is slow and the search direction may zig-zag
- the method is "myopic" in its choices (see the sketch below)
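To see the zig-zag behaviour concretely, here is a small steepest descent sketch on an ill-conditioned quadratic (my own illustration; the slides only say that α_k comes from a line search, a simple backtracking rule is assumed here):

    % Steepest descent with backtracking line search on f(x) = 5*x1^2 + 0.5*x2^2.
    f  = @(x) 5*x(1)^2 + 0.5*x(2)^2;
    gf = @(x) [10*x(1); x(2)];               % gradient of f
    x = [1; 1];
    for k = 1:500
        g = gf(x);
        if norm(g) < 1e-8, break; end        % stationary point reached
        alpha = 1;                           % backtrack until sufficient decrease
        while f(x - alpha*g) > f(x) - 1e-4*alpha*(g'*g)
            alpha = alpha/2;
        end
        x = x - alpha*g;                     % iterates zig-zag along the valley
    end

The larger the ratio between the largest and the smallest eigenvalue of the Hessian (here 10 vs. 1), the more pronounced the zig-zagging and the slower the convergence.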
Newton's method

- exploits the 1st and 2nd derivatives
- Newton iteration: x_{k+1} = x_k − Hf(x_k)⁻¹ ∇f(x_k)
- no need to invert the Hessian: solve the system Hf(x_k) s_k = −∇f(x_k) and then set x_{k+1} = x_k + s_k
- variation: the damped Newton method uses a line search along the direction of s_k to make the method more robust

Newton's method, cont'd

- close to the minimum, the Hessian is symmetric positive definite, so Cholesky decomposition can be used
- if initialized far from the minimum, the Newton step may fail the descent condition (∇f(x_k))ᵀ s_k < 0; in that case, choose a different direction based on the negative gradient, negative curvature, etc.

Quasi-Newton methods

- improve reliability and reduce overhead
- general form: x_{k+1} = x_k − α_k B_k⁻¹ ∇f(x_k), where α_k is a line search parameter and B_k is an approximation to the Hessian

BFGS (Broyden-Fletcher-Goldfarb-Shanno) method

Algorithm 1: BFGS method
    x_0 = some initial value
    B_0 = initial approximation of the Hessian
    for k = 0, 1, 2, ... do
        solve B_k s_k = −∇f(x_k) for s_k
        x_{k+1} = x_k + s_k
        y_k = ∇f(x_{k+1}) − ∇f(x_k)
        B_{k+1} = B_k + (y_k y_kᵀ)/(y_kᵀ s_k) − (B_k s_k s_kᵀ B_k)/(s_kᵀ B_k s_k)

BFGS, cont'd

- update a factorization of B_k rather than refactorizing it at each iteration
- no 2nd derivatives are needed
- one can start with B_0 = I
- B_k does not necessarily converge to the true Hessian

Conjugate gradient (CG)

- does not need 2nd derivatives and does not construct an approximation of the Hessian
- searches along conjugate directions, implicitly accumulating information about the Hessian
- for quadratic problems, it converges (theoretically) in n steps to the exact solution
- two vectors x, y are conjugate with respect to a matrix A if xᵀAy = 0
- idea: start with an initial guess x_0 (could be 0); go along the negative gradient at the current point; compute each new direction as a combination of the previous direction and the new gradient

Algorithm 2: CG method
    x_0 = some initial value
    g_0 = ∇f(x_0)
    s_0 = −g_0
    for k = 0, 1, 2, ... do
        α_k = arg min_α f(x_k + α s_k)
        x_{k+1} = x_k + α_k s_k
        g_{k+1} = ∇f(x_{k+1})
        β_{k+1} = (g_{k+1}ᵀ g_{k+1})/(g_kᵀ g_k)
        s_{k+1} = −g_{k+1} + β_{k+1} s_k

[Figure: convergence path of CG from x0 to the solution x on a quadratic; source: Wikipedia]
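On a quadratic f(x) = ½ xᵀAx − bᵀx, the line search of Algorithm 2 has the closed form α_k = −(g_kᵀ s_k)/(s_kᵀ A s_k), which makes the n-step convergence easy to verify (my own illustration; A, b and x_0 are arbitrary choices):

    % CG on a 2-D quadratic: should reach the exact minimizer A\b in n = 2 steps.
    A = [4, 1; 1, 3];  b = [1; 2];
    gf = @(x) A*x - b;                       % gradient of 0.5*x'*A*x - b'*x
    x = [2; 1];
    g = gf(x);  s = -g;
    for k = 1:2
        alpha = -(g'*s)/(s'*A*s);            % exact line search on a quadratic
        x = x + alpha*s;
        gnew = gf(x);
        beta = (gnew'*gnew)/(g'*g);          % as in Algorithm 2
        s = -gnew + beta*s;
        g = gnew;
    end
    disp(norm(x - A\b))                      % ~0, up to rounding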
Other methods

We have barely scratched the surface! There are also:
- heuristic methods
- genetic algorithms
- stochastic methods
- hybrid methods
- etc.

Matlab functions

- linear and quadratic optimization: linprog, quadprog
- linear least squares: lsqlin, lsqnonneg
- nonlinear minimization:
    - fminbnd - scalar bounded problem
    - fmincon - multidimensional constrained nonlinear minimization
    - fminsearch - Nelder-Mead unconstrained nonlinear minimization
    - fminunc - multidimensional unconstrained nonlinear minimization
    - fseminf - multidimensional constrained minimization, semi-infinite constraints

Important classes of optimization problems: linear programming

Linear programming (LP)

General form:

    minimize fᵀx
    subject to A_eq x = b_eq
               Ax ≤ b
               lb ≤ x ≤ ub

Matlab: X = linprog(f,A,b,Aeq,beq,LB,UB,X0)

LP - Example

Solve the LP:

    maximize 2x1 + 3x2
    such that x1 + 2x2 ≤ 8
              2x1 + x2 ≤ 10
              x2 ≤ 3

Since linprog minimizes, the objective coefficients are negated:

    c = [-2, -3]';
    A = [1, 2; 2, 1; 0, 1];
    b = [8, 10, 3]';
    x = linprog(c, A, b, [], [], [], [], [])
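A quick sanity check of the result (my own addition): the optimum of this LP is the vertex where the first two constraints are active, x = (4, 2), with objective value 14.

    disp(A*x - b)        % <= 0 everywhere: feasible; first two constraints tight
    disp(-c'*x)          % value of the original (maximized) objective: 14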
Chebyshev data approximation

Let (x_i, y_i) be a set of points. Find the best approximation with a degree-d polynomial p(x) = α_d x^d + α_{d−1} x^{d−1} + ... + α_0:

    minimize max_i |y_i − p(x_i)|

Solution: let f = max_i |y_i − p(x_i)|. The problem can be formulated as an LP problem:

    minimize f with respect to the α_i
    such that −f ≤ y_i − p(x_i) ≤ f

...which is equivalent to

    minimize f
    such that −p(x_i) − f ≤ −y_i
              p(x_i) − f ≤ y_i

Example

Approximate a set of 14 points with a degree-4 polynomial. The vector of unknowns is [α_4, α_3, α_2, α_1, α_0, f]:

    % given data: x, y
    x = [0,3,7,8,9,10,12,14,16,18,19,20,21,23]';
    y = [3,5,5,4,3,6,7,6,6,11,11,10,8,6]';
    % inequality constraints:
    A1 = [-x.^4, -x.^3, -x.^2, -x, -ones(14,1), -ones(14,1)];
    A2 = [ x.^4,  x.^3,  x.^2,  x,  ones(14,1), -ones(14,1)];
    A = [A1; A2];
    b = [-y; y];
    f = zeros(6,1); f(6) = 1;   % objective function: minimize the bound f
    [alpha, fval, exitflag] = linprog(f, A, b);

[Figure: the 14 data points and the fitted degree-4 polynomial]

Quadratic programming (QP)

General form:

    minimize (1/2) xᵀHx + fᵀx
    subject to Ax ≤ b
               A_eq x = b_eq
               lb ≤ x ≤ ub

with H ∈ Rn×n symmetric.

Matlab: X = quadprog(H,f,A,b,Aeq,beq,LB,UB,X0,OPTIONS)

QP - Example

Solve:

    minimize x1² + x1x2 + 2x2² + 2x3² + 2x2x3 + 4x1 + 6x2 + 12x3
    subject to x1 + x2 + x3 ≥ 6
               −x1 − x2 + 2x3 ≥ 2
               x1, x2, x3 ≥ 0

The "≥" constraints are negated to fit the "≤" form expected by quadprog:

    H = [2, 1, 0; 1, 4, 2; 0, 2, 4];
    f = [4, 6, 12];
    A = [-1, -1, -1; 1, 1, -2];
    b = [-6, -2];
    lb = [0; 0; 0];
    ub = [inf; inf; inf];
    opts = optimoptions('quadprog', 'algorithm', 'interior-point-convex');
    [x, fval, exitflag, output] = quadprog(H, f, A, b, [], [], lb, ub, [], opts);

Constrained nonlinear optimization - fmincon

Problem:

    minimize f(x)
    subject to c(x) ≤ 0
               ceq(x) = 0
               Ax ≤ b
               A_eq x = b_eq
               lb ≤ x ≤ ub

Matlab:

    [x, fval, exitflag, output] = fmincon(fun, x0, A, b, ...
        Aeq, beq, lb, ub, nonlcon, options)
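A minimal usage sketch (my own example, not from the slides): minimize a smooth function over the unit disc. The nonlcon argument must return the nonlinear inequalities c(x) ≤ 0 and equalities ceq(x) = 0:

    % Minimize (x1-1)^2 + (x2-2)^2 subject to x1^2 + x2^2 <= 1.
    fun = @(x) (x(1) - 1)^2 + (x(2) - 2)^2;
    nonlcon = @(x) deal(x(1)^2 + x(2)^2 - 1, []);   % c(x) <= 0, no ceq
    x0 = [0; 0];
    [x, fval] = fmincon(fun, x0, [], [], [], [], [], [], nonlcon)

The solution should lie on the boundary of the disc, in the direction of (1, 2): x ≈ (1, 2)/√5.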
Algorithms for fmincon

- trust-region-reflective: requires the gradient and allows only bounds or linear equality constraints, but not both. Works efficiently on large sparse and small dense problems.
- active-set: can take large steps to converge fast. It is effective on some small problems with nonsmooth constraints.
- sqp: satisfies bounds at each iteration. Not for large-scale problems.
- interior-point: for large sparse or small dense problems. Designed for large problems; can recover from NaN or Inf results. Satisfies bounds at each iteration.

Use the documentation of the fmincon and optimoptions functions for details. You can use optimtool for a graphical user interface to the Optimization Toolbox!

Exercise

Design the optimal beer can! It must be:
- cylindrical
- ecological: uses the minimum amount of material (i.e. minimum total surface)
- of exact volume V = 333 cm³
- not higher than twice its diameter

Tasks:
  1 identify the variables
  2 write the mathematical formulation of the problem
  3 write the formula of the gradient of the objective function and the Jacobian of the nonlinear constraint function
  4 implement in Matlab (one possible setup is sketched below)
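One possible setup sketch for task 4 (an assumption of mine, with x = [r; h] as radius and height in cm; the gradient and Jacobian of task 3 are omitted here, so fmincon approximates them by finite differences):

    % Beer can: minimize the total surface 2*pi*r^2 + 2*pi*r*h
    % s.t. the volume pi*r^2*h = 333 and the height h <= 2*(2r) = 4r.
    can_area = @(x) 2*pi*x(1)^2 + 2*pi*x(1)*x(2);
    nonlcon = @(x) deal([], pi*x(1)^2*x(2) - 333);  % equality: exact volume
    A = [-4, 1];  b = 0;                            % linear: h - 4r <= 0
    lb = [0; 0];                                    % radius and height >= 0
    [x, S] = fmincon(can_area, [5; 10], A, b, [], [], lb, [], nonlcon)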