Bi7740: Scientific computing
Optimization: a brief summary

Vlad Popovici, popovici@iba.muni.cz
Institute of Biostatistics and Analyses, Masaryk University, Brno

Book: Venkataraman P., Applied Optimization Using MATLAB, Wiley & Sons, 2002

Outline
  1 Problem setting
  2 Optimization in R
  3 Optimization in Rn
      Unconstrained optimization in Rn
  4 Important classes of optimization problems
      Linear programming
      Quadratic programming
      Constrained nonlinear optimization

Problem setting

- minimization problem: given f : Rn → R and S ⊆ Rn, find x* ∈ S such that f(x*) ≤ f(y), ∀y ∈ S \ {x*}
- x* is called a minimizer (minimum, extremum) of f
- maximization is equivalent to minimizing −f
- f is called the objective function and is assumed, here, to be differentiable with continuous second derivative
- the constraint set S (or feasible region) is defined by a system of equations and/or inequalities
- y ∈ S is called a feasible point
- if S = Rn, the optimization is unconstrained

Optimization problem

    min_x f(x)
    subject to g(x) = 0
               h_k(x) ≤ 0

where f : Rn → R, g : Rn → Rm, h_k : Rn → R. If f, g and the h_k are all linear functions: linear programming.

Some theory

- Rolle's thm: f continuous on [a, b] and differentiable on (a, b) with f(a) = f(b), then ∃c ∈ (a, b) : f′(c) = 0
- Weierstrass' thm: f continuous on a compact set, with values in a subset of R, attains its extrema
- Fermat's thm: for f : (a, b) → R, at a local extremum x0 ∈ (a, b) we have f′(x0) = 0 (x0 is a stationary point). Generalization: ∇f(x0) = 0
- convex function: f″(x) > 0; concave function: f″(x) < 0
- if f′(x0) = 0 and f″(x0) > 0, then x0 is a minimizer
- if f′(x0) = 0 and f″(x0) < 0, then x0 is a maximizer
- if f′(x0) = f″(x0) = 0, the test is inconclusive: x0 may be an inflection point (f(x) = x³ at 0), but it may also be an extremum (f(x) = x⁴ has a minimum at 0)

Set convexity

Formally: a set S is convex if αx1 + (1 − α)x2 ∈ S for all x1, x2 ∈ S and α ∈ [0, 1].

Function convexity

Formally: f is said to be convex on a convex set S if f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2) for all x1, x2 ∈ S and α ∈ [0, 1].
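A quick numerical illustration of this definition (my own addition, not from the slides; the test function and tolerance are arbitrary choices): for a convex f, the chord always lies above the graph.

    % Check the convexity inequality at random points for f(x) = exp(x),
    % which is convex on R since f''(x) = exp(x) > 0.
    f = @(x) exp(x);
    x1 = randn;  x2 = randn;  a = rand;      % a plays the role of alpha
    lhs = f(a*x1 + (1 - a)*x2);              % function value at the mixture
    rhs = a*f(x1) + (1 - a)*f(x2);           % chord value at the mixture
    assert(lhs <= rhs + 1e-12)               % must hold for any x1, x2, a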
Uniqueness of the solution

- any local minimum of a convex function f on a convex set S ⊆ Rn is a global minimum of f on S
- any local minimum of a strictly convex function f on a convex set S ⊆ Rn is the unique global minimum of f on S

Optimality criteria

For x* ∈ S to be an extremum of f : S ⊆ Rn → R:
- first order condition: x* must be a critical point: ∇f(x*) = 0
- second order condition: the Hessian matrix Hf(x*) must be positive or negative definite, where [Hf(x)]_ij = ∂²f(x)/∂x_i∂x_j

If the Hessian is
- positive definite, then x* is a minimum of f
- negative definite, then x* is a maximum of f
- indefinite, then x* is a saddle point of f
- singular, then different degenerate cases are possible...

[Figure: a saddle point; source: Wikipedia]

Optimization in R

Unimodality

Unimodality allows safely discarding parts of the interval without losing the solution (as in the case of interval bisection).

Golden section search

- evaluate the function at 3 points and decide which part of the interval to discard
- ensure that the subdivision proportions stay the same across iterations: c/a = a/b ⇒ b/a = (1 + √5)/2 = 1.618... (the golden ratio)
- convergence is linear, with C ≈ 0.618 (a minimal implementation sketch is given at the end of this section)

Successive parabolic interpolations

- fit a parabola through three points and move to its minimum
- convergence is superlinear, with r ≈ 1.32

Newton's method

From the Taylor series:

    f(x + h) ≈ f(x) + f′(x)h + (f″(x)/2)h²

whose minimum over h is at h = −f′(x)/f″(x). HOMEWORK: prove it!

Iteration scheme:

    x_{k+1} = x_k − f′(x_k)/f″(x_k)

(That is Newton's method for finding a zero of f′(x) = 0.) Quadratic convergence, but it needs to start close to the solution.

Hybrid methods

- idea: combine "slow-but-sure" methods with "fast-but-risky" ones
- most library routines use such an approach
- popular combination: golden section search with successive parabolic interpolation (e.g. Brent's method)

Matlab functions for optimization in R

- first, check optimset for managing the optimization options
- fminbnd: bounded scalar function minimization
- the same options mechanism is used by the multivariate functions as well

Try in Matlab:

    >> opts = optimset('display','iter');   % what's this for?
    >> f = @(x)(1./((x-0.3).^2+0.01) + 1./((x-0.9).^2+0.04) - 6);
    >> [x, fx] = fminbnd(f, 0.2, 1, opts)
    >> g = @(x) (cos(x) - 2*log(x));
    >> [x, gx] = fminbnd(g, 2, 4, opts);    % explain
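As promised above, here is a minimal golden section search sketch (my own illustration; the function name and signature are not from the slides). It repeatedly discards the part of [a, b] that cannot contain the minimum of a unimodal f:

    function xmin = golden_sketch(f, a, b, tol)
    % Minimal golden section search for a unimodal f on [a, b].
    tau = (sqrt(5) - 1)/2;                   % 0.618..., so that tau^2 = 1 - tau
    x1 = b - tau*(b - a);  f1 = f(x1);
    x2 = a + tau*(b - a);  f2 = f(x2);
    while (b - a) > tol
        if f1 < f2                           % minimum lies in [a, x2]
            b = x2;  x2 = x1;  f2 = f1;      % old x1 is reused as new x2
            x1 = b - tau*(b - a);  f1 = f(x1);
        else                                 % minimum lies in [x1, b]
            a = x1;  x1 = x2;  f1 = f2;      % old x2 is reused as new x1
            x2 = a + tau*(b - a);  f2 = f(x2);
        end
    end
    xmin = (a + b)/2;
    end

Only one new function evaluation is needed per iteration, precisely because the proportions of the subdivision are kept constant. For example, golden_sketch(@(x) (x - 2).^2, 0, 5, 1e-8) returns approximately 2.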
Optimization in Rn: unconstrained optimization

Nelder-Mead (simplex) method

- direct search methods simply compare the function values at different points of S
- Nelder-Mead keeps n + 1 points (in Rn) forming a simplex (i.e. a segment in R, a triangle in R², a tetrahedron in R³, etc.)
- along the line from the point with the highest function value through the centroid of the remaining points, a new vertex is selected
- the new vertex replaces the worst previous point
- repeat until convergence
- a useful procedure for non-smooth functions, but expensive for large n (a bare-bones sketch follows the Matlab example below)

Nelder-Mead in Matlab

Use the function fminsearch. Example:

    >> f = @(x) (sin(norm(x,2)^2));
    >> x = fminsearch(f, [.5, .5], opts)
    >> x = fminsearch(f, [.25, .25], opts)
    >> x = fminsearch(f, [1, 1], opts)

[Figure: surface of f(x) = sin(‖x‖²) over [−3, 3] × [−3, 3]]
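For intuition, here is a deliberately simplified direct-search sketch in the spirit of Nelder-Mead (my own illustration: it only reflects the worst vertex through the centroid and shrinks on failure; the real algorithm, and fminsearch, also use expansion and contraction steps):

    function x = simplex_sketch(f, x0, tol, maxit)
    % Simplified simplex search: reflect the worst vertex, shrink on failure.
    n = numel(x0);
    V = repmat(x0(:)', n + 1, 1) + 0.1*[zeros(1, n); eye(n)];  % initial simplex
    fv = zeros(n + 1, 1);
    for i = 1:n+1, fv(i) = f(V(i, :)); end
    for it = 1:maxit
        [fv, idx] = sort(fv);  V = V(idx, :);    % best vertex first, worst last
        if abs(fv(end) - fv(1)) < tol, break; end
        c = mean(V(1:n, :), 1);                  % centroid of the n best vertices
        xr = c + (c - V(end, :));                % reflect worst through centroid
        if f(xr) < fv(end)
            V(end, :) = xr;  fv(end) = f(xr);    % accept the reflected point
        else
            for i = 2:n+1                        % shrink towards the best vertex
                V(i, :) = V(1, :) + 0.5*(V(i, :) - V(1, :));
                fv(i) = f(V(i, :));
            end
        end
    end
    x = V(1, :);
    end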
Steepest descent (gradient descent)

- for f : Rn → R, the negative gradient −∇f(x) is locally the direction of steepest descent towards a (local) minimum
- iteration: x_{k+1} = x_k − α_k ∇f(x_k), where α_k is a line search parameter

[Figure: iterates x0, x1, x2, x3, x4 zig-zagging towards the minimizers (*)]

- α_k = arg min_α f(x_k − α ∇f(x_k))
- the method always progresses towards a minimum, as long as the gradient is non-zero
- the convergence is slow and the search direction may zig-zag
- the method is "myopic" in its choices (see the sketch below)
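To see the zig-zag behaviour concretely, here is a small steepest descent sketch on an ill-conditioned quadratic (my own illustration; the slides only say that α_k comes from a line search, a simple backtracking rule is assumed here):

    % Steepest descent with backtracking line search on f(x) = 5*x1^2 + 0.5*x2^2.
    f  = @(x) 5*x(1)^2 + 0.5*x(2)^2;
    gf = @(x) [10*x(1); x(2)];               % gradient of f
    x = [1; 1];
    for k = 1:500
        g = gf(x);
        if norm(g) < 1e-8, break; end        % stationary point reached
        alpha = 1;                           % backtrack until sufficient decrease
        while f(x - alpha*g) > f(x) - 1e-4*alpha*(g'*g)
            alpha = alpha/2;
        end
        x = x - alpha*g;                     % iterates zig-zag along the valley
    end

The larger the ratio between the largest and the smallest eigenvalue of the Hessian (here 10 vs. 1), the more pronounced the zig-zagging and the slower the convergence.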
Newton's method

- exploits the 1st and 2nd derivatives
- Newton iteration: x_{k+1} = x_k − Hf(x_k)⁻¹ ∇f(x_k)
- no need to invert the Hessian: solve the system Hf(x_k) s_k = −∇f(x_k) and then set x_{k+1} = x_k + s_k
- variation: the damped Newton method uses a line search along the direction of s_k to make the method more robust

Newton's method, cont'd

- close to the minimum, the Hessian is symmetric positive definite, so Cholesky decomposition can be used
- if initialized far from the minimum, the Newton step may fail the descent condition (∇f(x_k))ᵀ s_k < 0; in that case, choose a different direction based on the negative gradient, negative curvature, etc.

Quasi-Newton methods

- improve reliability and reduce overhead
- general form: x_{k+1} = x_k − α_k B_k⁻¹ ∇f(x_k), where α_k is a line search parameter and B_k is an approximation to the Hessian

BFGS (Broyden-Fletcher-Goldfarb-Shanno) method

Algorithm 1: BFGS method
    x_0 = some initial value
    B_0 = initial approximation of the Hessian
    for k = 0, 1, 2, ... do
        solve B_k s_k = −∇f(x_k) for s_k
        x_{k+1} = x_k + s_k
        y_k = ∇f(x_{k+1}) − ∇f(x_k)
        B_{k+1} = B_k + (y_k y_kᵀ)/(y_kᵀ s_k) − (B_k s_k s_kᵀ B_k)/(s_kᵀ B_k s_k)

BFGS, cont'd

- update a factorization of B_k rather than refactorizing it at each iteration
- no 2nd derivatives are needed
- one can start with B_0 = I
- B_k does not necessarily converge to the true Hessian

Conjugate gradient (CG)

- does not need 2nd derivatives and does not construct an approximation of the Hessian
- searches along conjugate directions, implicitly accumulating information about the Hessian
- for quadratic problems, it converges (theoretically) in n steps to the exact solution
- two vectors x, y are conjugate with respect to a matrix A if xᵀAy = 0
- idea: start with an initial guess x_0 (could be 0); go along the negative gradient at the current point; compute each new direction as a combination of the previous direction and the new gradient

Algorithm 2: CG method
    x_0 = some initial value
    g_0 = ∇f(x_0)
    s_0 = −g_0
    for k = 0, 1, 2, ... do
        α_k = arg min_α f(x_k + α s_k)
        x_{k+1} = x_k + α_k s_k
        g_{k+1} = ∇f(x_{k+1})
        β_{k+1} = (g_{k+1}ᵀ g_{k+1})/(g_kᵀ g_k)
        s_{k+1} = −g_{k+1} + β_{k+1} s_k

[Figure: convergence path of CG from x0 to the solution x on a quadratic; source: Wikipedia]
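On a quadratic f(x) = ½ xᵀAx − bᵀx, the line search of Algorithm 2 has the closed form α_k = −(g_kᵀ s_k)/(s_kᵀ A s_k), which makes the n-step convergence easy to verify (my own illustration; A, b and x_0 are arbitrary choices):

    % CG on a 2-D quadratic: should reach the exact minimizer A\b in n = 2 steps.
    A = [4, 1; 1, 3];  b = [1; 2];
    gf = @(x) A*x - b;                       % gradient of 0.5*x'*A*x - b'*x
    x = [2; 1];
    g = gf(x);  s = -g;
    for k = 1:2
        alpha = -(g'*s)/(s'*A*s);            % exact line search on a quadratic
        x = x + alpha*s;
        gnew = gf(x);
        beta = (gnew'*gnew)/(g'*g);          % as in Algorithm 2
        s = -gnew + beta*s;
        g = gnew;
    end
    disp(norm(x - A\b))                      % ~0, up to rounding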
Other methods

We have barely scratched the surface! There are also:
- heuristic methods
- genetic algorithms
- stochastic methods
- hybrid methods
- etc.

Matlab functions

- linear and quadratic optimization: linprog, quadprog
- linear least squares: lsqlin, lsqnonneg
- nonlinear minimization:
    - fminbnd - scalar bounded problem
    - fmincon - multidimensional constrained nonlinear minimization
    - fminsearch - Nelder-Mead unconstrained nonlinear minimization
    - fminunc - multidimensional unconstrained nonlinear minimization
    - fseminf - multidimensional constrained minimization, semi-infinite constraints

Important classes of optimization problems: linear programming

Linear programming (LP)

General form:

    minimize fᵀx
    subject to A_eq x = b_eq
               Ax ≤ b
               lb ≤ x ≤ ub

Matlab: X = linprog(f,A,b,Aeq,beq,LB,UB,X0)

LP - Example

Solve the LP:

    maximize 2x1 + 3x2
    such that x1 + 2x2 ≤ 8
              2x1 + x2 ≤ 10
              x2 ≤ 3

Since linprog minimizes, the objective coefficients are negated:

    c = [-2, -3]';
    A = [1, 2; 2, 1; 0, 1];
    b = [8, 10, 3]';
    x = linprog(c, A, b, [], [], [], [], [])
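A quick sanity check of the result (my own addition): the optimum of this LP is the vertex where the first two constraints are active, x = (4, 2), with objective value 14.

    disp(A*x - b)        % <= 0 everywhere: feasible; first two constraints tight
    disp(-c'*x)          % value of the original (maximized) objective: 14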
Chebyshev data approximation

Let (x_i, y_i) be a set of points. Find the best approximation with a degree-d polynomial p(x) = α_d x^d + α_{d−1} x^{d−1} + ... + α_0:

    minimize max_i |y_i − p(x_i)|

Solution: let f = max_i |y_i − p(x_i)|. The problem can be formulated as an LP problem:

    minimize f with respect to the α_i
    such that −f ≤ y_i − p(x_i) ≤ f

...which is equivalent to

    minimize f
    such that −p(x_i) − f ≤ −y_i
              p(x_i) − f ≤ y_i

Example

Approximate a set of 14 points with a degree-4 polynomial. The vector of unknowns is [α_4, α_3, α_2, α_1, α_0, f]:

    % given data: x, y
    x = [0,3,7,8,9,10,12,14,16,18,19,20,21,23]';
    y = [3,5,5,4,3,6,7,6,6,11,11,10,8,6]';
    % inequality constraints:
    A1 = [-x.^4, -x.^3, -x.^2, -x, -ones(14,1), -ones(14,1)];
    A2 = [ x.^4,  x.^3,  x.^2,  x,  ones(14,1), -ones(14,1)];
    A = [A1; A2];
    b = [-y; y];
    f = zeros(6,1); f(6) = 1;   % objective function: minimize the bound f
    [alpha, fval, exitflag] = linprog(f, A, b);

[Figure: the 14 data points and the fitted degree-4 polynomial]

Quadratic programming (QP)

General form:

    minimize (1/2) xᵀHx + fᵀx
    subject to Ax ≤ b
               A_eq x = b_eq
               lb ≤ x ≤ ub

with H ∈ Rn×n symmetric.

Matlab: X = quadprog(H,f,A,b,Aeq,beq,LB,UB,X0,OPTIONS)

QP - Example

Solve:

    minimize x1² + x1x2 + 2x2² + 2x3² + 2x2x3 + 4x1 + 6x2 + 12x3
    subject to x1 + x2 + x3 ≥ 6
               −x1 − x2 + 2x3 ≥ 2
               x1, x2, x3 ≥ 0

The "≥" constraints are negated to fit the "≤" form expected by quadprog:

    H = [2, 1, 0; 1, 4, 2; 0, 2, 4];
    f = [4, 6, 12];
    A = [-1, -1, -1; 1, 1, -2];
    b = [-6, -2];
    lb = [0; 0; 0];
    ub = [inf; inf; inf];
    opts = optimoptions('quadprog', 'algorithm', 'interior-point-convex');
    [x, fval, exitflag, output] = quadprog(H, f, A, b, [], [], lb, ub, [], opts);

Constrained nonlinear optimization - fmincon

Problem:

    minimize f(x)
    subject to c(x) ≤ 0
               ceq(x) = 0
               Ax ≤ b
               A_eq x = b_eq
               lb ≤ x ≤ ub

Matlab:

    [x, fval, exitflag, output] = fmincon(fun, x0, A, b, ...
        Aeq, beq, lb, ub, nonlcon, options)
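A minimal usage sketch (my own example, not from the slides): minimize a smooth function over the unit disc. The nonlcon argument must return the nonlinear inequalities c(x) ≤ 0 and equalities ceq(x) = 0:

    % Minimize (x1-1)^2 + (x2-2)^2 subject to x1^2 + x2^2 <= 1.
    fun = @(x) (x(1) - 1)^2 + (x(2) - 2)^2;
    nonlcon = @(x) deal(x(1)^2 + x(2)^2 - 1, []);   % c(x) <= 0, no ceq
    x0 = [0; 0];
    [x, fval] = fmincon(fun, x0, [], [], [], [], [], [], nonlcon)

The solution should lie on the boundary of the disc, in the direction of (1, 2): x ≈ (1, 2)/√5.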
Algorithms for fmincon

- trust-region-reflective: requires the gradient and allows only bounds or linear equality constraints, but not both. Works efficiently on large sparse and small dense problems.
- active-set: can take large steps to converge fast. It is effective on some small problems with nonsmooth constraints.
- sqp: satisfies bounds at each iteration. Not for large-scale problems.
- interior-point: for large sparse or small dense problems. Designed for large problems; can recover from NaN or Inf results. Satisfies bounds at each iteration.

Use the documentation of the fmincon and optimoptions functions for details. You can use optimtool for a graphical user interface to the Optimization Toolbox!

Exercise

Design the optimal beer can! It must be:
- cylindrical
- ecological: uses the minimum amount of material (i.e. minimum total surface)
- of exact volume V = 333 cm³
- not higher than twice its diameter

Tasks:
  1 identify the variables
  2 write the mathematical formulation of the problem
  3 write the formula of the gradient of the objective function and the Jacobian of the nonlinear constraint function
  4 implement in Matlab (one possible setup is sketched below)
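One possible setup sketch for task 4 (an assumption of mine, with x = [r; h] as radius and height in cm; the gradient and Jacobian of task 3 are omitted here, so fmincon approximates them by finite differences):

    % Beer can: minimize the total surface 2*pi*r^2 + 2*pi*r*h
    % s.t. the volume pi*r^2*h = 333 and the height h <= 2*(2r) = 4r.
    can_area = @(x) 2*pi*x(1)^2 + 2*pi*x(1)*x(2);
    nonlcon = @(x) deal([], pi*x(1)^2*x(2) - 333);  % equality: exact volume
    A = [-4, 1];  b = 0;                            % linear: h - 4r <= 0
    lb = [0; 0];                                    % radius and height >= 0
    [x, S] = fmincon(can_area, [5; 10], A, b, [], [], lb, [], nonlcon)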