This example shows how to include derivative information in nonlinear problem-based optimization. Including gradients or a Hessian in an optimization can give the following benefits:
- More robust results. Finite differencing steps sometimes reach points where the objective or a nonlinear constraint function is undefined, not finite, or complex.
- Analytic gradients can be more accurate than finite difference estimates.
- Including a Hessian can lead to faster convergence, meaning fewer iterations.
- Analytic gradients can be faster to calculate than finite difference estimates, especially for problems with a sparse structure. For complicated expressions, however, analytic gradients can be slower to calculate.
Despite these advantages, the problem-based approach currently does not use derivative information. To use derivatives in problem-based optimization, convert your problem using prob2struct, and edit the resulting objective and constraint functions.
Create a problem-based nonlinear optimization. With 2-D control variables x and y, use the objective function

$$f(x,y) = 100\left(x_2 - x_1^2\right)^2 + \left(1 - x_1\right)^2 - e^{-\sum_i (x_i - y_i)^2}\, e^{-e^{-y_1}} \operatorname{sech}(y_2).$$

Include the constraint that the sum of squares of x and y is no more than 4.
fun2 is not based on supported functions for optimization expressions; see Supported Operations on Optimization Variables and Expressions. Therefore, to include fun2 in an optimization problem, you must convert it to an optimization expression using fcn2optimexpr.
prob = optimproblem;
x = optimvar('x',2);
y = optimvar('y',2);
fun1 = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
fun2 = @(x,y)-exp(-sum((x - y).^2))*exp(-exp(-y(1)))*sech(y(2));
prob.Objective = fun1 + fcn2optimexpr(fun2,x,y);
prob.Constraints.cons = sum(x.^2 + y.^2) <= 4;
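To spot-check the objective at a point, you can evaluate the optimization expression directly (a quick sketch; pt is a hypothetical test point):

pt.x = [-1;2];  % hypothetical values for x
pt.y = [1;-1];  % hypothetical values for y
evaluate(prob.Objective,pt)  % objective value at pt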
To include derivatives, convert the problem to a structure using prob2struct.
problem = prob2struct(prob);
During the conversion, prob2struct creates function files that represent the objective and nonlinear constraint functions. By default, these functions have the names 'generatedObjective.m' and 'generatedConstraints.m'.
Calculate the derivatives associated with the objective and nonlinear constraint functions. If you have a Symbolic Math Toolbox™ license, you can use the gradient or jacobian function to help compute the derivatives. See Symbolic Math Toolbox™ Calculates Gradients and Hessians.
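For instance, here is a minimal sketch of that symbolic approach, assuming a Symbolic Math Toolbox license; the symbolic variables x1, x2, y1, and y2 stand in for the components of the control variable described below.

syms x1 x2 y1 y2 real
f = 100*(x2 - x1^2)^2 + (1 - x1)^2 ...
    - exp(-((x1 - y1)^2 + (x2 - y2)^2))*exp(-exp(-y1))*sech(y2);
gradf = simplify(gradient(f,[x1 x2 y1 y2]))  % 4-by-1 symbolic gradient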
The solver-based approach has one control variable. Each optimization variable (x or y, in this example) is a portion of the control variable.
You can find the mapping between optimization variables and the single control variable in the generated objective function file, 'generatedObjective.m'. The beginning of the file contains these lines of code:
%% Variable indices.
idx_x = [1 2];
idx_y = [3 4];

%% Map solver-based variables to problem-based.
x = inputVariables(idx_x);
x = x(:);
y = inputVariables(idx_y);
y = y(:);
In this code, the control variable has the name inputVariables.
Alternatively, you can find the mapping by using varindex.
idx = varindex(prob);
disp(idx.x)
1 2
disp(idx.y)
3 4
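For example, you can use these indices to split any solver-based point into its problem-based variables (a small sketch; pt is a hypothetical test point):

pt = [-1;2;1;-1];  % hypothetical solver-based point
xval = pt(idx.x)   % components of pt belonging to x
yval = pt(idx.y)   % components of pt belonging to y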
Once you know the mapping, you can use standard calculus to find the following expression for the gradient grad of the objective function objective = fun1 + fun2 with respect to the control variable [x(:);y(:)]:

$$\nabla f = \begin{bmatrix} 2x_1 - 400x_1\left(x_2 - x_1^2\right) + \sigma_1 - 2 \\ 200x_2 - 200x_1^2 + \sigma_2 \\ -\sigma_1 - d\,e^{-y_1} \\ -\sigma_2 + d\tanh(y_2) \end{bmatrix},$$

where

$$c = e^{-\sum_i (x_i - y_i)^2},\qquad d = c\, e^{-e^{-y_1}} \operatorname{sech}(y_2),\qquad \sigma_1 = 2(x_1 - y_1)\,d,\qquad \sigma_2 = 2(x_2 - y_2)\,d.$$
To include the calculated gradients in the objective function file, edit 'generatedObjective.m' as follows.
%% Insert gradient calculation here.
% If you include a gradient, notify the solver by setting the
% SpecifyObjectiveGradient option to true.
if nargout > 1
    c = exp(-sum((x - y).^2));
    d = c*exp(-exp(-y(1)))*sech(y(2));
    sigma1 = 2*(x(1) - y(1))*d;
    sigma2 = 2*(x(2) - y(2))*d;
    grad = zeros(4,1);
    grad(1) = 2*x(1) - 400*x(1)*(x(2) - x(1)^2) + sigma1 - 2;
    grad(2) = 200*x(2) - 200*x(1)^2 + sigma2;
    grad(3) = -sigma1 - d*exp(-y(1));
    grad(4) = -sigma2 + d*tanh(y(2));
end
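Before running the solver, you can gain confidence in the hand-computed gradient by comparing it to a central finite-difference estimate. This check is a sketch, not part of the generated file; it assumes the edited generatedObjective.m returns the gradient as its second output.

pt = [-1;2;1;-1];                % test point
[~,g] = generatedObjective(pt);  % analytic gradient from the edited file
h = 1e-6;                        % central finite-difference step
gfd = zeros(4,1);
for k = 1:4
    e = zeros(4,1);
    e(k) = h;
    gfd(k) = (generatedObjective(pt + e) - generatedObjective(pt - e))/(2*h);
end
max(abs(g - gfd))                % should be small, on the order of h^2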
Recall that the nonlinear constraint is x(1)^2 + x(2)^2 + y(1)^2 + y(2)^2 <= 4. The gradient of this constraint function is 2*[x;y]. To include the calculated gradients of the nonlinear constraint, edit 'generatedConstraints.m' as follows.
%% Insert gradient calculation here.
% If you include a gradient, notify the solver by setting the
% SpecifyConstraintGradient option to true.
if nargout > 2
    cineqGrad = 2*[x;y];
    ceqGrad = [];
end
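fmincon expects the inequality constraint gradient with one column per constraint; this problem has a single inequality constraint, so cineqGrad is a 4-by-1 vector. As a quick sanity check (again a sketch, assuming the edited generatedConstraints.m returns the gradients as its third and fourth outputs):

pt = [-1;2;1;-1];
[cineq,ceq,cineqGrad] = generatedConstraints(pt);
cineq      % sum(pt.^2) - 4, which is 3 at this point
cineqGrad  % 2*pt, which is [-2;4;2;-2]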
Run the problem using both the problem-based and solver-based methods to see the differences. To run the problem using derivative information, create appropriate options in the problem structure.
options = optimoptions('fmincon','SpecifyObjectiveGradient',true,...
    'SpecifyConstraintGradient',true);
problem.options = options;
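Optionally, you can also have fmincon compare the supplied derivatives against its own finite-difference estimates by setting the CheckGradients option. This step is an extra check, not part of the original example.

problem.options = optimoptions(problem.options,'CheckGradients',true);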
Nonlinear problems require a nonempty x0 field in the problem structure.
x0 = [-1;2;1;-1];
problem.x0 = x0;
Call fmincon on the problem structure.
[xsolver,fvalsolver,exitflagsolver,outputsolver] = fmincon(problem)
Local minimum found that satisfies the constraints.

Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.

<stopping criteria details>

xsolver =

    0.8671
    0.7505
    1.0433
    0.5140

fvalsolver =

   -0.5500

exitflagsolver =

     1

outputsolver = 

  struct with fields:

         iterations: 46
          funcCount: 77
    constrviolation: 0
           stepsize: 7.4091e-06
          algorithm: 'interior-point'
      firstorderopt: 7.5203e-07
       cgiterations: 9
            message: '↵Local minimum found that satisfies the constraints.↵↵Optimization completed because the objective function is non-decreasing in ↵feasible directions, to within the value of the optimality tolerance,↵and constraints are satisfied to within the value of the constraint tolerance.↵↵<stopping criteria details>↵↵Optimization completed: The relative first-order optimality measure, 7.520304e-07,↵is less than options.OptimalityTolerance = 1.000000e-06, and the relative maximum constraint↵violation, 0.000000e+00, is less than options.ConstraintTolerance = 1.000000e-06.↵↵'
Compare this solution with the one obtained from solve, which uses no derivative information.
init.x = x0(1:2);
init.y = x0(3:4);
[xproblem,fvalproblem,exitflagproblem,outputproblem] = solve(prob,init)
Local minimum found that satisfies the constraints.

Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.

<stopping criteria details>

xproblem = 

  struct with fields:

    x: [2×1 double]
    y: [2×1 double]

fvalproblem =

   -0.5500

exitflagproblem = 

    OptimalSolution

outputproblem = 

  struct with fields:

         iterations: 48
          funcCount: 276
    constrviolation: 0
           stepsize: 7.9340e-07
          algorithm: 'interior-point'
      firstorderopt: 6.5496e-07
       cgiterations: 9
            message: '↵Local minimum found that satisfies the constraints.↵↵Optimization completed because the objective function is non-decreasing in ↵feasible directions, to within the value of the optimality tolerance,↵and constraints are satisfied to within the value of the constraint tolerance.↵↵<stopping criteria details>↵↵Optimization completed: The relative first-order optimality measure, 6.549635e-07,↵is less than options.OptimalityTolerance = 1.000000e-06, and the relative maximum constraint↵violation, 0.000000e+00, is less than options.ConstraintTolerance = 1.000000e-06.↵↵'
             solver: 'fmincon'
Both solutions report the same function value to display precision, and both require roughly the same number of iterations (46 using gradient information, 48 without). However, the solution using gradients requires only 77 function evaluations, compared to 276 for the solution without gradients.
fcn2optimexpr | prob2struct | varindex