Parallel processing is an attractive way to speed up optimization algorithms. To use parallel processing, you must have a Parallel Computing Toolbox™ license and an open parallel worker pool (parpool). For more information, see How to Use Parallel Processing in Global Optimization Toolbox.
Global Optimization Toolbox solvers use parallel computing in various ways.
Solver | Parallel? | Parallel Characteristics |
---|---|---|
GlobalSearch | × | No parallel functionality. However, fmincon can use parallel gradient estimation when run in GlobalSearch. See Using Parallel Computing in Optimization Toolbox. |
MultiStart | ✓ | Start points distributed to multiple processors. From these points, local solvers run to completion. For more details, see MultiStart and How to Use Parallel Processing in Global Optimization Toolbox. For fmincon, no parallel gradient estimation with parallel MultiStart. |
ga, gamultiobj | ✓ | Population evaluated in parallel, which occurs once per iteration. For more details, see Genetic Algorithm and How to Use Parallel Processing in Global Optimization Toolbox. No vectorization of fitness or constraint functions. |
particleswarm | ✓ | Population evaluated in parallel, which occurs once per iteration. For more details, see Particle Swarm and How to Use Parallel Processing in Global Optimization Toolbox. No vectorization of objective or constraint functions. |
patternsearch | ✓ | Poll points evaluated in parallel, which occurs once per iteration. For more details, see Pattern Search and How to Use Parallel Processing in Global Optimization Toolbox. No vectorization of objective or constraint functions. |
simulannealbnd | × | No parallel functionality. However, simulannealbnd can use a hybrid function that runs in parallel. See Simulated Annealing. |
| ✓ | Search points evaluated in parallel. No vectorization of objective or constraint functions. |
In addition, several solvers have hybrid functions that run after they finish. Some hybrid functions can run in parallel. Also, most patternsearch search methods can run in parallel. For more information, see Parallel Search Functions or Hybrid Functions.
No Nested parfor Loops. Most solvers employ the Parallel Computing Toolbox parfor function to perform parallel computations. Two solvers, surrogateopt and paretosearch, use parfeval instead.
Note
parfor does not work in parallel when called from within another parfor loop.
Note
The documentation recommends against using parfor or parfeval when calling Simulink®; see Using sim function within parfor (Simulink). Therefore, you might encounter issues when optimizing a Simulink simulation in parallel using a solver's built-in parallel functionality.
Suppose, for example, your objective function userfcn calls parfor, and you want to call fmincon using MultiStart and parallel processing. Suppose also that the conditions for parallel gradient evaluation of fmincon are satisfied, as given in Parallel Optimization Functionality. The figure When parfor Runs In Parallel shows three cases:
The outermost loop is parallel MultiStart. Only that loop runs in parallel.
The outermost parfor loop is in fmincon. Only fmincon runs in parallel.
The outermost parfor loop is in userfcn. In this case, userfcn can use parfor in parallel.
When parfor Runs In Parallel
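The three cases can be sketched in code as follows. This is only an illustration under stated assumptions: the body of userfcn, the bounds, and the number of start points are invented for the example, and a parallel pool is assumed to be available. The comments state which loop is expected to run in parallel according to the nesting rule.

```matlab
% Hypothetical problem: userfcn (defined at the end of this script) calls parfor.
problem = createOptimProblem('fmincon','objective',@userfcn, ...
    'x0',[0 0],'lb',[-10 -10],'ub',[10 10], ...
    'options',optimoptions(@fmincon,'Display','off'));
ms = MultiStart('Display','off');

% Case 1: parallel MultiStart is the outermost loop, so fmincon and
% userfcn run serially on each worker.
ms.UseParallel = true;
problem.options.UseParallel = false;
[x1,f1] = run(ms,problem,8);

% Case 2: serial MultiStart, so the finite-difference loop in fmincon
% is the outermost parfor and the parfor in userfcn runs serially.
ms.UseParallel = false;
problem.options.UseParallel = true;
[x2,f2] = run(ms,problem,8);

% Case 3: serial MultiStart and serial fmincon, so the parfor inside
% userfcn is the outermost loop and can run in parallel.
problem.options.UseParallel = false;
[x3,f3] = run(ms,problem,8);

function f = userfcn(x)
    % Invented objective: a sum of four quadratic terms computed in parfor.
    terms = zeros(1,4);
    parfor k = 1:4
        terms(k) = (x(1) - k)^2 + (x(2) + k)^2;
    end
    f = sum(terms);
end
```

All three runs solve the same problem; only the level at which the work is distributed changes.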
Parallel Random Numbers Are Not Reproducible. Random number sequences in MATLAB® are pseudorandom, determined from a seed, or an initial setting. Parallel computations use seeds that are not necessarily controllable or reproducible. For example, each instance of MATLAB has a default global setting that determines the current seed for random sequences.
For patternsearch, if you select MADS as a poll or search method, parallel pattern search does not have reproducible runs. If you select the genetic algorithm or Latin hypercube as search methods, parallel pattern search does not have reproducible runs.
For ga and gamultiobj, parallel population generation gives nonreproducible results.
MultiStart is different. You can have reproducible runs from parallel MultiStart. Runs are reproducible because MultiStart generates pseudorandom start points locally, and then distributes the start points to parallel processors. Therefore, the parallel processors do not use random numbers.
For more details, see Parallel Processing and Random Number Streams.
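A minimal sketch of how you might check the reproducibility of parallel MultiStart: seed the client random number generator identically before each run and compare the results. The objective function, bounds, and seed value here are assumptions chosen only for illustration.

```matlab
% Hypothetical multimodal objective, chosen only for illustration.
fun = @(x) x(1)^2 + 4*sin(5*x(1)) + x(2)^2;
problem = createOptimProblem('fmincon','objective',fun, ...
    'x0',[3 3],'lb',[-5 -5],'ub',[5 5], ...
    'options',optimoptions(@fmincon,'Display','off'));
ms = MultiStart('UseParallel',true,'Display','off');

rng(1)                        % seed the generator on the client
[xa,fa] = run(ms,problem,20);
rng(1)                        % same seed, so the same start points
[xb,fb] = run(ms,problem,20);

% Compare the two runs within a small tolerance.
sameResult = norm(xa - xb) <= 1e-8 && abs(fa - fb) <= 1e-8;
disp(sameResult)
```

The start points are generated on the client before distribution, so the seed set by rng on the client is the only random state that matters in this sketch.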
Limitations and Performance Considerations. More caveats related to parfor appear in Parallel for-Loops (parfor) (Parallel Computing Toolbox).
For information on factors that affect the speed of parallel computations, and factors that affect the results of parallel computations, see Improving Performance with Parallel Computing. The same considerations apply to parallel computing with Global Optimization Toolbox functions.
MultiStart can automatically distribute a problem and start points to multiple processes or processors. The problems run independently, and MultiStart combines the distinct local minima into a vector of GlobalOptimSolution objects. MultiStart uses parallel computing when you:
Have a license for Parallel Computing Toolbox software.
Enable parallel computing with parpool, a Parallel Computing Toolbox function.
Set the UseParallel property to true in the MultiStart object:
ms = MultiStart('UseParallel',true);
When these conditions hold, MultiStart distributes a problem and start points to processes or processors one at a time. The algorithm halts when it reaches a stopping condition or runs out of start points to distribute. If the MultiStart Display property is 'iter', then MultiStart displays:
Running the local solvers in parallel.
For an example of parallel MultiStart, see Parallel MultiStart.
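As a rough sketch of the setup, the following script opens a pool if none exists, creates a parallel MultiStart object, and runs it. The objective function, bounds, and number of start points are assumptions for illustration, not part of the referenced example.

```matlab
% Open a parallel pool only if one is not already running.
if isempty(gcp('nocreate'))
    parpool;
end

% Hypothetical objective with two global minima, at (-1,0.5) and (1,0.5).
fun = @(x) (x(1)^2 - 1)^2 + (x(2) - 0.5)^2;
problem = createOptimProblem('fmincon','objective',fun, ...
    'x0',[2 2],'lb',[-3 -3],'ub',[3 3], ...
    'options',optimoptions(@fmincon,'Display','off'));

% With Display set to 'iter', MultiStart reports that it runs the
% local solvers in parallel.
ms = MultiStart('UseParallel',true,'Display','iter');
[x,fval,exitflag,output,solutions] = run(ms,problem,50);

% solutions is the vector of GlobalOptimSolution objects.
fprintf('Distinct local minima found: %d\n',numel(solutions));
```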
Implementation Issues in Parallel MultiStart. fmincon cannot estimate gradients in parallel when used with parallel MultiStart. This lack of parallel gradient estimation is due to the limitation of parfor described in No Nested parfor Loops.
fmincon can take longer to estimate gradients in parallel than in serial. In this case, using MultiStart with parallel gradient estimation in fmincon amplifies the slowdown. For example, suppose the ms MultiStart object has UseParallel set to false. Suppose fmincon takes 1 s longer to solve problem with problem.options.UseParallel set to true. Then run(ms,problem,200) takes 200 s longer than the same run with problem.options.UseParallel set to false.
Note
When executing serially, parfor loops run slower than for loops. Therefore, for best performance, set your local solver UseParallel option to false when the MultiStart UseParallel property is true.
Note
Even when running in parallel, a solver occasionally calls the objective and nonlinear constraint functions serially on the host machine. Therefore, ensure that your functions make no assumptions about whether they are evaluated in serial or parallel.
GlobalSearch does not distribute a problem and start points to multiple processes or processors. However, when GlobalSearch runs the fmincon local solver, fmincon can estimate gradients by parallel finite differences. fmincon uses parallel computing when you:
Have a license for Parallel Computing Toolbox software.
Enable parallel computing with parpool, a Parallel Computing Toolbox function.
Set the UseParallel option to true with optimoptions. Set this option in the problem structure:
opts = optimoptions(@fmincon,'UseParallel',true,'Algorithm','sqp');
problem = createOptimProblem('fmincon','objective',@myobj,...
    'x0',startpt,'options',opts);
For more details, see Using Parallel Computing in Optimization Toolbox.
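To show the problem structure in use, here is a self-contained sketch that passes it to GlobalSearch. The objective myobj, the start point startpt, and the bounds are placeholders invented for this example.

```matlab
% Placeholder objective and start point (assumptions for illustration).
myobj = @(x) x(1)^2 + 4*sin(5*x(1)) + x(2)^2;
startpt = [3 3];

% fmincon estimates gradients by parallel finite differences.
opts = optimoptions(@fmincon,'UseParallel',true,'Algorithm','sqp');
problem = createOptimProblem('fmincon','objective',myobj, ...
    'x0',startpt,'lb',[-5 -5],'ub',[5 5],'options',opts);

% GlobalSearch itself runs serially; only the fmincon gradient
% estimation is distributed.
gs = GlobalSearch('Display','off');
rng default
[x,fval] = run(gs,problem);
```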
patternsearch can automatically distribute the evaluation of objective and constraint functions associated with the points in a pattern to multiple processes or processors. patternsearch uses parallel computing when you:
Have a license for Parallel Computing Toolbox software.
Enable parallel computing with parpool, a Parallel Computing Toolbox function.
Set the following options using optimoptions: UseCompletePoll is true, UseVectorized is false (default), and UseParallel is true.
When these conditions hold, the solver computes the objective function and constraint values of the pattern search in parallel during a poll. Furthermore, patternsearch overrides the setting of the Cache option, and uses the default 'off' setting.
Beginning in R2019a, when you set the UseParallel option to true, patternsearch internally overrides the UseCompletePoll setting to true so it polls in parallel.
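A minimal sketch of these option settings, assuming a simple quadratic objective and bounds invented for illustration:

```matlab
% Hypothetical objective with its minimum at (1,-2).
fun = @(x) (x(1) - 1)^2 + (x(2) + 2)^2;

% Set UseCompletePoll explicitly so the sketch does not depend on the
% automatic override introduced in R2019a.
options = optimoptions('patternsearch', ...
    'UseParallel',true, ...
    'UseCompletePoll',true, ...
    'UseVectorized',false, ...
    'Display','off');

[x,fval] = patternsearch(fun,[0 0],[],[],[],[],[-5 -5],[5 5],[],options);
```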
Note
Even when running in parallel, patternsearch occasionally calls the objective and nonlinear constraint functions serially on the host machine. Therefore, ensure that your functions make no assumptions about whether they are evaluated in serial or parallel.
Parallel Search Function. patternsearch can optionally call a search function at each iteration. The search is parallel when you:
Set UseCompleteSearch to true.
Do not set the search method to @searchneldermead or custom.
Set the search method to a patternsearch poll method or Latin hypercube search, and set UseParallel to true.
Or, if you set the search method to ga, create a search method option with UseParallel set to true.
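For instance, a parallel Latin hypercube search might be configured as in the following sketch. The objective and bounds are assumptions for illustration; the ga search case needs its own ga options and is not shown here.

```matlab
% Hypothetical objective with its minimum at (1,-2).
fun = @(x) (x(1) - 1)^2 + (x(2) + 2)^2;

% Latin hypercube search, evaluated completely and in parallel.
options = optimoptions('patternsearch', ...
    'SearchFcn',@searchlhs, ...
    'UseCompleteSearch',true, ...
    'UseParallel',true, ...
    'Display','off');

[x,fval] = patternsearch(fun,[0 0],[],[],[],[],[-5 -5],[5 5],[],options);
```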
Implementation Issues in Parallel Pattern Search. The limitations on patternsearch options, listed in Pattern Search, arise partly from the limitations of parfor, and partly from the nature of parallel processing:
Cache is overridden to be 'off': patternsearch implements Cache as a persistent variable. parfor does not handle persistent variables, because the variable could have different settings at different processors.
UseCompletePoll is true: UseCompletePoll determines whether a poll stops as soon as patternsearch finds a better point. When searching in parallel, parfor schedules all evaluations simultaneously, and patternsearch continues after all evaluations complete. patternsearch cannot halt evaluations after they start. Beginning in R2019a, when you set the UseParallel option to true, patternsearch internally overrides the UseCompletePoll setting to true so it polls in parallel.
UseVectorized is false: UseVectorized determines whether patternsearch evaluates all points in a pattern with one function call in a vectorized fashion. If UseVectorized is true, patternsearch does not distribute the evaluation of the function, so does not use parfor.
ga and gamultiobj can automatically distribute the evaluation of objective and nonlinear constraint functions associated with a population to multiple processors. ga uses parallel computing when you:
Have a license for Parallel Computing Toolbox software.
Enable parallel computing with parpool, a Parallel Computing Toolbox function.
Set the following options using optimoptions: UseVectorized is false (default), and UseParallel is true.
When these conditions hold, ga computes the objective function and nonlinear constraint values of the individuals in a population in parallel.
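A short sketch of a parallel ga call with these options, assuming a hypothetical fitness function and bounds:

```matlab
% Hypothetical fitness function with its minimum at (1,-2).
fitness = @(x) (x(1) - 1)^2 + (x(2) + 2)^2;

options = optimoptions('ga', ...
    'UseParallel',true, ...
    'UseVectorized',false, ...
    'Display','off');

rng default   % for repeatable setup on the client; parallel runs can still differ
[x,fval] = ga(fitness,2,[],[],[],[],[-5 -5],[5 5],[],options);
```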
Note
Even when running in parallel, ga occasionally calls the fitness and nonlinear constraint functions serially on the host machine. Therefore, ensure that your functions make no assumptions about whether they are evaluated in serial or parallel.
Implementation Issues in Parallel Genetic Algorithm. The limitations on options, listed in Genetic Algorithm, arise partly from limitations of parfor, and partly from the nature of parallel processing:
UseVectorized is false: UseVectorized determines whether ga evaluates an entire population with one function call in a vectorized fashion. If UseVectorized is true, ga does not distribute the evaluation of the function, so does not use parfor.
ga can have a hybrid function that runs after it finishes; see Hybrid Scheme in the Genetic Algorithm. If you want the hybrid function to take advantage of parallel computation, set its options separately so that UseParallel is true.
If the hybrid function is patternsearch, set UseCompletePoll to true so that patternsearch runs in parallel.
If the hybrid function is fmincon, set the following options with optimoptions to have parallel gradient estimation:
GradObj must not be 'on'; it can be 'off' or [].
Or, if there is a nonlinear constraint function, GradConstr must not be 'on'; it can be 'off' or [].
To find out how to write options for the hybrid function, see Parallel Hybrid Functions.
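As a sketch of the fmincon case, the hybrid options are created separately and passed in a cell array. The fitness function and bounds are assumptions; the example uses SpecifyObjectiveGradient, the current name of the legacy GradObj option.

```matlab
% Hypothetical fitness function with its minimum at (1,-2).
fitness = @(x) (x(1) - 1)^2 + (x(2) + 2)^2;

% Options for the hybrid solver: parallel finite-difference gradients,
% with no user-supplied objective gradient.
hybridopts = optimoptions('fmincon', ...
    'UseParallel',true, ...
    'SpecifyObjectiveGradient',false, ...
    'Display','off');

% Attach the hybrid function and its options to the ga options.
options = optimoptions('ga', ...
    'HybridFcn',{@fmincon,hybridopts}, ...
    'Display','off');

rng default
[x,fval] = ga(fitness,2,[],[],[],[],[-5 -5],[5 5],[],options);
```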
Parallel computing with gamultiobj works almost the same as with ga. For detailed information, see Genetic Algorithm.
The difference between parallel computing with gamultiobj and ga has to do with the hybrid function. gamultiobj allows only one hybrid function, fgoalattain. This function optionally runs after gamultiobj finishes its run. Each individual in the calculated Pareto frontier, that is, the final population found by gamultiobj, becomes the starting point for an optimization using fgoalattain. These optimizations run in parallel. The number of processors performing these optimizations is the smaller of the number of individuals and the size of your parpool.
For fgoalattain to run in parallel, set its options correctly:
fgoalopts = optimoptions(@fgoalattain,'UseParallel',true);
gaoptions = optimoptions('ga','HybridFcn',{@fgoalattain,fgoalopts});
Run gamultiobj with gaoptions, and fgoalattain runs in parallel. For more information about setting the hybrid function, see Hybrid Function Options.
gamultiobj calls fgoalattain using a parfor loop, so fgoalattain does not estimate gradients in parallel when used as a hybrid function with gamultiobj. For more information, see No Nested parfor Loops.
particleswarm can automatically distribute the evaluation of the objective function associated with a population to multiple processors. particleswarm uses parallel computing when you:
Have a license for Parallel Computing Toolbox software.
Enable parallel computing with parpool, a Parallel Computing Toolbox function.
Set the following options using optimoptions: UseVectorized is false (default), and UseParallel is true.
When these conditions hold, particleswarm computes the objective function of the particles in a population in parallel.
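A minimal sketch of a parallel particleswarm call, with an objective and bounds that are assumptions for illustration:

```matlab
% Hypothetical objective with its minimum at (1,-2).
fun = @(x) (x(1) - 1)^2 + (x(2) + 2)^2;

options = optimoptions('particleswarm', ...
    'UseParallel',true, ...
    'UseVectorized',false, ...
    'Display','off');

rng default
[x,fval] = particleswarm(fun,2,[-5 -5],[5 5],options);
```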
Note
Even when running in parallel, particleswarm occasionally calls the objective function serially on the host machine. Therefore, ensure that your objective function makes no assumptions about whether it is evaluated in serial or parallel.
Implementation Issues in Parallel Particle Swarm Optimization. The limitations on options, listed in Particle Swarm, arise partly from limitations of parfor, and partly from the nature of parallel processing:
UseVectorized is false: UseVectorized determines whether particleswarm evaluates an entire population with one function call in a vectorized fashion. If UseVectorized is true, particleswarm does not distribute the evaluation of the function, so does not use parfor.
particleswarm can have a hybrid function that runs after it finishes; see Hybrid Scheme in the Genetic Algorithm. If you want the hybrid function to take advantage of parallel computation, set its options separately so that UseParallel is true.
If the hybrid function is patternsearch, set UseCompletePoll to true so that patternsearch runs in parallel.
If the hybrid function is fmincon, set the GradObj option to 'off' or [] with optimoptions to have parallel gradient estimation.
To find out how to write options for the hybrid function, see Parallel Hybrid Functions.
simulannealbnd does not run in parallel automatically. However, it can call hybrid functions that take advantage of parallel computing.
To find out how to write options for the hybrid function, see Parallel Hybrid Functions.
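For example, a parallel patternsearch hybrid function could be attached to simulannealbnd roughly as follows. The objective, start point, and bounds are assumptions for illustration.

```matlab
% Hypothetical objective with its minimum at (1,-2).
fun = @(x) (x(1) - 1)^2 + (x(2) + 2)^2;

% Options for the hybrid solver, which does the parallel work.
hybridopts = optimoptions('patternsearch', ...
    'UseParallel',true, ...
    'UseCompletePoll',true, ...
    'Display','off');

% simulannealbnd itself runs serially, then calls the hybrid function.
options = optimoptions('simulannealbnd', ...
    'HybridFcn',{@patternsearch,hybridopts}, ...
    'Display','off');

rng default
[x,fval] = simulannealbnd(fun,[0 0],[-5 -5],[5 5],options);
```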
paretosearch can automatically distribute the evaluation of the objective function associated with a population to multiple processors. paretosearch uses parallel computing when you:
Have a license for Parallel Computing Toolbox software.
Enable parallel computing with parpool, a Parallel Computing Toolbox function.
Set the following option using optimoptions: UseParallel is true.
When these conditions hold, paretosearch computes the objective function of the points in a population in parallel.
Note
Even when running in parallel, paretosearch occasionally calls the objective function serially on the host machine. Therefore, ensure that your objective function makes no assumptions about whether it is evaluated in serial or parallel.
For algorithmic details, see Modifications for Parallel Computation and Vectorized Function Evaluation.
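A brief sketch of a parallel paretosearch call, using a hypothetical two-objective function and bounds chosen only for illustration:

```matlab
% Hypothetical objectives: squared distances to the points (1,0) and (-1,0).
fun = @(x) [norm(x - [1 0])^2, norm(x + [1 0])^2];

options = optimoptions('paretosearch', ...
    'UseParallel',true, ...
    'Display','off');

rng default
[x,fval] = paretosearch(fun,2,[],[],[],[],[-2 -2],[2 2],[],options);

% Each row of fval holds the two objective values of one Pareto point.
disp(size(fval))
```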
surrogateopt can automatically distribute the evaluation of the objective function to multiple processors. surrogateopt uses parallel computing when you:
Have a license for Parallel Computing Toolbox software.
Enable parallel computing with parpool, a Parallel Computing Toolbox function.
Set the following option using optimoptions: UseParallel is true.
When these conditions hold, surrogateopt computes the objective function at multiple points in parallel.
Note
Even when running in parallel, surrogateopt occasionally calls the objective function serially on the host machine. Therefore, ensure that your objective function makes no assumptions about whether it is evaluated in serial or parallel.
For algorithmic details, see Parallel surrogateopt Algorithm.
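A minimal sketch of a parallel surrogateopt call. The objective, bounds, and evaluation budget are assumptions for illustration; the plot function is turned off so the sketch runs without opening a figure.

```matlab
% Hypothetical objective with its minimum at (1,-2).
fun = @(x) (x(1) - 1)^2 + (x(2) + 2)^2;

options = optimoptions('surrogateopt', ...
    'UseParallel',true, ...
    'MaxFunctionEvaluations',100, ...
    'PlotFcn',[], ...
    'Display','off');

rng default
[x,fval] = surrogateopt(fun,[-5 -5],[5 5],options);
```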