Distributed Arrays

Analyze big data sets in parallel using distributed arrays and simultaneous execution

Parallel Computing Toolbox™ supports distributed arrays, which partition large arrays across multiple MATLAB® workers. You operate on the entire array as a single entity, while each worker operates only on its part of the array and automatically exchanges data with other workers when necessary. The single program multiple data (spmd) language construct supports simultaneous execution and lets workers communicate with each other. Use distributed-enabled matrix operations and functions to work directly with these arrays without further modification. With distributed arrays, you can run big data applications using the combined memory of your cluster.
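
As a rough illustration of that workflow, the sketch below distributes a small array, applies a distributed-enabled function to it, and gathers the result back to the client. It assumes Parallel Computing Toolbox is installed and that a default cluster profile is available. The variable names (A, D, colSums, result) are illustrative only, and a real use case would involve an array too large for one machine.

```matlab
% Minimal sketch, assuming a default cluster profile is configured.
% Start a parallel pool only if one is not already running.
if isempty(gcp('nocreate'))
    parpool;
end

A = magic(8);               % ordinary array in the client workspace
D = distributed(A);         % partition the array across the pool workers

% sum is distributed-enabled: each worker processes its own part and
% the result is itself a distributed array.
colSums = sum(D, 1);

% Transfer the result back to the client workspace.
result = gather(colSums);
```

The client code treats D like an ordinary array. Only the calls to distributed and gather mark where data moves to and from the workers.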

Functions

distributed - Create distributed array from data in the client workspace or a datastore
gather - Transfer distributed array or gpuArray to local workspace
spmd - Execute code in parallel on workers of parallel pool
Composite - Create Composite object
parallel.pool.Constant - Build parallel.pool.Constant from data or function handle
codistributed - Create codistributed array from replicated local data
parpool - Create parallel pool on cluster
delete (Pool) - Shut down parallel pool
redistribute - Redistribute codistributed array with another distribution scheme
codistributed.build - Create codistributed array from distributed data
for - for-loop over distributed range
getLocalPart - Local portion of codistributed array
globalIndices - Global indices for local part of codistributed array
gop - Global operation across all workers
write - Write distributed data to an output location
pagefun - Apply function to each page of distributed array or gpuArray

Classes

distributed - Access elements of distributed arrays from client
codistributed - Access elements of arrays distributed among workers in parallel pool
Composite - Access nondistributed variables on multiple workers from client
codistributor1d - 1-D distribution scheme for codistributed array
codistributor2dbc - 2-D block-cyclic distribution scheme for codistributed array
parallel.Pool - Parallel pool of workers

Examples and How To

Create and Use Distributed Arrays

When your data array is too big to fit into the memory of a single machine, you can create a distributed array.

Run MATLAB Functions with Distributed Arrays

Lists the MATLAB functions that operate on distributed arrays.

Distributing Arrays to Parallel Workers

Use datastore or distributed to create distributed arrays and partition the data among your workers.

Run Single Programs on Multiple Data Sets

Use spmd statements to run the same code on multiple data sets and control codistributed arrays.
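
A minimal sketch of this pattern follows, assuming a parallel pool can be started from the default profile. Inside the spmd block, every worker runs the same statements on its own portion of a codistributed array, and gplus (one of the functions built on gop) combines the per-worker results. The variable names are illustrative only.

```matlab
% Minimal sketch, assuming a default cluster profile is configured.
if isempty(gcp('nocreate'))
    parpool;
end

spmd
    % Every worker builds the same replicated array, then keeps only
    % its own part under the default distribution scheme.
    C = codistributed(magic(8));

    % Operate on the locally stored portion only.
    localPart  = getLocalPart(C);
    localTotal = sum(localPart(:));

    % Global sum across workers; every worker receives the result.
    total = gplus(localTotal);
end

% On the client, total is a Composite; read the value held by worker 1.
totalOnClient = total{1};
```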

Access Worker Variables with Composites

Composite objects in the MATLAB client session let you directly access data values on the workers.
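
The following sketch shows one way this can look in practice, assuming a parallel pool can be started from the default profile. A variable assigned inside an spmd block appears on the client as a Composite, and indexing it with braces reads the value held by a particular worker. The names squared, firstValue, and updatedFirst are illustrative only.

```matlab
% Minimal sketch, assuming a default cluster profile is configured.
if isempty(gcp('nocreate'))
    parpool;
end

spmd
    % labindex is the index of the worker running this block.
    squared = labindex^2;
end

% On the client, squared is a Composite with one entry per worker.
firstValue = squared{1};     % value stored on worker 1

spmd
    % The worker-side value persists between spmd blocks while the
    % Composite still exists on the client.
    squared = squared + 1;
end

updatedFirst = squared{1};   % updated value from worker 1
```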

Train Network in Parallel with Custom Training Loop

This example shows how to set up a custom training loop to train a network in parallel.

Using GOP to Achieve MPI_Allreduce Functionality

In this example, we look at the gop function and the functions that build on it: gplus and gcat.

Numerical Estimation of Pi Using Message Passing

This example shows the basics of working with spmd statements, and how they provide an interactive means of performing parallel computations.

Choose Between spmd, parfor, and parfeval

Compare and contrast spmd against other parallel computing functionality such as parfor and parfeval.

Concepts

Run Code on Parallel Pools

Learn about starting and stopping parallel pools, pool size, and cluster selection.

Specify Your Parallel Preferences

Specify your preferences, and automatically create a parallel pool.

Nondistributed Versus Distributed Arrays

Describes the various types of arrays used in communicating jobs.

Working with Codistributed Arrays

Describes how to use codistributed arrays for calculation.

Looping Over a Distributed Range (for-drange)

Describes how to program a for-loop with codistributed arrays.

Work with Remote Data

Work with remote data in Amazon S3™, Microsoft® Azure® Storage Blob, or HDFS™.

Featured Examples