Deep learning transposed convolution
The transposed convolution operation upsamples feature maps.
Note
This function applies the deep learning transposed convolution operation to dlarray data. If you want to apply transposed convolution within a layerGraph object or Layer array, use a transposed convolution layer instead, such as transposedConv2dLayer or transposedConv3dLayer.
dlY = dltranspconv(dlX,weights,bias) computes the deep learning transposed convolution of the input dlX using the filters defined by weights, and adds a constant bias. The input dlX is a formatted dlarray with dimension labels. Transposed convolution acts on dimensions that you specify as 'S' and 'C' dimensions. The output dlY is a formatted dlarray with the same dimension labels as dlX.
dlY = dltranspconv(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments in previous syntaxes. For example, 'Stride',3 sets the stride of the convolution operation.
Convolve an image and then use transposed convolution to resize the convolved image to the same size as the original image.
Import the image data and convert it to a dlarray.
X = imread('sherlock.jpg');
dlX = dlarray(single(X),'SSC');
Display the image.
imshow(X)
Initialize the convolutional filters and bias term. Specify an ungrouped convolution that applies a single filter to all three channels of the input data.
filterHeight = 10;
filterWidth = 10;
numChannelsPerGroup = 3;
numFiltersPerGroup = 1;
numGroups = 1;
weights = rand(filterHeight,filterWidth,numChannelsPerGroup,numFiltersPerGroup,numGroups);
bias = rand(numFiltersPerGroup*numGroups,1);
Perform the convolution. Use a 'Stride' value of 2 and a 'DilationFactor' value of 3.
dlY = dlconv(dlX,weights,bias,'Stride',2,'DilationFactor',3);
Display the convolved image.
Y = extractdata(dlY);
imshow(rescale(Y))
Initialize the transposed convolutional filters and bias. Specify an ungrouped transposed convolution that applies three filters to the input. Use the same filter height and filter width as for the convolution operation.
numChannelsPerGroupTC = 1;
numFiltersPerGroupTC = 3;
weightsTC = rand(filterHeight,filterWidth,numFiltersPerGroupTC,numChannelsPerGroupTC,numGroups);
biasTC = rand(numFiltersPerGroupTC*numGroups,1);
Perform the transposed convolution. Use the same stride and dilation factor as for the convolution operation.
dlZ = dltranspconv(dlY,weightsTC,biasTC,'Stride',2,'DilationFactor',3);
Display the image after the transposed convolution.
Z = extractdata(dlZ);
imshow(rescale(Z))
Compare the size of the original image, the convolved image, and the image after the transposed convolution.
sizeX = size(X)
sizeX = 1×3
640 960 3
sizeY = size(Y)
sizeY = 1×2
307 467
sizeZ = size(Z)
sizeZ = 1×3
640 960 3
The transposed convolution upsamples the convolved data to the size of the original input data.
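The sizes reported above can be reproduced by hand. The following is a minimal sketch that assumes the usual output-size rules for an unpadded convolution and an uncropped transposed convolution. These rules are not stated in this example, but they agree with the sizes it reports. The variable names are illustrative only.

```matlab
% Values taken from the example above.
inputSize = [640 960];
filterSize = 10;
stride = 2;
dilationFactor = 3;

% Effective filter size after dilation.
effectiveFilterSize = filterSize + (filterSize-1)*(dilationFactor-1);

% Assumed rule for a convolution without padding.
convSize = floor((inputSize - effectiveFilterSize)/stride) + 1

% Assumed rule for a transposed convolution without cropping.
transpSize = (convSize - 1)*stride + effectiveFilterSize
```

Under these assumptions the round trip recovers 640-by-960 exactly because inputSize minus the effective filter size is divisible by the stride. An input height of 641 would also convolve to 307 rows, so it would come back as 640 rows rather than 641.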
Apply transposed convolution to the input data in three groups of two channels each. Apply four filters per group.
Create the input data as ten observations of size 100-by-100 with six channels.
height = 100;
width = 100;
channels = 6;
numObservations = 10;
X = rand(height,width,channels,numObservations);
dlX = dlarray(X,'SSCB');
Initialize the filters for the transposed convolution operation. Specify three groups of transposed convolutions that each apply four filters to two channels of the input data.
filterHeight = 8;
filterWidth = 8;
numChannelsPerGroup = 2;
numFiltersPerGroup = 4;
numGroups = 3;
weights = rand(filterHeight,filterWidth,numFiltersPerGroup,numChannelsPerGroup,numGroups);
Initialize the bias term.
bias = rand(numFiltersPerGroup*numGroups,1);
Perform the transposed convolution.
dlY = dltranspconv(dlX,weights,bias);
size(dlY)
ans = 1×4
107 107 12 10
dims(dlY)
ans = 'SSCB'
The 12 channels of the transposed convolution output represent the three groups of transposed convolutions with four filters per group.
dlX — Input data
dlarray | numeric array
Input data, specified as a dlarray with or without dimension labels or a numeric array. When dlX is not a formatted dlarray, you must specify the dimension label format using 'DataFormat',FMT. If dlX is a numeric array, at least one of weights or bias must be a dlarray.
Convolution acts on dimensions that you specify as spatial dimensions using the 'S' dimension label. You can specify up to three dimensions in dlX as 'S' dimensions.
Data Types: single | double
weights — Filters
dlarray | numeric array
Filters, specified as a dlarray with or without labels or a numeric array. The weights argument specifies the size and values of the filters, as well as the number of filters and the number of groups for grouped transposed convolutions.
Specify weights as a filterSize-by-numFiltersPerGroup-by-numChannelsPerGroup-by-numGroups array.
filterSize — Size of the convolutional filters. filterSize can have up to three dimensions, depending on the number of spatial dimensions in the input data.
Input Data 'S' Dimensions | filterSize |
---|---|
1-D | h, where h corresponds to the height of the filter |
2-D | h-by-w, where h and w correspond to the height and width of the filter, respectively |
3-D | h-by-w-by-d, where h, w, and d correspond to the height, width, and depth of the filter, respectively |
numFiltersPerGroup — Number of filters to apply within each group.
numChannelsPerGroup — Number of channels within each group for grouped transposed convolutions. numChannelsPerGroup must equal the number of channels in the input data divided by numGroups, the number of groups. For ungrouped convolutions, where numGroups = 1, numChannelsPerGroup must equal the number of channels in the input data.
numGroups — Number of groups (optional). When numGroups > 1, the function performs grouped transposed convolutions. When numGroups = 1, the function performs ungrouped transposed convolutions; in this case, this dimension is singleton and can be omitted.
If weights is a formatted dlarray, it can have multiple spatial dimensions labeled 'S', one channel dimension labeled 'C', and up to two other dimensions labeled 'U'. The number of 'S' dimensions must match the number of 'S' dimensions of the input data. The labeled dimensions correspond to the filter specifications as follows.
Filter Specification | Dimension Labels |
---|---|
filterSize | Up to three 'S' dimensions |
numFiltersPerGroup | 'C' dimension |
numChannelsPerGroup | First 'U' dimension |
numGroups (optional) | Second 'U' dimension |
Data Types: single | double
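The layout of the weights array described above can be checked before calling the function. This is a small sketch using plain MATLAB arrays; the sizes are hypothetical and chosen only to illustrate how the dimensions relate to the number of input and output channels.

```matlab
% Hypothetical sizes for illustration.
inputChannels = 6;
numGroups = 3;
numFiltersPerGroup = 4;
filterSize = [8 8];

% numChannelsPerGroup must be the input channel count divided by numGroups.
assert(mod(inputChannels,numGroups) == 0)
numChannelsPerGroup = inputChannels/numGroups;

% filterSize-by-numFiltersPerGroup-by-numChannelsPerGroup-by-numGroups
weights = rand([filterSize numFiltersPerGroup numChannelsPerGroup numGroups]);
size(weights)

% Number of output channels implied by the weights.
outputChannels = size(weights,3)*size(weights,5)

% Ungrouped case: the trailing numGroups dimension can be omitted,
% because MATLAB treats it as a singleton dimension.
weightsUngrouped = rand([filterSize numFiltersPerGroup inputChannels]);
size(weightsUngrouped,5)
```

Note that the third dimension holds the number of filters and the fourth holds the number of channels, which is the reverse of the order used for the dlconv weights in the example above.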
bias — Bias constant
dlarray vector | dlarray scalar | numeric vector | numeric scalar
Bias constant, specified as a dlarray vector or dlarray scalar with or without labels, a numeric vector, or a numeric scalar.
If bias is a scalar or has only singleton dimensions, the same bias is applied to each entry of the output.
If bias has a nonsingleton dimension, each element of bias is the bias applied to the corresponding convolutional filter specified by weights. The number of elements of bias must match the number of filters specified by weights.
If bias is a formatted dlarray, the nonsingleton dimension must be a channel dimension labeled 'C'.
Data Types: single | double
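One way to see the difference between a vector bias and a scalar bias is to use all-zero filters, so that the output consists of the bias alone. The sketch below assumes Deep Learning Toolbox is available; the sizes and values are illustrative.

```matlab
% Two input channels, two filters, all filter weights zero.
dlX = dlarray(rand(5,5,2),'SSC');
weights = zeros(3,3,2,2);

% Vector bias: one element per filter.
biasVector = [1; 2];
dlY = dltranspconv(dlX,weights,biasVector);
Y = extractdata(dlY);

% Each column of the reshaped data is one output channel.
channelValues = unique(reshape(Y,[],2),'rows')

% Scalar bias: the same value is added to every output entry.
dlYScalar = dltranspconv(dlX,weights,0.5);
scalarValues = unique(extractdata(dlYScalar))
```

With zero weights, every entry of the first output channel should equal 1, every entry of the second should equal 2, and the scalar case should contain only 0.5.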
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'Stride',2 sets the stride of each filter to 2.
'DataFormat' — Dimension order of unformatted data
Dimension order of unformatted input data, specified as the comma-separated pair consisting of 'DataFormat' and a character array or string FMT that provides a label for each dimension of the data. Each character in FMT must be one of the following:
'S' — Spatial
'C' — Channel
'B' — Batch (for example, samples and observations)
'T' — Time (for example, sequences)
'U' — Unspecified
You can specify multiple dimensions labeled 'S' or 'U'. You can use the labels 'C', 'B', and 'T' at most once.
You must specify 'DataFormat' when the input data dlX is not a formatted dlarray.
Example: 'DataFormat','SSCB'
Data Types: char | string
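As an illustration of the 'DataFormat' argument, the following sketch passes an unformatted dlarray and supplies the dimension labels in the call. It assumes Deep Learning Toolbox is available, and the sizes are hypothetical.

```matlab
% Unformatted input: 28-by-28 spatial, 3 channels, 4 observations.
X = rand(28,28,3,4);
dlX = dlarray(X);              % no dimension labels

% 4-by-4 filters, 5 filters, 3 input channels.
weights = rand(4,4,5,3);
bias = zeros(5,1);

% The labels are supplied here because dlX carries none.
dlY = dltranspconv(dlX,weights,bias,'DataFormat','SSCB');
size(dlY)
```

Because the input is unformatted, the output is also unformatted and keeps the same dimension order as the input.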
'Stride' — Step size for traversing input data
1 (default) | numeric scalar | numeric vector
Step size for traversing the input data, specified as the comma-separated pair consisting of 'Stride' and a numeric scalar or numeric vector. If you specify 'Stride' as a scalar, the same value is used for all spatial dimensions. If you specify 'Stride' as a vector of the same size as the number of spatial dimensions of the input data, the vector values are used for the corresponding spatial dimensions.
The default value of 'Stride' is 1.
Example: 'Stride',3
Data Types: single | double
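The following sketch applies a different stride to each spatial dimension. It assumes Deep Learning Toolbox is available. The expected size is computed with the rule (inputSize-1)*stride + filterSize for an uncropped, undilated transposed convolution, which is an assumption that matches the sizes in the examples above rather than a statement from this page.

```matlab
% 10-by-10 input with 3 channels.
dlX = dlarray(rand(10,10,3),'SSC');

% 4-by-4 filters, 2 filters, 3 input channels.
weights = rand(4,4,2,3);
bias = zeros(2,1);

% Stride of 2 along the first spatial dimension and 3 along the second.
stride = [2 3];
dlY = dltranspconv(dlX,weights,bias,'Stride',stride);
actualSize = size(dlY)

% Assumed size rule, for comparison.
expectedSpatialSize = ([10 10] - 1).*stride + [4 4]
```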
'DilationFactor' — Filter dilation factor
1 (default) | numeric scalar | numeric vector
Filter dilation factor, specified as the comma-separated pair consisting of 'DilationFactor' and one of the following.
Numeric scalar — The same dilation factor value is applied for all spatial dimensions.
Numeric vector — A different dilation factor value is applied along each spatial dimension. Use a vector of size d, where d is the number of spatial dimensions of the input data. The ith element of the vector specifies the dilation factor applied to the ith spatial dimension.
Use the dilation factor to increase the receptive field of the filter (the area of the input that the filter can see) on the input data. Using a dilation factor corresponds to an effective filter size of filterSize + (filterSize-1)*(dilationFactor-1).
Example: 'DilationFactor',2
Data Types: single | double
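The effective filter size formula above can be evaluated directly. This short sketch uses a hypothetical filter size of 10 and sweeps a few dilation factors.

```matlab
filterSize = 10;
dilationFactor = 1:3;

% Effective filter size for each dilation factor.
effectiveFilterSize = filterSize + (filterSize-1).*(dilationFactor-1)
% effectiveFilterSize = 10 19 28
```

A dilation factor of 1 leaves the filter unchanged, and each further increment adds filterSize-1 elements to the effective size.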
'Cropping' — Cropping applied to edges of data
'same' | numeric scalar | numeric vector | numeric matrix
Cropping applied to edges of data, specified as the comma-separated pair consisting of 'Cropping' and one of the following.
'same' — Cropping is set so that the output size is the same as the input size when the stride is 1. More generally, the output size of each spatial dimension is inputSize*stride, where inputSize is the size of the input along a spatial dimension.
Numeric scalar — The same cropping value is applied to both ends of all spatial dimensions.
Numeric vector — A different cropping value is applied along each spatial dimension. Use a vector of size d, where d is the number of spatial dimensions of the input data. The ith element of the vector specifies the cropping applied to the start and the end along the ith spatial dimension.
Numeric matrix — A different cropping value is applied to the start and end of each spatial dimension. Use a matrix of size 2-by-d, where d is the number of spatial dimensions of the input data. The element (1,i) specifies the cropping applied to the start of spatial dimension i. The element (2,i) specifies the cropping applied to the end of spatial dimension i. For example, in 2-D the format is [top, left; bottom, right].
Example: 'Cropping','same'
Data Types: single | double
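The sketch below tries two of the cropping options on a small input. It assumes Deep Learning Toolbox is available, and the sizes are hypothetical. For 'same', the description above gives an output size of inputSize*stride. For the matrix form, the code only displays the resulting size without predicting it.

```matlab
% 16-by-16 input with 3 channels.
dlX = dlarray(rand(16,16,3),'SSC');

% 4-by-4 filters, 2 filters, 3 input channels.
weights = rand(4,4,2,3);
bias = zeros(2,1);

% 'same' cropping: each spatial size becomes inputSize*stride.
dlYSame = dltranspconv(dlX,weights,bias,'Stride',2,'Cropping','same');
sizeSame = size(dlYSame)

% Matrix cropping in the form [top, left; bottom, right].
dlYCrop = dltranspconv(dlX,weights,bias,'Stride',2,'Cropping',[1 2; 3 4]);
sizeCrop = size(dlYCrop)
```

With a stride of 2, the 'same' result should be 32-by-32 with 2 channels. The matrix form removes a different number of rows and columns from each edge, so its two spatial sizes generally differ.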
dlY — Feature map
dlarray
Feature map, returned as a dlarray. The output dlY has the same underlying data type as the input dlX.
If the input data dlX is a formatted dlarray, dlY has the same dimension labels as dlX. If the input data is not a formatted dlarray, dlY is an unformatted dlarray or numeric array with the same dimension order as the input data.
The size of the 'C' channel dimension of dlY depends on the size of the weights input. The size of the 'C' dimension of the output dlY is the product of the size of the dimensions numFiltersPerGroup and numGroups in the weights argument. If weights is a formatted dlarray, this product is the same as the product of the size of the 'C' dimension and the second 'U' dimension.
Usage notes and limitations:
When at least one of the following input arguments is a gpuArray or a dlarray with underlying data of type gpuArray, this function runs on the GPU.
dlX
weights
bias
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
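As a sketch of the GPU behavior described above, the following code moves the input data to the GPU only when one is available, then checks where the result is stored. It assumes Deep Learning Toolbox is available and, for the GPU branch, Parallel Computing Toolbox with a supported device. The sizes are hypothetical.

```matlab
X = rand(32,32,3,8,'single');
weights = rand(4,4,6,3,'single');   % 4-by-4 filters, 6 filters, 3 channels
bias = zeros(6,1,'single');

% Move the data to the GPU only if a supported GPU is available.
if canUseGPU
    X = gpuArray(X);
end

dlX = dlarray(X,'SSCB');
dlY = dltranspconv(dlX,weights,bias);

% True when the underlying data of the output is stored on the GPU.
onGPU = isa(extractdata(dlY),'gpuArray')
size(dlY)
```

Only the input data is moved here; according to the note above, having any one of dlX, weights, or bias on the GPU is enough for the function to run there.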