Optimized GPU implementation of functions containing matrix-matrix operations
C = gpucoder.matrixMatrixKernel(FUN,A,B) generates kernels from functions that contain GEMM-like operations. For example, you can match feature points between two images by using:

The sum of absolute differences (SAD): F() = @(a,b)abs(a-b)
The sum of squared differences (SSD): F() = @(a,b)(a-b).*(a-b)
FUN is a handle to a user-defined function. It takes one row from matrix A and one column from matrix B, and outputs a vector with the same type as the inputs. The output vector is then summed to compute a single scalar value in C. Numeric inputs A and B must be either of the same size or have sizes that are compatible. For example, if A is an M-by-K matrix and B is a K-by-N matrix, then C is an M-by-N matrix.
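As a sketch of how FUN pairs with these sizes, the following entry-point function (the function name is illustrative, not from the source) uses the SAD measure to build an M-by-N dissimilarity matrix from an M-by-K matrix A and a K-by-N matrix B:

```matlab
function C = featureMatchSAD(A, B) %#codegen
    % Each element C(i,j) is the summed result of abs(a - b) applied
    % elementwise to row i of A and column j of B, that is, the SAD
    % between that row/column pair.
    C = gpucoder.matrixMatrixKernel(@(a,b) abs(a - b), A, B);
end
```

When you generate CUDA code for this function with codegen and a GPU configuration object, GPU Coder maps the per-element function and the reduction onto a GPU kernel; run directly in MATLAB, the call simply computes C on the CPU.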
C = gpucoder.matrixMatrixKernel(FUN,A,B,orientation) has the optional argument orientation that specifies the orientation of the A and B matrices. It can take one of four possible values:

'nn' - Matrices A and B are normal.
'nt' - Matrix B is transposed.
'tn' - Matrix A is transposed.
'tt' - Both matrices A and B are transposed.
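For example (an illustrative sketch, with hypothetical function names), when both inputs store one feature vector per row, A as M-by-K and B as N-by-K, passing 'nt' treats B as transposed so that the effective shapes still conform:

```matlab
function C = featureMatchSSD(A, B) %#codegen
    % A is M-by-K and B is N-by-K. With 'nt', B is treated as
    % transposed (effectively K-by-N), so C is M-by-N and C(i,j)
    % is the SSD between A(i,:) and B(j,:).
    C = gpucoder.matrixMatrixKernel(@(a,b) (a - b).*(a - b), A, B, 'nt');
end
```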
See Also
codegen | coder.gpu.constantMemory | coder.gpu.kernel | coder.gpu.kernelfun | coder.gpu.nokernel | gpucoder.batchedMatrixMultiply | gpucoder.batchedMatrixMultiplyAdd | gpucoder.stencilKernel | gpucoder.stridedMatrixMultiply | gpucoder.stridedMatrixMultiplyAdd