Statistics and Machine Learning Toolbox™ supports the following data types for input arguments:
Numeric scalars, vectors, matrices, or arrays having single- or double-precision entries. These data forms have data type single or double. Examples include response variables, predictor variables, and numeric values.
Cell arrays of character vectors; character, string, logical, or categorical arrays; or numeric vectors for categorical variables representing grouping data. These data forms have data types cell (specifically cellstr), char, string, logical, categorical, and single or double, respectively. An example is an array of class labels in machine learning.
You can also use nominal or ordinal arrays for categorical data. However, the nominal and ordinal data types are not recommended. To work with nominal or ordinal categorical data, use the categorical data type instead.
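As a minimal sketch of the grouping-data forms described above, the same class labels can be stored in several types. The label values and variable names are hypothetical, and findgroups (a base MATLAB function, not a toolbox estimation function) is used only to show that each form encodes the same groups. The last lines show one way to represent ordered categories with the categorical data type rather than ordinal.

```matlab
% Hypothetical class labels stored in several supported grouping types.
labelsCell   = {'setosa'; 'versicolor'; 'setosa'; 'virginica'};  % cellstr
labelsString = string(labelsCell);                               % string array
labelsCat    = categorical(labelsCell);                          % categorical array
labelsNum    = [1; 2; 1; 3];                                     % numeric vector

% findgroups assigns an integer group index to each observation.
gCell   = findgroups(labelsCell);
gString = findgroups(labelsString);
gCat    = findgroups(labelsCat);
gNum    = findgroups(labelsNum);

% All four encodings describe the same grouping.
sameGroups = isequal(gCell, gString, gCat, gNum);

% Ordered categories using categorical with the 'Ordinal' option
% (illustrative category names).
sizes = categorical({'small'; 'large'; 'medium'}, ...
    {'small', 'medium', 'large'}, 'Ordinal', true);
isOrdered = isordinal(sizes);
```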
You can use signed or unsigned integers, e.g., int8 or uint8. However:
Estimation functions might not support signed or unsigned integer data types for nongrouping data.
If you recast a single or double numeric vector containing NaN values to a signed or unsigned integer, then the software converts the NaN elements to 0.
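The NaN conversion described above can be seen in a short example. This is a sketch with made-up values; checking for NaN before recasting is a suggested precaution, not a toolbox requirement.

```matlab
% Hypothetical double vector with a missing value.
x = [1 NaN 3];

% Recasting to an integer type silently replaces NaN with 0.
y = int8(x);

% One possible guard: detect missing values before recasting,
% because the information is lost afterward.
hasMissing = any(isnan(x));
if hasMissing
    warning('x contains NaN values; recasting to an integer type maps them to 0.');
end
```

After the conversion, y contains no marker for the missing observation, so the zero is indistinguishable from a genuine zero value.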
Some functions support tabular arrays for heterogeneous data (for details, see Tables). The table data type contains variables of any of the data types previously listed. An example is mixed categorical and numerical predictor data for regression analysis.
For some functions, you can also use dataset arrays for heterogeneous data. However, the dataset data type is not recommended. To work with heterogeneous data, use the table data type if the estimation function supports it.
Functions that do not support the table data type support sample data of type single or double, e.g., matrices.
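The following sketch illustrates both cases: a table holding mixed categorical and numeric predictors, and a numeric matrix extracted from it for functions that do not accept tables. The data values and variable names are invented for illustration, and fitlm is used as one example of a function that accepts a table, assuming the toolbox is installed.

```matlab
% Hypothetical heterogeneous data: numeric and categorical variables.
Age           = [38; 43; 38; 40; 49];
Smoker        = categorical({'yes'; 'no'; 'no'; 'no'; 'yes'});
BloodPressure = [124; 109; 125; 117; 122];
tbl = table(Age, Smoker, BloodPressure);

% A function that accepts tables can use the variables by name.
mdl = fitlm(tbl, 'BloodPressure ~ Age + Smoker');

% For a function that accepts only numeric matrices, extract the
% numeric variables into a double matrix with brace indexing.
X = tbl{:, {'Age', 'BloodPressure'}};
```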
Some functions accept gpuArray (Parallel Computing Toolbox) input arguments so that they execute on the GPU. For the full list of Statistics and Machine Learning Toolbox functions that accept GPU arrays, see Function List (GPU Arrays).
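A minimal sketch of passing GPU data to a function follows. It uses mean from base MATLAB purely as a stand-in; whether a particular toolbox function accepts gpuArray input should be checked against the function list. The canUseGPU guard is an assumption about how you might keep the code runnable on machines without a supported GPU.

```matlab
% Hypothetical sample data.
X = randn(1000, 3);

if canUseGPU
    % Move the data to the GPU; supporting functions then run there.
    Xg = gpuArray(X);
    m  = gather(mean(Xg));   % gather copies the result back to host memory
else
    % Fall back to the CPU when no supported GPU is available.
    m = mean(X);
end
```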
Some functions accept tall array input arguments to work with large data sets. For the full list of Statistics and Machine Learning Toolbox functions that accept tall arrays, see Function List (Tall Arrays).
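As a sketch of the tall-array workflow, the example below converts a small in-memory vector to a tall array only for illustration; in practice a tall array is usually created from a datastore that points at data too large for memory. The base MATLAB function mean stands in for any function that supports tall input.

```matlab
% Hypothetical data; small enough to fit in memory, used only to
% demonstrate the calling pattern.
X = randn(1000, 1);
t = tall(X);

% Operations on tall arrays are deferred until gather is called.
mTall = mean(t);
m = gather(mTall);
```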
Some functions accept sparse matrices, i.e., matrix A such that issparse(A) returns 1. For functions that do not accept sparse matrices, recast the data to a full matrix by using full.
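The check and conversion described above can be sketched as follows; the matrix values are arbitrary.

```matlab
% Hypothetical 3-by-3 sparse matrix with entries on the diagonal.
A = sparse([1 2 3], [1 2 3], [4 5 6]);

% issparse returns logical 1 (true) for sparse storage.
tf = issparse(A);

% Recast to full storage before calling a function that does not
% accept sparse input.
if tf
    B = full(A);
else
    B = A;
end
```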
Statistics and Machine Learning Toolbox does not support the following data types:
Complex numbers.
Custom numeric data types, e.g., a variable that is both double precision and an object (an instance of a class derived from double).
Signed or unsigned numeric integers for nongrouping data, e.g., uint8 and int16.
Note
If you specify data of an unsupported type, then the software might return an error or unexpected results.
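Because unsupported input might produce unexpected results rather than an error, it can help to check nongrouping numeric data before calling a function. The check below is a hypothetical helper written for this illustration, not a toolbox function; it accepts only real, built-in single or double data.

```matlab
% Hypothetical check: true only for real single or double data that is
% not an object of a custom class.
isSupportedNumeric = @(x) isfloat(x) && isreal(x) && ~isobject(x);

okDouble  = isSupportedNumeric([1 2 3]);     % double: supported
okSingle  = isSupportedNumeric(single(1));   % single: supported
okInteger = isSupportedNumeric(int16(5));    % integer: not supported here
okComplex = isSupportedNumeric(1 + 2i);      % complex: not supported

% Integer nongrouping data can be recast before use.
xInt = int16([5 10 15]);
xDbl = double(xInt);
```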