Note
The nominal and ordinal array data types are not recommended. To represent ordered and unordered discrete, nonnumeric data, use the Categorical Arrays data type instead.
When working with categorical variables and their levels, you’ll encounter some typical challenges. This table summarizes the functions you can use with nominal or ordinal arrays to manipulate category levels. For additional functions, type methods nominal or methods ordinal at the command line, or see the nominal and ordinal reference pages.
Task | Function |
---|---|
Add new category levels | addlevels |
Drop category levels | droplevels |
Combine category levels | mergelevels |
Reorder category levels | reorderlevels |
Count the number of observations in each category | levelcounts |
Change the label or name of category levels | setlabels |
Create an interaction factor | times |
Find observations that are not in a defined category | isundefined |
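As a rough illustration of several functions from the table, the following sketch builds a small nominal array and manipulates its levels. The colors data is made up for this example and is not from the original text. getlabels does not appear in the table; it is used here only to inspect the current levels.

```matlab
% Small hypothetical data set (not from the original text).
colors = nominal({'red';'blue';'red';'green';'blue';'red'});

labels = getlabels(colors);      % current level labels
counts = levelcounts(colors);    % number of observations in each level

% Add a level that has no observations yet, then drop unused levels again.
colors = addlevels(colors, {'yellow'});
nWithYellow = numel(getlabels(colors));
colors = droplevels(colors);     % removes levels that contain no observations
nAfterDrop = numel(getlabels(colors));

% Merge two existing levels into one new level named 'cool'.
merged = mergelevels(colors, {'blue','green'}, 'cool');

% Dropping a level that is in use leaves its elements undefined.
noGreen = droplevels(colors, {'green'});
nUndefined = sum(isundefined(noGreen));
```

Note that droplevels removes levels, not elements, which is why isundefined is useful afterward for locating observations that no longer belong to any level.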
You can use nominal and ordinal arrays in a variety of statistical analyses. For example, you might want to compute descriptive statistics for data grouped by the category levels, conduct statistical tests on differences between category means, or perform regression analysis using categorical predictors.
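For instance, grouped descriptive statistics might be computed along these lines. This is a minimal sketch that assumes the fisheriris sample data set shipped with the toolbox; the variable names speciesNom and groupMeans are chosen for illustration.

```matlab
load fisheriris                      % provides meas (150-by-4) and species (150-by-1 cell array)
speciesNom = nominal(species);       % grouping variable as a nominal array

% Mean of each measurement column within each species level.
groupMeans = grpstats(meas, speciesNom);
```

Each row of groupMeans corresponds to one level of speciesNom, and each column to one column of meas.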
Statistics and Machine Learning Toolbox™ functions that accept a grouping variable as an input argument accept nominal and ordinal arrays. This includes descriptive functions such as grpstats, gscatter, and boxplot.
You can also use nominal and ordinal arrays as input arguments to analysis functions and methods based on models, such as anovan, fitlm, and fitglm.
When you use a nominal or ordinal array as a predictor in these functions, the fitting function automatically recognizes the categorical predictor, and constructs appropriate dummy indicator variables for analysis. Alternatively, you can construct your own dummy indicator variables using dummyvar.
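If you construct the indicator variables yourself, the call might look like the following sketch. It assumes the fisheriris sample data set; speciesNom and D are illustrative names.

```matlab
load fisheriris
speciesNom = nominal(species);       % three levels, one per species

% One indicator column per level; each row contains a single 1.
D = dummyvar(speciesNom);
size(D)
```

When such a matrix is used in a regression that also includes an intercept, one indicator column is typically left out to avoid collinearity.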
The levels of categorical variables are often defined as text, which can be costly to store and manipulate in a cell array of character vectors or char array. Nominal and ordinal arrays separately store category membership and category labels, greatly reducing the amount of memory required to store the variable.
For example, load some sample data:
load('fisheriris')
species is a cell array of character vectors requiring 19,300 bytes of memory.
Convert species to a nominal array:
species = nominal(species);
There is a 95% reduction in memory required to store the variable.
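One way to check the memory comparison yourself is with whos, as in this sketch. It keeps the cell array and the nominal array in separate variables (speciesNom is an illustrative name) so both sizes can be queried. Exact byte counts can differ between MATLAB releases.

```matlab
load fisheriris
cellInfo = whos('species');          % species is a cell array of character vectors here

speciesNom = nominal(species);
nomInfo = whos('speciesNom');

% Fraction of memory saved by the nominal representation.
fractionSaved = 1 - nomInfo.bytes / cellInfo.bytes;
fprintf('cell: %d bytes, nominal: %d bytes (%.0f%% smaller)\n', ...
    cellInfo.bytes, nomInfo.bytes, 100*fractionSaved);
```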