Create dummy variables
D = dummyvar(group) returns a matrix D containing zeros and ones, whose columns are dummy variables for the grouping variables in group. Each column of group is a single grouping variable, with values indicating category levels. The rows of group represent observations across all variables.
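As a minimal sketch of the syntax described above, the following uses a made-up grouping vector with three levels. The data values are illustrative assumptions, not taken from this page, and dummyvar requires Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical grouping variable with three levels (1, 2, 3), five observations.
group = [1; 2; 3; 1; 2];

% One column per level; each row contains a single 1 marking that row's level.
D = dummyvar(group);
disp(D)
```

Here D is 5-by-3, and row i has a 1 in column group(i) and zeros elsewhere.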
Use dummy variables in regression analysis and ANOVA to indicate values of categorical predictors.
dummyvar treats NaN values and undefined categorical levels in group as missing data and returns NaN values in D.
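The missing-data behavior described above can be checked with a short sketch. The input vector is a made-up example, and the variable name missingRows is introduced here for illustration only.

```matlab
% Hypothetical grouping vector whose second observation is missing.
group = [1; NaN; 2];

D = dummyvar(group);

% Flag the rows of D that contain NaN, that is, observations with a missing level.
missingRows = any(isnan(D), 2);
disp(missingRows)
```

Flagging these rows before fitting makes it easier to decide whether to drop or impute the affected observations.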
If a column of ones is introduced in the matrix D, then the resulting matrix X = [ones(size(D,1),1) D] is rank deficient. If group has multiple columns, then the matrix D itself is rank deficient because dummy variables produced from any column of group always sum to a column of ones. Regression and ANOVA calculations often address this issue by eliminating one dummy variable (implicitly setting the coefficients for dropped columns to zero) from each group of dummy variables produced by a column of group.
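The rank deficiency described above, and the usual remedy of dropping one dummy variable, can be sketched as follows. The data and the variable names (g, X, Xreduced, G, D2) are assumptions made for illustration. Dropping the first column is one common choice, not the only one.

```matlab
% Hypothetical single grouping variable with three levels.
g = [1; 2; 3; 1; 2; 3];
D = dummyvar(g);

% Adding an intercept column makes the design matrix rank deficient,
% because the three dummy columns already sum to a column of ones.
X = [ones(size(D,1),1) D];
fprintf('rank(X) = %d, columns = %d\n', rank(X), size(X,2));

% Dropping one dummy variable (here the first) restores full column rank.
Xreduced = [ones(size(D,1),1) D(:,2:end)];
fprintf('rank(Xreduced) = %d, columns = %d\n', rank(Xreduced), size(Xreduced,2));

% With two grouping variables, D itself is rank deficient:
% each variable's block of dummy columns sums to a column of ones.
G = [1 1; 1 2; 2 1; 2 2];
D2 = dummyvar(G);
fprintf('rank(D2) = %d, columns = %d\n', rank(D2), size(D2,2));
```

In this sketch X has four columns but rank 3, Xreduced has three columns and rank 3, and D2 has four columns but rank 3.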
If group is a numeric vector with levels that do not correspond exactly to the integers 1:max(group), first convert the data to a categorical vector by using categorical. You can then pass the result to dummyvar. For an example, see Create Dummy Variables from Multiple Grouping Variables.
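A minimal sketch of the conversion described above, using made-up numeric levels (10, 20, 30) that do not match 1:max(group):

```matlab
% Hypothetical numeric grouping vector whose levels are not 1:max(group).
group = [10; 20; 10; 30];

% Convert to categorical so that each distinct value becomes one level.
c = categorical(group);

% One dummy column per category, ordered as returned by categories(c).
D = dummyvar(c);
disp(categories(c))
disp(D)
```

After the conversion, D has one column for each of the three distinct values rather than one column for every integer up to max(group).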
See also: anova1 | categories | grp2idx | regress