addTerms

Add terms to generalized linear regression model

Syntax

NewMdl = addTerms(mdl,terms)

Description

NewMdl = addTerms(mdl,terms) returns a generalized linear regression model fitted using the input data and settings in mdl, with the terms specified in terms added.

Examples

Add Terms to Generalized Linear Regression Model

Create a generalized linear regression model using one predictor, and then add another predictor.

Generate sample data using Poisson random numbers with two underlying predictors X(:,1) and X(:,2).

rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);

Create a generalized linear regression model of Poisson data. Include only the first predictor in the model.

mdl = fitglm(X,y,'y ~ x1','Distribution','poisson')

mdl = 
Generalized linear regression model:
    log(y) ~ 1 + x1
    Distribution = Poisson

Estimated Coefficients:
                   Estimate       SE        tStat     pValue
                   ________    _________    ______    ______

    (Intercept)     2.7784      0.014043    197.85      0   
    x1              1.1732     0.0033653     348.6      0   


100 observations, 98 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 1.25e+05, p-value = 0

Add the second predictor to the model.

mdl1 = addTerms(mdl,'x2')

mdl1 = 
Generalized linear regression model:
    log(y) ~ 1 + x1 + x2
    Distribution = Poisson

Estimated Coefficients:
                   Estimate       SE        tStat     pValue
                   ________    _________    ______    ______

    (Intercept)     1.0405      0.022122    47.034      0   
    x1              0.9968      0.003362    296.49      0   
    x2               1.987     0.0063433    313.24      0   


100 observations, 97 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 2.95e+05, p-value = 0

Input Arguments

`mdl` — Generalized linear regression model
`GeneralizedLinearModel` object

Generalized linear regression model, specified as a GeneralizedLinearModel object created using fitglm or stepwiseglm.

`terms` — Terms to add to regression model
character vector or string scalar formula in Wilkinson notation | t-by-p terms matrix

Terms to add to the regression model mdl, specified as one of the following:

Character vector or string scalar formula in Wilkinson Notation representing one or more terms. The variable names in the formula must be valid MATLAB® identifiers.
Terms matrix T of size t-by-p, where t is the number of terms and p is the number of predictor variables in mdl. The value of T(i,j) is the exponent of variable j in term i.
For example, suppose mdl has three variables A, B, and C in that order. Each row of T represents one term:
- [0 0 0] — Constant term or intercept
- [0 1 0] — B; equivalently, A^0 * B^1 * C^0
- [1 0 1] — A*C
- [2 0 0] — A^2
- [0 1 2] — B*(C^2)
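
The exponent convention above can be checked numerically without fitting a model. The following is a minimal sketch in base MATLAB, assuming hypothetical data in which columns 1 to 3 of X hold the variables A, B, and C. It only illustrates how each row of T maps to a product of powers; it is not how addTerms builds its design matrix internally. The element-wise power relies on implicit expansion (R2016b or later).

```matlab
% Hypothetical data: three observations of variables A, B, and C (columns 1-3).
X = [1 2 3;
     2 3 4;
     3 4 5];

% Terms matrix from the list above: one row per term, one column per variable.
T = [0 0 0;   % constant term
     0 1 0;   % B
     1 0 1;   % A*C
     2 0 0;   % A^2
     0 1 2];  % B*(C^2)

% Column i of D holds term i evaluated on every observation:
% the product over j of X(:,j).^T(i,j).
D = zeros(size(X,1), size(T,1));
for i = 1:size(T,1)
    D(:,i) = prod(X.^T(i,:), 2);
end
disp(D)
```

For the first observation (A = 1, B = 2, C = 3), the five terms evaluate to 1, 2, 3, 1, and 18.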

addTerms treats a group of indicator variables for a categorical predictor as a single variable. Therefore, you cannot specify an indicator variable to add to the model. If you specify a categorical predictor to add to the model, addTerms adds a group of indicator variables for the predictor in one step.

Output Arguments

`NewMdl` — Generalized linear regression model with additional terms
`GeneralizedLinearModel` object

Generalized linear regression model with additional terms, returned as a GeneralizedLinearModel object. NewMdl is a newly fitted model that uses the input data and settings in mdl with additional terms specified in terms.

To overwrite the input argument mdl, assign the newly fitted model to mdl:

mdl = addTerms(mdl,terms);
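
Because addTerms returns a newly fitted model, the input model keeps its original terms unless you reassign it. The sketch below reuses the sample data from the example on this page and compares the coefficient names of the two models; the variable name mdl1 is only illustrative.

```matlab
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);

% Fit with the first predictor only, then add the second predictor.
mdl = fitglm(X,y,'y ~ x1','Distribution','poisson');
mdl1 = addTerms(mdl,'x2');

% mdl still contains only the intercept and x1; mdl1 also contains x2.
disp(mdl.CoefficientNames)
disp(mdl1.CoefficientNames)
```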

More About

Wilkinson Notation

Wilkinson notation describes the terms present in a model. The notation relates to the terms present in a model, not to the multipliers (coefficients) of those terms.

Wilkinson notation uses these symbols:

+ means include the next variable.
– means do not include the next variable.
: defines an interaction, which is a product of terms.
* defines an interaction and all lower-order terms.
^ raises the predictor to a power, exactly as in * repeated, so ^ includes lower-order terms as well.
() groups terms.

This table shows typical examples of Wilkinson notation.

Wilkinson Notation	Term in Standard Notation
`1`	Constant (intercept) term
`A^k`, where `k` is a positive integer	`A`, `A²`, ..., `A^k`
`A + B`	`A`, `B`
`A*B`	`A`, `B`, `A*B`
`A:B`	`A*B` only
`–B`	Do not include `B`
`A*B + C`	`A`, `B`, `C`, `A*B`
`A + B + C + A:B`	`A`, `B`, `C`, `A*B`
`A*B*C – A:B:C`	`A`, `B`, `C`, `A*B`, `A*C`, `B*C`
`A*(B + C)`	`A`, `B`, `C`, `A*B`, `A*C`

Statistics and Machine Learning Toolbox™ notation always includes a constant term unless you explicitly remove the term using –1.

For more details, see Wilkinson Notation.
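
As a quick check of the notation, the sketch below fits the same model with two equivalent formulas and then removes the constant term with –1. The data are hypothetical (simulated Poisson responses with two predictors), and the model names mdlStar, mdlLong, and mdlNoInt are illustrative only.

```matlab
rng('default') % For reproducibility
X = randn(100,2);                        % hypothetical predictors x1 and x2
y = poissrnd(exp(0.5 + X*[0.3;0.2]));    % hypothetical Poisson response

% x1*x2 expands to x1 + x2 + x1:x2, so both formulas specify the same terms.
mdlStar = fitglm(X,y,'y ~ x1*x2','Distribution','poisson');
mdlLong = fitglm(X,y,'y ~ x1 + x2 + x1:x2','Distribution','poisson');
sameTerms = isequal(mdlStar.CoefficientNames,mdlLong.CoefficientNames)

% The constant term is included by default; -1 removes it.
mdlNoInt = fitglm(X,y,'y ~ x1 - 1','Distribution','poisson');
disp(mdlNoInt.CoefficientNames)
```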

Algorithms

addTerms treats a categorical predictor as follows:
- A model with a categorical predictor that has L levels (categories) includes L – 1 indicator variables. The model uses the first category as a reference level, so it does not include the indicator variable for the reference level. If the data type of the categorical predictor is categorical, then you can check the order of categories by using categories and reorder the categories by using reordercats to customize the reference level.
- addTerms treats the group of L – 1 indicator variables as a single variable. If you want to treat the indicator variables as distinct predictor variables, create indicator variables manually by using dummyvar. Then use the indicator variables, except the one corresponding to the reference level of the categorical variable, when you fit a model. For the categorical predictor X, if you specify all columns of dummyvar(X) and an intercept term as predictors, then the design matrix becomes rank deficient.
- Interaction terms between a continuous predictor and a categorical predictor with L levels consist of the element-wise product of the L – 1 indicator variables with the continuous predictor.
- Interaction terms between two categorical predictors with L and M levels consist of the (L – 1)*(M – 1) indicator variables to include all possible combinations of the two categorical predictor levels.
- You cannot specify higher-order terms for a categorical predictor because the square of an indicator is equal to itself.
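
The reference-level and rank-deficiency points above can be illustrated with a small sketch. It assumes a hypothetical categorical predictor g with three levels and builds the indicator variables manually with dummyvar; it does not call addTerms.

```matlab
% Hypothetical categorical predictor with L = 3 levels.
g = categorical({'a';'b';'c';'a';'b';'c'});
categories(g)                 % the first category listed is the reference level

% dummyvar returns one indicator column per level (L columns).
D = dummyvar(g);

% An intercept plus all L indicators is rank deficient, because the
% indicator columns sum to the intercept column.
Xfull = [ones(numel(g),1) D];
rankFull = rank(Xfull)        % less than the number of columns, size(Xfull,2)

% Dropping the indicator for the reference level restores full column rank.
Xref = [ones(numel(g),1) D(:,2:end)];
rankRef = rank(Xref)

% Reorder the categories to make 'b' the reference level.
g2 = reordercats(g,{'b','a','c'});
newCats = categories(g2);
```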

Alternative Functionality

Use stepwiseglm to specify terms in a starting model and continue improving the model until no single step of adding or removing a term is beneficial.
Use removeTerms to remove specific terms from a model.
Use step to optimally improve a model by adding or removing terms.
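
As an illustration of the relationship between addTerms and removeTerms, the sketch below adds a term and then removes it again, using the sample data from the example on this page. The variable names mdl1 and mdl2 are illustrative.

```matlab
rng('default') % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);

mdl  = fitglm(X,y,'y ~ x1','Distribution','poisson');
mdl1 = addTerms(mdl,'x2');      % model with x1 and x2
mdl2 = removeTerms(mdl1,'x2');  % back to the model with x1 only

% The round trip recovers the original set of terms.
sameTerms = isequal(mdl2.CoefficientNames,mdl.CoefficientNames)
```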

Introduced in R2012a
