Lasso is a regularization technique. Use lasso to:
Reduce the number of predictors in a regression model.
Identify important predictors.
Select among redundant predictors.
Produce shrinkage estimates with potentially lower predictive errors than ordinary least squares.
Elastic net is a related technique. Use elastic net when you have several highly correlated variables. lasso provides elastic net regularization when you set the Alpha name-value pair to a number strictly between 0 and 1.
See Lasso and Elastic Net Details.
For lasso regularization of regression ensembles, see regularize.
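As a minimal sketch of both calls (the simulated X and y here are illustrative assumptions, not data from this documentation):

    % Illustrative data: 100 observations, 10 predictors
    rng('default')                        % for reproducibility
    X = randn(100,10);
    y = X(:,[1 3])*[2; -3] + randn(100,1);

    % Lasso fit: each column of B holds coefficients for one Lambda value
    [B,FitInfo] = lasso(X,y);

    % Elastic net fit: set Alpha strictly between 0 and 1
    [B2,FitInfo2] = lasso(X,y,'Alpha',0.5);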
Lasso is a regularization technique for performing linear regression [1]. Lasso includes a penalty term that constrains the size of the estimated coefficients; in this respect it resembles ridge regression. Lasso is a shrinkage estimator: it generates coefficient estimates that are biased toward zero. Nevertheless, a lasso estimator can have a smaller mean squared error than an ordinary least-squares estimator when you apply it to new data.
Unlike ridge regression, lasso sets more coefficients exactly to zero as the penalty term increases, so the lasso estimate is a smaller model with fewer predictors. Lasso is therefore an alternative to stepwise regression and to other model selection and dimensionality reduction techniques.
Elastic net is a related technique that is a hybrid of ridge regression and lasso regularization. Like lasso, elastic net can produce reduced models with zero-valued coefficients. Empirical studies suggest that elastic net can outperform lasso on data with highly correlated predictors [2].
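A sketch of that comparison on simulated correlated data (the data-generating process and the Alpha value of 0.5 are illustrative assumptions), using 10-fold cross-validation to estimate predictive error:

    % Two highly correlated predictors plus three noise predictors
    rng(1)
    z = randn(100,1);
    X = [z + 0.05*randn(100,1), z + 0.05*randn(100,1), randn(100,3)];
    y = z + 0.5*randn(100,1);

    % Cross-validated lasso (Alpha defaults to 1) and elastic net
    [~,cvLasso] = lasso(X,y,'CV',10);
    [~,cvEnet]  = lasso(X,y,'CV',10,'Alpha',0.5);

    % Compare minimum cross-validated mean squared errors
    mseLasso = cvLasso.MSE(cvLasso.IndexMinMSE);
    mseEnet  = cvEnet.MSE(cvEnet.IndexMinMSE);
    fprintf('lasso: %.4g   elastic net: %.4g\n', mseLasso, mseEnet)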
The lasso technique solves this regularization problem. For a given value of λ, a nonnegative parameter, lasso solves the problem

$$
\min_{\beta_0,\,\beta} \left( \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^T \beta \right)^2 + \lambda \sum_{j=1}^{p} \left| \beta_j \right| \right),
$$

where:
N is the number of observations.
y_i is the response at observation i.
x_i is the data, a vector of p values at observation i.
λ is a nonnegative regularization parameter corresponding to one value of Lambda.
The parameters β0 and β are a scalar and a p-vector, respectively.
As λ increases, the number of nonzero components of β decreases.
The lasso problem involves the L1 norm of β, as contrasted with the elastic net algorithm.
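As a small illustration of that sparsity pattern (the simulated data here is an illustrative assumption), FitInfo.DF records the number of nonzero coefficients at each value of Lambda:

    rng(2)
    X = randn(50,8);
    y = X(:,1) - 2*X(:,4) + randn(50,1);

    [B,FitInfo] = lasso(X,y);

    % Each column of B corresponds to one entry of FitInfo.Lambda;
    % FitInfo.DF counts the nonzero coefficients in that column
    disp([FitInfo.Lambda(1:10:end)' FitInfo.DF(1:10:end)'])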
The elastic net technique solves this regularization problem. For an α strictly between 0 and 1, and a nonnegative λ, elastic net solves the problem

$$
\min_{\beta_0,\,\beta} \left( \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^T \beta \right)^2 + \lambda P_{\alpha}(\beta) \right),
$$

where

$$
P_{\alpha}(\beta) = \frac{1-\alpha}{2} \left\| \beta \right\|_2^2 + \alpha \left\| \beta \right\|_1 = \sum_{j=1}^{p} \left( \frac{1-\alpha}{2} \beta_j^2 + \alpha \left| \beta_j \right| \right).
$$
Elastic net is the same as lasso when α = 1. As α shrinks toward 0, elastic net approaches ridge regression. For other values of α, the penalty term Pα(β) interpolates between the L1 norm of β and the squared L2 norm of β.
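A quick numeric check of that interpolation (the coefficient vector and alpha value here are arbitrary illustrations):

    beta  = [1.5; -0.5; 0; 2];      % arbitrary coefficient vector
    alpha = 0.5;                    % midway between ridge and lasso

    % P_alpha(beta) = (1-alpha)/2*||beta||_2^2 + alpha*||beta||_1
    Palpha = (1-alpha)/2*norm(beta,2)^2 + alpha*norm(beta,1);

    % alpha = 1 gives norm(beta,1) (lasso); as alpha -> 0 the penalty
    % approaches norm(beta,2)^2/2 (ridge)
    disp(Palpha)                    % 3.625 for this beta and alpha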
[1] Tibshirani, R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, Vol. 58, No. 1, pp. 267–288, 1996.
[2] Zou, H. and T. Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, Vol. 67, No. 2, pp. 301–320, 2005.
[3] Friedman, J., R. Tibshirani, and T. Hastie. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, Vol. 33, No. 1, 2010. https://www.jstatsoft.org/v33/i01
[4] Hastie, T., R. Tibshirani, and J. Friedman. The Elements of Statistical Learning, 2nd edition. Springer, New York, 2008.
fitrlinear | lasso | lassoglm | lassoPlot | ridge