Partial least-squares regression
[XL,YL] = plsregress(X,Y,ncomp)
[XL,YL,XS] = plsregress(X,Y,ncomp)
[XL,YL,XS,YS] = plsregress(X,Y,ncomp)
[XL,YL,XS,YS,BETA] = plsregress(X,Y,ncomp,...)
[XL,YL,XS,YS,BETA,PCTVAR] = plsregress(X,Y,ncomp)
[XL,YL,XS,YS,BETA,PCTVAR,MSE] = plsregress(X,Y,ncomp)
[XL,YL,XS,YS,BETA,PCTVAR,MSE] = plsregress(...,param1,val1,param2,val2,...)
[XL,YL,XS,YS,BETA,PCTVAR,MSE,stats] = plsregress(X,Y,ncomp,...)
[XL,YL] = plsregress(X,Y,ncomp) computes a partial least-squares (PLS) regression of Y on X, using ncomp PLS components, and returns the predictor and response loadings in XL and YL, respectively. X is an n-by-p matrix of predictor variables, with rows corresponding to observations and columns to variables. Y is an n-by-m response matrix. XL is a p-by-ncomp matrix of predictor loadings, where each row contains coefficients that define a linear combination of PLS components that approximates the original predictor variables. YL is an m-by-ncomp matrix of response loadings, where each row contains coefficients that define a linear combination of PLS components that approximates the original response variables.
[XL,YL,XS] = plsregress(X,Y,ncomp) returns the predictor scores XS, that is, the PLS components that are linear combinations of the variables in X. XS is an n-by-ncomp orthonormal matrix with rows corresponding to observations and columns to components.
[XL,YL,XS,YS] = plsregress(X,Y,ncomp) returns the response scores YS, that is, the linear combinations of the responses with which the PLS components XS have maximum covariance. YS is an n-by-ncomp matrix with rows corresponding to observations and columns to components. YS is neither orthogonal nor normalized.
plsregress uses the SIMPLS algorithm, first centering X and Y by subtracting off the column means to get the centered variables X0 and Y0. However, it does not rescale the columns. To perform PLS with standardized variables, use zscore to normalize X and Y before calling plsregress.

If ncomp is omitted, its default value is min(size(X,1)-1,size(X,2)).
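The SIMPLS procedure can be sketched in NumPy. This is a hedged re-implementation for illustration, not plsregress itself; the function name simpls and all variable names are my own, and the layout of the returned arrays mirrors the XL, YL, XS, W, and BETA outputs described on this page:

```python
import numpy as np

def simpls(X, Y, ncomp):
    """Sketch of SIMPLS (de Jong, 1993): center X and Y (no rescaling),
    then extract ncomp components with orthonormal predictor scores."""
    X = np.asarray(X, float)
    Y = np.asarray(Y, float)
    n, p = X.shape
    m = Y.shape[1]
    xmean, ymean = X.mean(axis=0), Y.mean(axis=0)
    X0, Y0 = X - xmean, Y - ymean            # centered, NOT rescaled
    Cov = X0.T @ Y0                          # p-by-m cross-product matrix
    W = np.zeros((p, ncomp))                 # weights, so that XS = X0 @ W
    XL = np.zeros((p, ncomp))                # predictor loadings
    YL = np.zeros((m, ncomp))                # response loadings
    XS = np.zeros((n, ncomp))                # orthonormal predictor scores
    V = np.zeros((p, ncomp))                 # orthonormal basis for deflation
    for i in range(ncomp):
        u, _, _ = np.linalg.svd(Cov, full_matrices=False)
        r = u[:, 0].copy()                   # dominant left singular vector
        t = X0 @ r                           # raw score (already centered)
        t_norm = np.linalg.norm(t)
        t /= t_norm
        r /= t_norm                          # rescale weight so score has norm 1
        XL[:, i] = X0.T @ t                  # XL = X0'*XS, column by column
        YL[:, i] = Y0.T @ t                  # YL = Y0'*XS, column by column
        XS[:, i] = t
        W[:, i] = r
        v = XL[:, i].copy()                  # deflate Cov so later scores
        v -= V[:, :i] @ (V[:, :i].T @ v)     # stay orthogonal to earlier ones
        v /= np.linalg.norm(v)
        V[:, i] = v
        Cov -= np.outer(v, v.T @ Cov)
    B = W @ YL.T                             # slopes for the centered model
    BETA = np.vstack([ymean - xmean @ B, B]) # prepend the intercept row
    return XL, YL, XS, W, BETA
```

With the full number of components, ncomp = min(n-1, p), the fitted values from BETA coincide with ordinary least squares, mirroring the behavior of plsregress.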
The relationships between the scores, loadings, and centered variables X0 and Y0 are:

XL = (XS\X0)' = X0'*XS
YL = (XS\Y0)' = Y0'*XS

That is, XL and YL are the coefficients from regressing X0 and Y0 on XS, and XS*XL' and XS*YL' are the PLS approximations to X0 and Y0.
plsregress initially computes YS as:

YS = Y0*YL = Y0*Y0'*XS

By convention, however, plsregress then orthogonalizes each column of YS with respect to the preceding columns of XS, so that XS'*YS is lower triangular.
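That orthogonalization step can be illustrated in isolation with a small NumPy sketch. The matrices here are stand-ins, not plsregress output: Q plays the role of the orthonormal scores XS, and YS of the raw response scores. A Gram-Schmidt pass of each YS column against the preceding columns of Q leaves Q'*YS lower triangular:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 12, 3

# Stand-in matrices: Q for the orthonormal score matrix XS,
# YS for the raw response scores Y0*YL (illustration only).
Q, _ = np.linalg.qr(rng.normal(size=(n, k)))
YS = rng.normal(size=(n, k))

for j in range(k):
    # Orthogonalize column j of YS against the *preceding* columns of Q.
    YS[:, j] -= Q[:, :j] @ (Q[:, :j].T @ YS[:, j])

L = Q.T @ YS
# The strictly upper-triangular part is zero, so Q'*YS is lower triangular.
assert np.allclose(np.triu(L, 1), 0)
```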
[XL,YL,XS,YS,BETA] = plsregress(X,Y,ncomp,...) returns the PLS regression coefficients BETA. BETA is a (p+1)-by-m matrix containing the intercept terms in the first row:

Y = [ones(n,1),X]*BETA + Yresiduals
Y0 = X0*BETA(2:end,:) + Yresiduals

Here Yresiduals is the n-by-m matrix of response residuals.
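The intercept bookkeeping in those two equations can be checked with a short NumPy sketch. Plain least squares stands in for the PLS slopes here (the algebra of the intercept row is the same either way); this is not plsregress output:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))
Y = X @ rng.normal(size=(4, 2)) + 0.1 * rng.normal(size=(30, 2))

xmean, ymean = X.mean(axis=0), Y.mean(axis=0)
X0, Y0 = X - xmean, Y - ymean

# Slopes are fit on the centered data (OLS stands in for the PLS
# slopes); the intercept row then reproduces uncentered predictions.
B, *_ = np.linalg.lstsq(X0, Y0, rcond=None)
BETA = np.vstack([ymean - xmean @ B, B])   # (p+1)-by-m, intercepts first

fitted = np.hstack([np.ones((30, 1)), X]) @ BETA
resid = Y - fitted
# Same residuals in centered form: Y0 = X0*BETA(2:end,:) + residuals.
assert np.allclose(Y0 - X0 @ B, resid)
```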
[XL,YL,XS,YS,BETA,PCTVAR] = plsregress(X,Y,ncomp) returns a 2-by-ncomp matrix PCTVAR containing the percentage of variance explained by the model. The first row of PCTVAR contains the percentage of variance explained in X by each PLS component, and the second row contains the percentage of variance explained in Y.
[XL,YL,XS,YS,BETA,PCTVAR,MSE] = plsregress(X,Y,ncomp) returns a 2-by-(ncomp+1) matrix MSE containing estimated mean-squared errors for PLS models with 0:ncomp components. The first row of MSE contains mean-squared errors for the predictor variables in X, and the second row contains mean-squared errors for the response variable(s) in Y.
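The first column of MSE (the 0-component model) corresponds to predicting every observation by the column means. A minimal NumPy check of that baseline, assuming the resubstitution error averages the squared residuals over all entries (the exact normalization plsregress uses may differ by a constant factor):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(25, 3))

# A 0-component model predicts each column by its mean, so the
# resubstitution error is the average squared centered entry,
# i.e. the mean of the per-column (biased) variances.
X0 = X - X.mean(axis=0)
mse_x0 = np.mean(X0**2)
assert np.isclose(mse_x0, X.var(axis=0).mean())
```

Adding components can only shrink this in-sample reconstruction error, which is why the rows of MSE are reported for 0:ncomp components.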
[XL,YL,XS,YS,BETA,PCTVAR,MSE] = plsregress(...,param1,val1,param2,val2,...) specifies optional parameter name/value pairs from the following table to control the calculation of MSE.
Parameter | Value |
---|---|
'cv' | The method used to compute MSE. When 'cv' is a positive integer k, plsregress uses k-fold cross-validation. Set 'cv' to a cvpartition object to use other forms of cross-validation. The default is 'resubstitution', which uses both X and Y to fit the model and to estimate the mean-squared errors, without cross-validation. |
'mcreps' | A positive integer indicating the number of Monte-Carlo repetitions for cross-validation. The default value is 1. 'mcreps' must be 1 if 'cv' is 'resubstitution'. |
'options' | A structure that specifies whether to run in parallel, and specifies the random stream or streams. Create the options structure with statset. To compute in parallel, you need Parallel Computing Toolbox™. |
[XL,YL,XS,YS,BETA,PCTVAR,MSE,stats] = plsregress(X,Y,ncomp,...) returns a structure stats with the following fields:

W — A p-by-ncomp matrix of PLS weights so that XS = X0*W.

T2 — The T2 statistic for each point in XS.

Xresiduals — The predictor residuals, that is, X0-XS*XL'.

Yresiduals — The response residuals, that is, Y0-XS*YL'.
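The residual fields follow directly from the loading identities earlier on this page: whenever the scores are orthonormal and the loadings are XL = X0'*XS, subtracting XS*XL' removes exactly the projection of X0 onto the scores. A NumPy sketch, using SVD-based orthonormal scores as a stand-in for XS (illustrative only, not SIMPLS scores):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(15, 6))
X0 = X - X.mean(axis=0)

# Any orthonormal score matrix exposes the same algebra; the first two
# left singular vectors of X0 stand in for XS here.
U, s, Vt = np.linalg.svd(X0, full_matrices=False)
XS = U[:, :2]
XL = X0.T @ XS                  # loadings, as in XL = X0'*XS

Xres = X0 - XS @ XL.T           # Xresiduals
# The residuals are orthogonal to the scores that were projected out.
assert np.allclose(XS.T @ Xres, 0)
```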
[1] de Jong, S. “SIMPLS: An Alternative Approach to Partial Least Squares Regression.” Chemometrics and Intelligent Laboratory Systems. Vol. 18, 1993, pp. 251–263.
[2] Rosipal, R., and N. Kramer. “Overview and Recent Advances in Partial Least Squares.” Subspace, Latent Structure and Feature Selection: Statistical and Optimization Perspectives Workshop (SLSFS 2005), Revised Selected Papers (Lecture Notes in Computer Science 3940). Berlin, Germany: Springer-Verlag, 2006, pp. 34–51.