Linear model for binary classification of high-dimensional data
ClassificationLinear is a trained linear model object for binary classification; the linear model is a support vector machine (SVM) or logistic regression model. fitclinear fits a ClassificationLinear model by minimizing the objective function using techniques that reduce computation time for high-dimensional data sets (e.g., stochastic gradient descent). The classification loss plus the regularization term compose the objective function.
Unlike other classification models, and for economical memory usage, ClassificationLinear model objects do not store the training data. However, they do store, for example, the estimated linear model coefficients, prior-class probabilities, and the regularization strength.
You can use trained ClassificationLinear models to predict labels or classification scores for new data. For details, see predict.
Create a ClassificationLinear object by using fitclinear.
Lambda — Regularization term strength
Regularization term strength, specified as a nonnegative scalar or vector of nonnegative values.
Data Types: double | single
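As a rough illustration of a vector-valued Lambda, the sketch below fits one model per regularization strength on synthetic data and then keeps a single model with selectModels. The data, the variable names, and the chosen Lambda values are assumptions made for this example only.

```matlab
% Synthetic data, assumed for illustration: 200 observations, 5 predictors.
rng(1);                               % For reproducibility
X = randn(200,5);
Y = X(:,1) - X(:,2) > 0;              % Logical labels; true is the positive class

% Fit one linear model per regularization strength.
lambda = [1e-4 1e-2 1];
Mdl = fitclinear(X,Y,'Lambda',lambda);

size(Mdl.Beta)                        % One column of coefficients per Lambda value
numel(Mdl.Bias)                       % One bias term per Lambda value

% Keep only the model trained with the second regularization strength.
MdlSub = selectModels(Mdl,2);
MdlSub.Lambda
```

With several strengths in one object, Beta holds one coefficient column per Lambda value, so downstream code should index the model it needs (or call selectModels) before predicting.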
Learner — Linear classification model type
'logistic' | 'svm'
Linear classification model type, specified as 'logistic' or 'svm'.
In this table, f(x) = xβ + b.
β is a vector of p coefficients.
x is an observation from p predictor variables.
b is the scalar bias.
y is the response, coded as 1 for the positive class and –1 otherwise.
Value | Algorithm | Loss Function | FittedLoss Value |
---|---|---|---|
'logistic' | Logistic regression | Deviance (logistic): $\ell[y, f(x)] = \log\{1 + \exp[-y f(x)]\}$ | 'logit' |
'svm' | Support vector machine | Hinge: $\ell[y, f(x)] = \max[0, 1 - y f(x)]$ | 'hinge' |
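The two loss functions can be evaluated by hand for a few scores. This is a minimal sketch that assumes the standard forms max[0, 1 – y f(x)] for the hinge loss and log{1 + exp[–y f(x)]} for the logistic deviance, with labels coded as –1 and 1; the score and label values are made up for illustration.

```matlab
% Toy values (assumed): raw scores f(x) = x*beta + b and labels y in {-1, 1}.
f = [2.0; 0.5; -0.3; -1.5];
y = [1; 1; 1; -1];

% Per-observation losses for the two learners.
hingeLoss = max(0, 1 - y.*f);         % Learner 'svm', FittedLoss 'hinge'
logitLoss = log(1 + exp(-y.*f));      % Learner 'logistic', FittedLoss 'logit'

disp(table(f, y, hingeLoss, logitLoss))
```

The hinge loss is exactly zero once y f(x) reaches 1, while the logistic deviance stays positive for every observation and only decays toward zero.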
Beta — Linear coefficient estimates
Linear coefficient estimates, specified as a numeric vector with length equal to the number of predictors.
Data Types: double
Bias — Estimated bias term
Estimated bias term or model intercept, specified as a numeric scalar.
Data Types: double
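FittedLoss — Loss function used to fit linear model
'hinge' | 'logit'
This property is read-only.
Loss function used to fit the linear model, specified as 'hinge' or 'logit'.
Value | Algorithm | Loss Function | Learner Value |
---|---|---|---|
'hinge' | Support vector machine | Hinge: $\ell[y, f(x)] = \max[0, 1 - y f(x)]$ | 'svm' |
'logit' | Logistic regression | Deviance (logistic): $\ell[y, f(x)] = \log\{1 + \exp[-y f(x)]\}$ | 'logistic' |
Here, f(x) = xβ + b is the raw classification score and y is the response, coded as 1 for the positive class and –1 otherwise.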
Regularization — Complexity penalty type
'lasso (L1)' | 'ridge (L2)'
Complexity penalty type, specified as 'lasso (L1)' or 'ridge (L2)'.
The software composes the objective function for minimization from the sum of the average loss function (see FittedLoss) and a regularization value from this table.
Value | Description |
---|---|
'lasso (L1)' | Lasso (L1) penalty: $\lambda \sum_{j=1}^{p} \lvert \beta_j \rvert$ |
'ridge (L2)' | Ridge (L2) penalty: $\frac{\lambda}{2} \sum_{j=1}^{p} \beta_j^2$ |
λ specifies the regularization term strength (see Lambda).
The software excludes the bias term (β0, written as b elsewhere on this page) from the regularization penalty.
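To make the composition of the objective concrete, the following sketch evaluates an average hinge loss plus each penalty for fixed, made-up coefficients. It assumes the usual penalty forms λ Σ|β_j| for lasso and (λ/2) Σ β_j² for ridge and ignores observation weights, so it is an unweighted illustration rather than the exact computation inside fitclinear.

```matlab
% Toy problem (assumed values): 6 observations, 3 predictors.
rng(0);                               % For reproducibility
X = randn(6,3);
y = [1; -1; 1; 1; -1; -1];            % Labels coded as -1 and 1
beta = [0.5; -1; 0];                  % Hypothetical coefficients
b = 0.1;                              % Hypothetical bias (not penalized)
lambda = 0.01;

f = X*beta + b;                       % Raw classification scores
avgHinge = mean(max(0, 1 - y.*f));    % Average hinge loss

lassoPenalty = lambda*sum(abs(beta));       % lambda * sum |beta_j|
ridgePenalty = (lambda/2)*sum(beta.^2);     % (lambda/2) * sum beta_j^2

objLasso = avgHinge + lassoPenalty
objRidge = avgHinge + ridgePenalty
```

Note that b does not appear in either penalty, which matches the statement that the bias term is excluded from regularization.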
CategoricalPredictors — Categorical predictor indices
[]
Categorical predictor indices, specified as a vector of positive integers. Assuming that the predictor data contains observations in rows, CategoricalPredictors contains index values corresponding to the columns of the predictor data that contain categorical predictors. If none of the predictors are categorical, then this property is empty ([]).
Data Types: single | double
ClassNames — Unique class labels
Unique class labels used in training, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors. ClassNames has the same data type as the class labels Y. (The software treats string arrays as cell arrays of character vectors.) ClassNames also determines the class order.
Data Types: categorical | char | logical | single | double | cell
Cost — Misclassification costs
This property is read-only.
Misclassification costs, specified as a square numeric matrix. Cost has K rows and columns, where K is the number of classes.
Cost(i,j) is the cost of classifying a point into class j if its true class is i. The order of the rows and columns of Cost corresponds to the order of the classes in ClassNames.
Data Types: double
ModelParameters — Parameters used for training model
Parameters used for training the ClassificationLinear model, specified as a structure.
Access fields of ModelParameters using dot notation. For example, access the relative tolerance on the linear coefficients and the bias term by using Mdl.ModelParameters.BetaTolerance.
Data Types: struct
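A short sketch of the dot-notation access described here, using a model trained on synthetic data. The data and variable names are assumptions for illustration; only the BetaTolerance field named above is read.

```matlab
% Synthetic data, assumed for illustration.
rng(1);                               % For reproducibility
X = randn(200,5);
Y = X(:,1) > 0;

Mdl = fitclinear(X,Y);

% Read one training parameter from the stored structure.
tol = Mdl.ModelParameters.BetaTolerance
```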
PredictorNames — Predictor names
Predictor names in order of their appearance in the predictor data, specified as a cell array of character vectors. The length of PredictorNames is equal to the number of variables in the training data X or Tbl used as predictor variables.
Data Types: cell
ExpandedPredictorNames — Expanded predictor names
Expanded predictor names, specified as a cell array of character vectors.
If the model uses encoding for categorical variables, then ExpandedPredictorNames includes the names that describe the expanded variables. Otherwise, ExpandedPredictorNames is the same as PredictorNames.
Data Types: cell
Prior — Prior class probabilities
This property is read-only.
Prior class probabilities, specified as a numeric vector. Prior has as many elements as classes in ClassNames, and the order of the elements corresponds to the elements of ClassNames.
Data Types: double
ResponseName — Response variable name
Response variable name, specified as a character vector.
Data Types: char
ScoreTransform — Score transformation function
'doublelogit' | 'invlogit' | 'ismax' | 'logit' | 'none' | function handle | ...
Score transformation function to apply to predicted scores, specified as a function name or function handle.
For linear classification models and before transformation, the predicted classification score for the observation x (row vector) is f(x) = xβ + b, where β and b correspond to Mdl.Beta and Mdl.Bias, respectively.
To change the score transformation function to, for example, function, use dot notation.
For a built-in function, enter this code and replace function with a value in the table.
Mdl.ScoreTransform = 'function';
Value | Description |
---|---|
'doublelogit' | $1/(1 + e^{-2x})$ |
'invlogit' | $\log(x / (1 - x))$ |
'ismax' | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to 0 |
'logit' | $1/(1 + e^{-x})$ |
'none' or 'identity' | $x$ (no transformation) |
'sign' | –1 for x < 0; 0 for x = 0; 1 for x > 0 |
'symmetric' | $2x - 1$ |
'symmetricismax' | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to –1 |
'symmetriclogit' | $2/(1 + e^{-x}) - 1$ |
For a MATLAB® function, or a function that you define, enter its function handle.
Mdl.ScoreTransform = @function;
function must accept a matrix of the original scores for each class, and then return a matrix of the same size representing the transformed scores for each class.
Data Types: char | function_handle
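The relationship between Beta, Bias, and the predicted scores can be checked numerically. The sketch below trains an SVM learner on synthetic data, rebuilds the raw score f(x) = xβ + b by hand, and then switches ScoreTransform to 'logit'. The data and variable names are assumptions, and the comparison assumes that the second column of the score output corresponds to the positive class (the second entry of ClassNames).

```matlab
% Synthetic data, assumed for illustration.
rng(1);                               % For reproducibility
X = randn(300,4);
Y = X(:,1) + 0.5*X(:,3) > 0;          % Logical labels; true is the positive class

Mdl = fitclinear(X,Y,'Learner','svm');

% Raw score computed by hand from the stored coefficients.
f = X*Mdl.Beta + Mdl.Bias;

% Untransformed scores returned by predict.
[~,rawScore] = predict(Mdl,X);
rawGap = max(abs(rawScore(:,2) - f))

% Apply the built-in 'logit' transform and compare with 1/(1 + exp(-f)).
Mdl.ScoreTransform = 'logit';
[~,logitScore] = predict(Mdl,X);
logitGap = max(abs(logitScore(:,2) - 1./(1 + exp(-f))))
```

Both gaps should be at the level of floating-point round-off if the assumptions above hold.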
Function | Description |
---|---|
edge | Classification edge for linear classification models |
incrementalLearner | Convert linear model for binary classification to incremental learner |
loss | Classification loss for linear classification models |
margin | Classification margins for linear classification models |
partialDependence | Compute partial dependence |
plotPartialDependence | Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots |
predict | Predict labels for linear classification models |
selectModels | Choose subset of regularized, binary linear classification models |
update | Update model parameters for code generation |
Copy semantics: value. To learn how value classes affect copy operations, see Copying Objects.
Train a binary, linear classification model using support vector machines, dual SGD, and ridge regularization.
Load the NLP data set.
load nlpdata
X is a sparse matrix of predictor data, and Y is a categorical vector of class labels. There are more than two classes in the data.
Identify the labels that correspond to the Statistics and Machine Learning Toolbox™ documentation web pages.
Ystats = Y == 'stats';
Train a binary, linear classification model that can identify whether the word counts in a documentation web page are from the Statistics and Machine Learning Toolbox™ documentation. Train the model using the entire data set. Determine how well the optimization algorithm fit the model to the data by extracting a fit summary.
rng(1); % For reproducibility
[Mdl,FitInfo] = fitclinear(X,Ystats)
Mdl = 
  ClassificationLinear
      ResponseName: 'Y'
        ClassNames: [0 1]
    ScoreTransform: 'none'
              Beta: [34023x1 double]
              Bias: -1.0059
            Lambda: 3.1674e-05
           Learner: 'svm'
  Properties, Methods
FitInfo = struct with fields:
Lambda: 3.1674e-05
Objective: 5.3783e-04
PassLimit: 10
NumPasses: 10
BatchLimit: []
NumIterations: 238561
GradientNorm: NaN
GradientTolerance: 0
RelativeChangeInBeta: 0.0562
BetaTolerance: 1.0000e-04
DeltaGradient: 1.4582
DeltaGradientTolerance: 1
TerminationCode: 0
TerminationStatus: {'Iteration limit exceeded.'}
Alpha: [31572x1 double]
History: []
FitTime: 0.1012
Solver: {'dual'}
Mdl is a ClassificationLinear model. You can pass Mdl and the training data to loss to inspect the in-sample classification error, or pass Mdl and new data to loss to estimate the out-of-sample error. You can also pass Mdl and new predictor data to predict to predict class labels for new observations.
FitInfo is a structure array containing, among other things, the termination status (TerminationStatus) and how long the solver took to fit the model to the data (FitTime). It is good practice to use FitInfo to determine whether optimization-termination measurements are satisfactory. Here, the solver stopped because it reached the pass limit (NumPasses equals PassLimit), and DeltaGradient is still above DeltaGradientTolerance. Because training time is small, you can try to retrain the model, but increase the number of passes through the data. This can improve measures like DeltaGradient.
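Retraining with more passes might look like the sketch below. It uses synthetic data rather than the NLP data set so that it is self-contained; the data, the variable names, and the pass limits of 10 and 50 are assumptions chosen for illustration, and the dual SGD solver is requested explicitly to mirror the example.

```matlab
% Synthetic data, assumed for illustration.
rng(1);                               % For reproducibility
X = randn(500,20);
Y = X(:,1) + 0.5*randn(500,1) > 0;

% Baseline fit with 10 passes through the data.
[MdlBase,FitBase] = fitclinear(X,Y,'Learner','svm','Solver','dual', ...
    'PassLimit',10);

% Retrain, allowing up to 50 passes.
[MdlMore,FitMore] = fitclinear(X,Y,'Learner','svm','Solver','dual', ...
    'PassLimit',50);

% Compare the termination diagnostics of the two fits.
fprintf('Passes: %d vs %d\n',FitBase.NumPasses,FitMore.NumPasses);
fprintf('DeltaGradient: %g vs %g\n',FitBase.DeltaGradient,FitMore.DeltaGradient);
disp(FitMore.TerminationStatus)
```

The solver can stop before the pass limit if another tolerance is met first, so the number of passes actually used is worth checking alongside DeltaGradient.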
Load the NLP data set.
load nlpdata
n = size(X,1); % Number of observations
Identify the labels that correspond to the Statistics and Machine Learning Toolbox™ documentation web pages.
Ystats = Y == 'stats';
Hold out 5% of the data.
rng(1); % For reproducibility
cvp = cvpartition(n,'Holdout',0.05)
cvp = 
  Hold-out cross validation partition
    NumObservations: 31572
        NumTestSets: 1
          TrainSize: 29994
           TestSize: 1578
cvp is a CVPartition object that defines the random partition of the n observations into training and test sets.
Train a binary, linear classification model using the training set that can identify whether the word counts in a documentation web page are from the Statistics and Machine Learning Toolbox™ documentation. For faster training time, orient the predictor data matrix so that the observations are in columns.
idxTrain = training(cvp); % Extract training set indices
X = X';
Mdl = fitclinear(X(:,idxTrain),Ystats(idxTrain),'ObservationsIn','columns');
Predict labels and estimate the classification error for the holdout sample.
idxTest = test(cvp); % Extract test set indices
labels = predict(Mdl,X(:,idxTest),'ObservationsIn','columns');
L = loss(Mdl,X(:,idxTest),Ystats(idxTest),'ObservationsIn','columns')
L = 7.1753e-04
Mdl misclassifies fewer than 1% of the out-of-sample observations.
Usage notes and limitations:
When you train a linear classification model by using fitclinear, the following restrictions apply.
The class labels input argument value (Y) cannot be a categorical array.
Code generation does not support categorical predictors (logical, categorical, char, string, or cell). If you supply training data in a table, the predictors must be numeric (double or single). Also, you cannot use the 'CategoricalPredictors' name-value pair argument. To include categorical predictors in a model, preprocess the categorical predictors by using dummyvar before fitting the model.
If the predictor data input argument value is a matrix, it must be a full, numeric matrix. Code generation does not support sparse data.
The value of the 'ClassNames' name-value pair argument or property cannot be a categorical array.
You can specify only one regularization strength, either 'auto' or a nonnegative scalar for the 'Lambda' name-value pair argument.
The value of the 'ScoreTransform' name-value pair argument cannot be an anonymous function.
For more information, see Introduction to Code Generation.
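The dummyvar preprocessing mentioned in the restrictions can be sketched as follows. The predictor values, labels, and variable names are made up for illustration; the point is only that the categorical column is replaced by numeric indicator columns before fitting, and that the labels are logical rather than categorical.

```matlab
% Toy predictors (assumed): one numeric column and one categorical column.
width = [1.2; 0.7; 3.1; 2.4];
color = categorical({'red'; 'green'; 'blue'; 'green'});
Y = [true; false; true; false];       % Logical labels (not categorical)

% One indicator column per category, in the order given by categories(color).
D = dummyvar(color);

% Full, numeric predictor matrix with no categorical predictors.
X = [width D];

Mdl = fitclinear(X,Y);
size(X)
```

Each row of D contains a single 1, so the resulting matrix is full and numeric, which is what the code generation restrictions above require.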
ClassificationECOC | ClassificationKernel | ClassificationPartitionedLinear | ClassificationPartitionedLinearECOC | fitclinear | predict