Generate Code to Classify Numeric Data in Table

This example shows how to generate code for classifying numeric data in a table using a binary decision tree.

In the general code generation workflow, you can train a classification or regression model on data in a table, provided that you use only numeric predictor variables. When you create an entry-point function for prediction, you pass a numeric matrix (instead of a table) to predict.

Starting in R2020a, you can instead create a table inside your entry-point function and pass that table to predict, which is the approach this example takes. For more information on table support in code generation, see Code Generation for Tables (MATLAB Coder) and Table Limitations for Code Generation (MATLAB Coder).
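
For comparison, the matrix-based workflow described above can be sketched in a MATLAB session as follows. This is a minimal illustration, not part of the original example: it keeps all predictors as double so that they concatenate into one numeric matrix, and the names X, labelsFromMatrix, and labelsFromTable are chosen here for illustration only.

```matlab
% Train on a table that contains only numeric predictors (all double here).
load patients
Tbl = table(Age,Diastolic,Smoker,Systolic,Weight);
Mdl = fitctree(Tbl,'Smoker');

% Build a numeric matrix whose columns follow the order in Mdl.PredictorNames.
X = [Age Diastolic Systolic Weight];

% Predict from the matrix and from the table, then compare the labels.
labelsFromMatrix = predict(Mdl,X);
labelsFromTable = predict(Mdl,Tbl);
sameLabels = isequal(labelsFromMatrix,labelsFromTable)
```

The column order of X matters because a matrix carries no variable names; the model can only match columns to predictors by position.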

Train Classification Model

Load the patients data set. Create a table that contains numeric predictors of type single and double and the response variable Smoker of type logical. Each row of the table corresponds to a different patient.

load patients
Age = single(Age);
Weight = single(Weight);
Tbl = table(Age,Diastolic,Smoker,Systolic,Weight);

Train a classification tree using the data in Tbl. Note the predictor names and their order, because the entry-point function later uses them when it builds the table passed to predict.

Mdl = fitctree(Tbl,'Smoker');
Mdl.PredictorNames
ans = 1x4 cell
    {'Age'}    {'Diastolic'}    {'Systolic'}    {'Weight'}

Save Model

Save the tree classifier to a file using saveLearnerForCoder.

saveLearnerForCoder(Mdl,'TreeModel');

saveLearnerForCoder saves the classifier as a structure array in the MATLAB® binary file TreeModel.mat in the current folder.
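
As an optional sanity check before defining the entry-point function, you can reload the saved classifier in MATLAB and compare its predictions with those of the original model. The following sketch repeats the training steps so that it runs on its own; the variable names reloadedMdl and sameLabels are illustrative and do not appear in the original example.

```matlab
% Recreate the table and the tree classifier from the preceding steps.
load patients
Age = single(Age);
Weight = single(Weight);
Tbl = table(Age,Diastolic,Smoker,Systolic,Weight);
Mdl = fitctree(Tbl,'Smoker');

% Save the classifier, then reconstruct it from TreeModel.mat.
saveLearnerForCoder(Mdl,'TreeModel');
reloadedMdl = loadLearnerForCoder('TreeModel');

% The reloaded model is expected to label the training data like Mdl does.
sameLabels = isequal(predict(Mdl,Tbl),predict(reloadedMdl,Tbl))
```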

Define Entry-Point Function

Define the entry-point function predictSmoker, which takes numeric predictor variables as input arguments. Within the function, load the tree classifier by using loadLearnerForCoder, create a table from the input arguments, and then pass the classifier and table to predict.

type predictSmoker.mlx
function [labels,scores] = predictSmoker(age,diastolic,systolic,weight) %#codegen
%PREDICTSMOKER Label new observations using a trained tree model
%   predictSmoker predicts whether patients are smokers (1) or nonsmokers
%   (0) based on their age, diastolic blood pressure, systolic blood
%   pressure, and weight. The function also provides classification scores
%   indicating the likelihood that a predicted label comes from a
%   particular class (smoker or nonsmoker).
mdl = loadLearnerForCoder('TreeModel');
varnames = mdl.PredictorNames;
tbl = table(age,diastolic,systolic,weight,'VariableNames',varnames);
[labels,scores] = predict(mdl,tbl);
end

When you create a table inside an entry-point function, you must specify the variable names (for example, by using the 'VariableNames' name-value pair argument of table). If your table contains only predictor variables, and the predictors are in the same order as in the table used to train the model, then you can find the predictor variable names in mdl.PredictorNames.
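
The naming requirement above can be tried out in a regular MATLAB session before generating code. The sketch below rebuilds a predictor-only table from separate arrays, as the entry-point function does, and checks it against the model. The names predTbl, namesMatch, and labelsMatch are illustrative assumptions, not part of the original example.

```matlab
% Recreate the training table and the tree classifier.
load patients
Age = single(Age);
Weight = single(Weight);
Tbl = table(Age,Diastolic,Smoker,Systolic,Weight);
Mdl = fitctree(Tbl,'Smoker');

% Build a predictor-only table with explicitly specified variable names.
% The arrays are listed in the same order as the predictors in Tbl.
varnames = Mdl.PredictorNames;
predTbl = table(Age,Diastolic,Systolic,Weight,'VariableNames',varnames);

% Check that the names agree with the model and that predictions are unchanged.
namesMatch = isequal(predTbl.Properties.VariableNames,varnames)
labelsMatch = isequal(predict(Mdl,predTbl),predict(Mdl,Tbl))
```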

Note: If you open this example in MATLAB, the example folder includes the entry-point function file predictSmoker.mlx.

Generate Code

Generate code for predictSmoker by using codegen. Specify the data type and dimensions of the predictor variable input arguments using coder.typeof.

  • The first input argument of coder.typeof specifies the data type of the predictor.

  • The second input argument specifies the upper bound on the number of rows (Inf) and columns (1) in the predictor.

  • The third input argument specifies that the number of rows in the predictor can change at run time but the number of columns is fixed.

ARGS = cell(4,1);
ARGS{1} = coder.typeof(Age,[Inf 1],[1 0]);
ARGS{2} = coder.typeof(Diastolic,[Inf 1],[1 0]);
ARGS{3} = coder.typeof(Systolic,[Inf 1],[1 0]);
ARGS{4} = coder.typeof(Weight,[Inf 1],[1 0]);
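
To see how the three arguments described above map onto a type description, you can inspect the object that coder.typeof returns. This is a small illustrative sketch that requires MATLAB Coder; it uses single(0) as a stand-in example value rather than the patients data, and the variable names are chosen here for illustration.

```matlab
% Describe a single-precision column vector whose number of rows is
% unbounded and variable, and whose number of columns is fixed at 1.
t = coder.typeof(single(0),[Inf 1],[1 0]);

className = t.ClassName        % data type, taken from the first argument
sizeVector = t.SizeVector      % upper bounds, from the second argument
variableDims = t.VariableDims  % which dimensions can vary, from the third argument
```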

codegen predictSmoker -args ARGS

codegen generates the MEX function predictSmoker_mex with a platform-dependent extension in your current folder.

Verify Generated Code

Verify that predict, predictSmoker, and the MEX function predictSmoker_mex return the same results for a random sample of 20 patients.

rng('default') % For reproducibility
[newTbl,idx] = datasample(Tbl,20);

[labels1,scores1] = predict(Mdl,newTbl);
[labels2,scores2] = predictSmoker(Age(idx),Diastolic(idx),Systolic(idx),Weight(idx));
[labels3,scores3] = predictSmoker_mex(Age(idx),Diastolic(idx),Systolic(idx),Weight(idx));

verifyMEXlabels = isequal(labels1,labels2,labels3)
verifyMEXlabels = logical
   1

verifyMEXscores = isequal(scores1,scores2,scores3)
verifyMEXscores = logical
   1
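
Note that isequal requires the compared arrays to match exactly. In this example the scores agree exactly, but if you adapt the workflow to another model and see small floating-point differences between MATLAB and the generated code, a tolerance-based comparison is one alternative. The sketch below uses made-up score matrices (scoresA and scoresB) and an assumed tolerance; none of these values come from the example.

```matlab
% Hypothetical score matrices standing in for MATLAB and MEX results.
scoresA = [0.9 0.1; 0.25 0.75];
scoresB = scoresA + 1e-12;   % simulate tiny floating-point differences

% An exact comparison fails on the simulated differences.
exactMatch = isequal(scoresA,scoresB)

% A tolerance-based comparison accepts them. The tolerance is an assumption;
% choose a value that suits your application.
tol = 1e-9;
closeMatch = max(abs(scoresA(:) - scoresB(:))) < tol
```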
