This example shows how to generate code for classifying numeric data in a table using a binary decision tree.
In the general code generation workflow, you can train a classification or regression model on data in a table, provided that you use only numeric predictor variables. When you create an entry-point function for prediction, you pass a numeric matrix (instead of a table) to predict.
Starting in R2020a, you can pass a table to predict inside your entry-point function. For more information on table support in code generation, see Code Generation for Tables (MATLAB Coder) and Table Limitations for Code Generation (MATLAB Coder).
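To make the contrast between the two input styles concrete, the following sketch trains a tree on a table of numeric predictors and then calls predict in MATLAB with both a table and a numeric matrix. The names T, M, X, labelsFromTable, and labelsFromMatrix are hypothetical and chosen to avoid clashing with the variables used in the rest of this example. The sketch assumes that all predictors are numeric and that the matrix columns follow the order in M.PredictorNames.

```matlab
% Sketch only: T, M, and X are illustrative names, not part of the example.
load patients
T = table(Age,Diastolic,Systolic,Weight,Smoker);
M = fitctree(T,'Smoker');

% Table input: variables are matched to the predictors by name.
labelsFromTable = predict(M,T);

% Matrix input: columns must follow the order in M.PredictorNames.
X = [Age Diastolic Systolic Weight];
labelsFromMatrix = predict(M,X);

% Compare the two sets of predicted labels.
sameLabels = isequal(labelsFromTable,labelsFromMatrix)
```

With matrix input, the column order carries the meaning that variable names carry in a table, so a reordered matrix can silently produce wrong predictions.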
Load the patients data set. Create a table that contains numeric predictors of type single and double and the response variable Smoker of type logical. Each row of the table corresponds to a different patient.
load patients
Age = single(Age);
Weight = single(Weight);
Tbl = table(Age,Diastolic,Smoker,Systolic,Weight);
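As a quick check that the table has the intended mix of data types, you can list the class of each variable. This is a minimal sketch. It repeats the table construction so that it runs on its own, and the name varTypes is introduced here for illustration.

```matlab
% Rebuild the table from the example so the sketch is self-contained.
load patients
Age = single(Age);
Weight = single(Weight);
Tbl = table(Age,Diastolic,Smoker,Systolic,Weight);

% Collect the class of every table variable in a cell array.
varTypes = varfun(@class,Tbl,'OutputFormat','cell');

% Show each variable name next to its class.
disp([Tbl.Properties.VariableNames; varTypes])
```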
Train a classification tree using the data in Tbl. Notice the predictor names and their order.
Mdl = fitctree(Tbl,'Smoker');
Mdl.PredictorNames
ans = 1x4 cell
{'Age'} {'Diastolic'} {'Systolic'} {'Weight'}
Save the tree classifier to a file using saveLearnerForCoder.
saveLearnerForCoder(Mdl,'TreeModel');
saveLearnerForCoder saves the classifier to the MATLAB® binary file TreeModel.mat as a structure array in the current folder.
Define the entry-point function predictSmoker, which takes numeric predictor variables as input arguments. Within the function, load the tree classifier by using loadLearnerForCoder, create a table from the input arguments, and then pass the classifier and table to predict.
type predictSmoker.mlx
function [labels,scores] = predictSmoker(age,diastolic,systolic,weight) %#codegen
%PREDICTSMOKER Label new observations using a trained tree model
%   predictSmoker predicts whether patients are smokers (1) or nonsmokers
%   (0) based on their age, diastolic blood pressure, systolic blood
%   pressure, and weight. The function also provides classification scores
%   indicating the likelihood that a predicted label comes from a
%   particular class (smoker or nonsmoker).
mdl = loadLearnerForCoder('TreeModel');
varnames = mdl.PredictorNames;
tbl = table(age,diastolic,systolic,weight,'VariableNames',varnames);
[labels,scores] = predict(mdl,tbl);
end
When you create a table inside an entry-point function, you must specify the variable names (for example, by using the 'VariableNames' name-value pair argument of table). If your table contains only predictor variables, and the predictors are in the same order as in the table used to train the model, then you can find the predictor variable names in mdl.PredictorNames.
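Because the entry-point function assigns mdl.PredictorNames to its inputs by position, it can help to confirm in MATLAB that the argument order matches the stored predictor order before you generate code. The following sketch assumes that TreeModel.mat from the saveLearnerForCoder step is in the current folder. The names savedMdl and inputOrder are hypothetical.

```matlab
% Sketch only: assumes TreeModel.mat was saved earlier in this example.
savedMdl = loadLearnerForCoder('TreeModel');

% Order in which predictSmoker receives its input arguments.
inputOrder = {'Age','Diastolic','Systolic','Weight'};

% Stop early if the stored predictor order differs from the argument order.
if ~isequal(savedMdl.PredictorNames,inputOrder)
    error('Entry-point argument order does not match the PredictorNames of the saved model.');
end
```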
Note: If you click the button located in the upper-right section of this page and open this example in MATLAB, then MATLAB opens the example folder. This folder includes the entry-point function file predictSmoker.mlx.
Generate code for predictSmoker by using codegen. Specify the data type and dimensions of the predictor variable input arguments using coder.typeof.
The first input argument of coder.typeof specifies the data type of the predictor.
The second input argument specifies the upper bound on the number of rows (Inf) and columns (1) in the predictor.
The third input argument specifies that the number of rows in the predictor can change at run time but the number of columns is fixed.
ARGS = cell(4,1);
ARGS{1} = coder.typeof(Age,[Inf 1],[1 0]);
ARGS{2} = coder.typeof(Diastolic,[Inf 1],[1 0]);
ARGS{3} = coder.typeof(Systolic,[Inf 1],[1 0]);
ARGS{4} = coder.typeof(Weight,[Inf 1],[1 0]);
codegen predictSmoker -args ARGS
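To see how the three coder.typeof arguments map onto a type specification, you can inspect the properties of the returned object. This minimal sketch uses a scalar single(0) as a stand-in example value instead of the Age variable. The name ageType is hypothetical.

```matlab
% Sketch only: single(0) stands in for the Age variable used above.
ageType = coder.typeof(single(0),[Inf 1],[1 0]);

disp(ageType.ClassName)     % data type taken from the example value
disp(ageType.SizeVector)    % upper bounds on the number of rows and columns
disp(ageType.VariableDims)  % which dimensions can change at run time
```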
codegen generates the MEX function predictSmoker_mex with a platform-dependent extension in your current folder.
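One way to confirm that the MEX file was created is to build its expected file name from mexext, which returns the MEX file extension for the current platform. This sketch assumes that the codegen command above completed successfully in the current folder. The names mexFile and mexFound are hypothetical.

```matlab
% Sketch only: assumes codegen has already produced the MEX function.
mexFile = ['predictSmoker_mex.' mexext];  % file name with the platform extension

% True if the file exists in the current folder.
mexFound = isfile(mexFile)

% exist returns 3 when the name resolves to a MEX file on the search path.
mexKind = exist('predictSmoker_mex','file')
```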
Verify that predict, predictSmoker, and the MEX file return the same results for a random sample of 20 patients.
rng('default') % For reproducibility
[newTbl,idx] = datasample(Tbl,20);
[labels1,scores1] = predict(Mdl,newTbl);
[labels2,scores2] = predictSmoker(Age(idx),Diastolic(idx),Systolic(idx),Weight(idx));
[labels3,scores3] = predictSmoker_mex(Age(idx),Diastolic(idx),Systolic(idx),Weight(idx));
verifyMEXlabels = isequal(labels1,labels2,labels3)
verifyMEXlabels = logical
1
verifyMEXscores = isequal(scores1,scores2,scores3)
verifyMEXscores = logical
1
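After verification, the MEX function can be called on new observations, as long as each input has the data type specified at code generation: single for age and weight, and double for the two blood pressure values. The measurements below are made-up values for illustration, and the sketch assumes that predictSmoker_mex and TreeModel.mat are in the current folder.

```matlab
% Sketch only: hypothetical measurements for one new patient.
newAge       = single(40);   % single, matching Age in the training table
newDiastolic = 80;           % double
newSystolic  = 120;          % double
newWeight    = single(170);  % single, matching Weight in the training table

% One row of inputs is valid because only the number of rows is variable.
[newLabel,newScore] = predictSmoker_mex(newAge,newDiastolic,newSystolic,newWeight)
```

Passing a double where the generated code expects a single (or the reverse) causes the MEX function to report an input type mismatch, so cast new data to the training types first.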
loadLearnerForCoder | saveLearnerForCoder | codegen (MATLAB Coder) | coder.typeof (MATLAB Coder)