In Regression Learner, use the response plot to try to identify predictors that are useful for predicting the response. To visualize the relation between different predictors and the response, select different variables in the X list under X-axis.
Before you train a regression model, the response plot shows the training data. If you have trained a regression model, then the response plot also shows the model predictions.
Observe which variables are associated most clearly with the response. When you plot the carbig data set, the predictor Horsepower shows a clear negative association with the response. Look for features that do not seem to have any association with the response, and use Feature Selection to remove those features from the set of used predictors.
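To inspect the same relationship outside the app, you can make a quick plot at the command line. The following is a minimal sketch using the Horsepower and MPG variables from the carbig data set; it is not code generated by the app.

% Minimal command-line sketch of a response plot for the carbig data.
load carbig                      % loads Horsepower, MPG, Weight, and so on
plot(Horsepower, MPG, '.')       % points only, no connecting line
xlabel('Horsepower')
ylabel('MPG (response)')
title('Horsepower shows a negative association with MPG')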
You can export the response plots you create in the app to figures. See Export Plots in Regression Learner App.
In Regression Learner, you can specify different features (or predictors) to include in the model. See if you can improve models by removing features with low predictive power. If data collection is expensive or difficult, you might prefer a model that performs satisfactorily with fewer predictors.
On the Regression Learner tab, in the Features section, click Feature Selection.
In the Feature Selection window, clear the check boxes for the predictors you want to exclude.
Tip
You can close the Feature Selection window, or move it. The choices you make in the window remain.
Click Train to train a new model using the new predictor options.
Observe the new model in the History list. The Current Model window displays how many predictors are excluded.
Check which predictors are included in a trained model. Click the model in the History list and look at the check boxes in the Feature Selection window.
Try to improve the model by including different features.
For an example using feature selection, see Train Regression Trees Using Regression Learner App.
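For comparison, you can restrict the predictor set at the command line as well. The sketch below trains a regression tree on the carbig data using only two of the available predictors; the particular choice of kept predictors is illustrative, and the code the app generates may be organized differently.

% Sketch: train a regression tree on a reduced predictor set.
load carbig
tbl = table(Acceleration, Displacement, Horsepower, Weight, MPG);
% Exclude Acceleration and Displacement; keep Horsepower and Weight.
mdl = fitrtree(tbl, 'MPG', 'PredictorNames', {'Horsepower', 'Weight'});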
Use principal component analysis (PCA) to reduce the dimensionality of the predictor space. Reducing the dimensionality can create regression models in Regression Learner that help prevent overfitting. PCA linearly transforms predictors to remove redundant dimensions, and generates a new set of variables called principal components.
On the Regression Learner tab, in the Features section, select PCA.
In the Advanced PCA Options window, select the Enable PCA check box.
You can close the PCA window, or move it. The choices you make in the window remain.
Click Train again. The pca function transforms your selected features before training the model.
By default, PCA keeps only the components that explain 95% of the variance. In the PCA window, you can change the percentage of variance to explain in the Explained variance box. A higher value risks overfitting, while a lower value risks removing useful dimensions.
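The computation behind this criterion can be sketched at the command line with the pca function. The example below uses the numeric carbig predictors and keeps the smallest number of components whose cumulative explained variance reaches 95%; it illustrates the idea only and is not the code the app generates.

% Sketch: keep enough principal components to explain at least 95% of the variance.
load carbig
X = [Acceleration Displacement Horsepower Weight];
X = X(~any(isnan(X), 2), :);                 % remove rows with missing values
[coeff, score, ~, ~, explained] = pca(X);    % explained is in percent
numComponents = find(cumsum(explained) >= 95, 1);
Xreduced = score(:, 1:numComponents);        % transformed predictors for training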
Manually limit the number of PCA components. In the Component reduction criterion list, select Specify number of components. Edit the number in the Number of numeric components box. The number of components cannot be larger than the number of numeric predictors. PCA is not applied to categorical predictors.
You can check PCA Options for trained models in the Current Model window. For example:
PCA is keeping enough components to explain 95% variance. After training, 2 components were kept. Explained variance per component (in order): 92.5%, 5.3%, 1.7%, 0.5%
To learn more about how Regression Learner applies PCA to your data, generate code for your trained regression model. For more information on PCA, see the pca function.
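As a rough idea of what such generated code has to do, the learned transformation must also be applied to new observations before prediction. The sketch below is a hedged illustration with the carbig predictors and an arbitrary training/new split; the generated code itself may organize this step differently.

% Sketch: apply a learned PCA transformation to new observations.
load carbig
X = [Acceleration Displacement Horsepower Weight];
X = X(~any(isnan(X), 2), :);                 % remove rows with missing values
Xtrain = X(1:300, :);                        % illustrative split only
Xnew   = X(301:end, :);
[coeff, ~, ~, ~, explained, mu] = pca(Xtrain);
numComponents = find(cumsum(explained) >= 95, 1);
XnewReduced = (Xnew - mu) * coeff(:, 1:numComponents);   % center, then project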