addK

Class: clustering.evaluation.ClusterCriterion
Package: clustering.evaluation

Evaluate additional numbers of clusters

Syntax

eva_out = addK(eva,klist)

Description

eva_out = addK(eva,klist) returns a clustering evaluation object eva_out that contains the evaluation data stored in the input object eva, plus additional evaluation data for the proposed numbers of clusters specified in klist.

Input Arguments

eva: Clustering evaluation data, specified as a clustering evaluation object. Create a clustering evaluation object using evalclusters.

klist: Additional numbers of clusters to evaluate, specified as a vector of positive integer values. If any values in klist match numbers of clusters already evaluated in the input object eva, then addK ignores those values.
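The handling of overlapping values can be checked with a short sketch. It assumes the fisheriris sample data used in the example on this page, and the fixed random seed is an added assumption so that repeated runs are comparable. The values 4 and 5 appear both in the original evaluation and in the list passed to addK, so each inspected value is expected to appear only once in the result.

```matlab
% Sketch: pass a klist that partly overlaps the values already evaluated.
load fisheriris
rng('default')  % assumption: fixed seed for repeatable kmeans starts

eva = evalclusters(meas,'kmeans','calinski','klist',1:5);

% 4 and 5 are already in eva.InspectedK; only 6 and 7 should be new.
eva2 = addK(eva,4:7);

disp(eva.InspectedK)    % values inspected before the call
disp(eva2.InspectedK)   % values inspected after the call
```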

Output Arguments

eva_out: Updated clustering evaluation data, returned as a clustering evaluation object. eva_out contains data on the proposed clustering solutions included in the input clustering evaluation object eva, plus data on the additional proposed numbers of clusters specified in klist.

For all clustering evaluation object classes, addK updates the InspectedK and CriterionValues properties to include the proposed clustering solutions specified in klist and their corresponding criterion values. addK might also update the OptimalK and OptimalY properties to reflect the new optimal number of clusters and optimal clustering solution.

For certain clustering evaluation object classes, addK might also update the following additional property values:

  • For gap evaluation objects — LogW, ExpectedLogW, StdLogW, and SE

  • For silhouette evaluation objects — ClusterSilhouettes
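As an illustration of these property updates, the following sketch compares a silhouette evaluation object before and after a call to addK. The data set, the klist ranges, and the fixed seed are assumptions chosen for the illustration. ClusterSilhouettes is assumed here to hold one entry per inspected number of clusters.

```matlab
% Sketch: inspect the properties that addK updates on a silhouette
% evaluation object.
load fisheriris
rng('default')  % assumption: fixed seed for repeatable kmeans starts

eva = evalclusters(meas,'kmeans','silhouette','klist',2:4);
nBefore = numel(eva.InspectedK);

eva = addK(eva,5:6);
nAfter = numel(eva.InspectedK);

fprintf('Inspected solutions: %d before, %d after\n',nBefore,nAfter);
fprintf('Criterion values stored: %d\n',numel(eva.CriterionValues));
fprintf('ClusterSilhouettes entries: %d\n',numel(eva.ClusterSilhouettes));

% OptimalK and OptimalY reflect the best solution among all inspected values.
fprintf('Optimal number of clusters: %d\n',eva.OptimalK);
disp(size(eva.OptimalY))
```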

Examples

Create a clustering evaluation object using evalclusters, then use addK to evaluate additional numbers of clusters.

Load the sample data.

load fisheriris

The data contains length and width measurements from the sepals and petals of three species of iris flowers.

Cluster the flower measurement data using kmeans, and use the Calinski-Harabasz criterion to evaluate proposed solutions of one through five clusters.

eva = evalclusters(meas,'kmeans','calinski','klist',1:5)
eva = 
  CalinskiHarabaszEvaluation with properties:

    NumObservations: 150
         InspectedK: [1 2 3 4 5]
    CriterionValues: [NaN 513.9245 561.6278 530.4871 456.1279]
           OptimalK: 3

The clustering evaluation object eva contains data on each proposed clustering solution. The returned value of OptimalK indicates that the optimal solution is three clusters.

Evaluate proposed solutions of 6 through 10 clusters using the same criterion. Add these evaluations to the original clustering evaluation object eva.

eva = addK(eva,6:10)
eva = 
  CalinskiHarabaszEvaluation with properties:

    NumObservations: 150
         InspectedK: [1 2 3 4 5 6 7 8 9 10]
    CriterionValues: [1x10 double]
           OptimalK: 3

The updated values for InspectedK and CriterionValues show that eva now evaluates proposed solutions of 1 through 10 clusters. The OptimalK value still equals 3, indicating that three clusters remain the optimal solution.
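Because the object display abbreviates CriterionValues as a 1x10 array, it can help to list the values explicitly. The following sketch repeats the two calls above and pairs each inspected number of clusters with its criterion value. The fixed seed and the variable names results, K, and CalinskiHarabasz are assumptions made for this illustration, and the exact values can vary with the kmeans starting points.

```matlab
% Sketch: list and plot the criterion value for every inspected K.
load fisheriris
rng('default')  % assumption: fixed seed for repeatable kmeans starts

eva = evalclusters(meas,'kmeans','calinski','klist',1:5);
eva = addK(eva,6:10);

% Pair each inspected number of clusters with its criterion value.
results = table(eva.InspectedK(:),eva.CriterionValues(:), ...
    'VariableNames',{'K','CalinskiHarabasz'});
disp(results)

% Plot the criterion values against the number of clusters.
plot(eva)
```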