Package: clustering.evaluation
Superclasses: ClusterCriterion
Gap criterion clustering evaluation object
GapEvaluation is an object consisting of sample data, clustering data, and gap criterion values used to evaluate the optimal number of clusters. Create a gap criterion clustering evaluation object using evalclusters.
eva = evalclusters(x,clust,'Gap') creates a gap criterion clustering evaluation object.
eva = evalclusters(x,clust,'Gap',Name,Value) creates a gap criterion clustering evaluation object using additional options specified by one or more name-value pair arguments.
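The two syntaxes can be illustrated with a minimal sketch. It assumes the Statistics and Machine Learning Toolbox is installed and uses its bundled fisheriris sample data. The 'KList' and 'B' name-value arguments and the OptimalK and CriterionValues property names come from the MathWorks documentation for evalclusters and GapEvaluation, not from this page, so treat them as assumptions to check against your release.

```matlab
% Sketch only: evaluate 1 to 6 clusters on Fisher's iris data.
load fisheriris          % provides meas, a 150-by-4 numeric matrix
rng('default')           % fix the seed so the reference data sets are reproducible

% Basic syntax. 'KList' is needed when clust names a clustering algorithm.
eva = evalclusters(meas,'kmeans','Gap','KList',1:6);

% Name-value syntax: additionally request 50 reference data sets.
eva50 = evalclusters(meas,'kmeans','Gap','KList',1:6,'B',50);

% Inspect the suggested number of clusters and the gap values per candidate k.
disp(eva.OptimalK)
disp(eva.CriterionValues)
```

Generating reference data sets dominates the run time, so a smaller 'B' gives a faster but noisier estimate.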
|
Number of data sets generated from the reference distribution, stored as a positive integer value. |
|
Clustering algorithm used to cluster the input data, stored
as a valid clustering algorithm name or function handle. If the clustering
solutions are provided in the input, |
|
Name of the criterion used for clustering evaluation, stored as a valid criterion name. |
|
Criterion values corresponding to each proposed number of clusters
in |
|
Distance metric used for clustering data, stored as a valid distance metric name. |
|
Expectation of the natural logarithm of W based on the
generated reference data, stored as a vector of scalar values.
W is the within-cluster dispersion computed using the
distance metric |
|
List of the number of proposed clusters for which to compute criterion values, stored as a vector of positive integer values. |
|
Natural logarithm of W based on the input data, stored
as a vector of scalar values. W is the within-cluster
dispersion computed using the distance metric
|
|
Logical flag for excluded data, stored as a column vector of
logical values. If |
|
Number of observations in the data matrix |
|
Optimal number of clusters, stored as a positive integer value. |
|
Optimal clustering solution corresponding to |
|
Reference data generation method, stored as a valid reference distribution name. |
|
Standard error of the natural logarithm of W with
respect to the reference data for each number of clusters in
|
|
Method for determining the optimal number of clusters, stored as a valid search method name. |
|
Standard deviation of the natural logarithm of W with
respect to the reference data for each number of clusters in
|
|
Data used for clustering, stored as a matrix of numerical values. |
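The stored quantities described above relate to one another roughly as in the gap statistic of Tibshirani et al. [1]. The block below is a sketch of those definitions, not a statement of how the object computes its values internally. The symbols $W_k$ (within-cluster dispersion for $k$ clusters), $W^{*}_{kb}$ (the same quantity on the $b$-th reference data set), and $B$ (number of reference data sets) are notation introduced here for illustration.

```latex
% Gap value for k clusters: expected log-dispersion under the reference
% distribution minus the observed log-dispersion.
\mathrm{Gap}(k) = \frac{1}{B}\sum_{b=1}^{B} \log W^{*}_{kb} \;-\; \log W_k

% Standard deviation of log W over the reference data sets.
\mathrm{sd}_k = \sqrt{\frac{1}{B}\sum_{b=1}^{B}
    \Bigl(\log W^{*}_{kb} - \frac{1}{B}\sum_{b'=1}^{B}\log W^{*}_{kb'}\Bigr)^{2}}

% Standard error, inflated to account for simulation error.
s_k = \mathrm{sd}_k\,\sqrt{1 + \frac{1}{B}}
```

In the paper, the suggested number of clusters is the smallest $k$ with $\mathrm{Gap}(k) \ge \mathrm{Gap}(k+1) - s_{k+1}$. The search method stored in the object may apply a different rule.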
increaseB | Increase reference data sets |
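As a hedged illustration of increaseB, the sketch below starts with a small number of reference data sets and then adds more without recreating the object. The call form increaseB(eva,nref) and the 'KList' and 'B' arguments follow the MathWorks documentation rather than this page, so verify them against your release. The fisheriris data set is used only as a convenient example.

```matlab
% Sketch only: grow the number of reference data sets on an existing object.
load fisheriris          % provides meas, a 150-by-4 numeric matrix
rng('default')           % reproducible reference data

% Start with a quick, coarse evaluation using 20 reference data sets.
eva = evalclusters(meas,'kmeans','Gap','KList',1:5,'B',20);

% Add 30 more reference data sets to refine the criterion values.
eva = increaseB(eva,30);

disp(eva.B)              % total number of reference data sets now in use
disp(eva.OptimalK)       % suggested number of clusters after the refinement
```

Starting small and increasing afterwards is useful when you are unsure how many reference data sets are needed for stable criterion values.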
[1] Tibshirani, R., G. Walther, and T. Hastie. “Estimating the number of clusters in a data set via the gap statistic.” Journal of the Royal Statistical Society: Series B. Vol. 63, Part 2, 2001, pp. 411–423.