Adjust Fuzzy Overlap in Fuzzy C-Means Clustering

This example shows how to adjust the amount of fuzzy overlap when performing fuzzy c-means clustering.

Create a random data set. For reproducibility, initialize the random number generator to its default value.

rng('default')
data = rand(100,2);

Specify fuzzy partition matrix exponents.

M = [1.1 2.0 3.0 4.0];

The exponent values in M must be greater than 1, with smaller values specifying a lower degree of fuzzy overlap. In other words, as M approaches 1, the boundaries between the clusters become more crisp.

For each overlap exponent:

  • Cluster the data.

  • Classify each data point into the cluster for which it has the highest degree of membership.

  • Find the data points with maximum membership values below 0.6. These points have a more fuzzy classification.

  • To quantify the degree of fuzzy overlap, calculate the average maximum membership value across all data points. A higher average maximum membership value indicates that there is less fuzzy overlap.

  • Plot the clustering results.

for i = 1:4
    % Cluster the data.
    options = [M(i) NaN NaN 0];
    [centers,U] = fcm(data,2,options);
    
    % Classify the data points.
    maxU = max(U);
    index1 = find(U(1,:) == maxU);
    index2 = find(U(2,:) == maxU);
    
    % Find data points with lower maximum membership values.
    index3 = find(maxU < 0.6);
    
    % Calculate the average maximum membership value.
    averageMax = mean(maxU);
    
    % Plot the results.
    subplot(2,2,i)
    plot(data(index1,1),data(index1,2),'ob')
    hold on
    plot(data(index2,1),data(index2,2),'or')
    plot(data(index3,1),data(index3,2),'xk','LineWidth',2)
    plot(centers(1,1),centers(1,2),'xb','MarkerSize',15,'LineWidth',3)
    plot(centers(2,1),centers(2,2),'xr','MarkerSize',15,'LineWidth',3)
    hold off
    title(['M = ' num2str(M(i)) ', Ave. Max. = ' num2str(averageMax,3)])
end

A given data point is classified into the cluster for which it has the highest membership value, as indicated by maxU. A maximum membership value of 0.5 indicates that the point belongs to both clusters equally. The data points marked with a black x have maximum membership values below 0.6. These points have a greater degree of uncertainty in their cluster membership.

More data points with low maximum membership values indicate a greater degree of fuzzy overlap in the clustering result. The average maximum membership value, averageMax, provides a quantitative description of the overlap. An averageMax value of 1 indicates crisp clusters, with smaller values indicating more overlap.

See Also

Related Topics