Learning Cluster Type and Dissimilarity Metric for Each Cluster Using a Set of Possible Cluster Types

Arash Arami, Babak Nadjar Araabi, Caro Lucas, Majid Nili Ahmadabadi

Abstract

A shortcoming of existing clustering methods is their difficulty in handling clusters of different shapes and sizes. Most of these methods are designed for specific cluster types or perform well only on clusters of a particular size and shape. The main problem is how to define a dissimilarity criterion that makes an algorithm capable of clustering general data, which include clusters of different shapes and sizes. Another important consideration is the computational complexity of any new algorithm. In this paper, a new approach to fuzzy clustering is proposed in which a model for each cluster is estimated during learning. In addition, a dissimilarity metric for each cluster is gradually defined, updated, and used in the next step. In our approach, instead of associating a single cluster type with each cluster, we assume a set of possible cluster types for each cluster, with different grades of possibility. Then a truncation, which can be viewed as an attention mechanism, focuses on the most probable cluster types for each cluster. This selection step reduces the computational load dramatically and speeds up the clustering. The proposed clustering method, which can handle partially labeled data, is applied to two families of data, first in the presence of partially labeled data and then with fully unlabeled data. Comparison of the experimental results with those of several important existing algorithms demonstrates the superior performance of the proposed method. The merit of this method is its ability to deal with clusters of different shapes and sizes while computing, for each cluster, a fuzzy membership value for the different shapes.
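The truncation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the three distance formulas (point, line, and circular shell prototypes), the renormalized weighted sum, and all names (`type_distances`, `truncate_grades`, `cluster_dissimilarity`, `keep`) are assumptions chosen for illustration. Each cluster carries a possibility grade per cluster type, only the highest-graded types are kept, and the dissimilarity of a point to the cluster is computed from those types alone.

```python
import math

def type_distances(x, center, direction, radius):
    """Distance from point x to three illustrative prototypes of one cluster.

    Assumed forms: 'mass' is a point prototype, 'linear' is a line through
    `center` along `direction`, 'shell' is a circle/sphere of given radius.
    """
    diff = [xi - ci for xi, ci in zip(x, center)]
    norm_diff = math.sqrt(sum(d * d for d in diff))
    norm_dir = math.sqrt(sum(u * u for u in direction))
    unit = [u / norm_dir for u in direction]
    along = sum(d * u for d, u in zip(diff, unit))  # projection on the line
    linear = math.sqrt(max(norm_diff ** 2 - along ** 2, 0.0))
    return {"mass": norm_diff, "linear": linear, "shell": abs(norm_diff - radius)}

def truncate_grades(grades, keep):
    """Keep the `keep` highest-graded types and renormalize them to sum to 1.

    Assumes the kept grades are positive.
    """
    top = sorted(grades, key=grades.get, reverse=True)[:keep]
    total = sum(grades[t] for t in top)
    return {t: grades[t] / total for t in top}

def cluster_dissimilarity(x, center, direction, radius, grades, keep=2):
    """Grade-weighted dissimilarity using only the retained cluster types."""
    weights = truncate_grades(grades, keep)
    # For brevity all distances are computed here; the saving described in
    # the abstract would come from evaluating only the retained types.
    dist = type_distances(x, center, direction, radius)
    return sum(w * dist[t] for t, w in weights.items())

# Hypothetical cluster: centered at the origin, line along the x-axis, radius 5.
grades = {"mass": 0.2, "linear": 0.5, "shell": 0.3}
d = cluster_dissimilarity((3.0, 4.0), (0.0, 0.0), (1.0, 0.0), 5.0, grades, keep=2)
```

With `keep=2` the lowest-graded type ("mass" in this hypothetical example) is dropped, so the point, which lies exactly on the shell, is scored only against the line and shell prototypes.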

Keywords

Clustering, Cluster Prototype, Mass Prototype, Linear Prototype, Shell Prototype, Fuzzy Membership Function, Attention Control


The CSI Journal on Computer Science and Engineering
