Methods Of Optimization Clustering


Optimization clustering

As one of the most popular and widely used data mining techniques, cluster analysis is mainly divided into hierarchical clustering and partitional clustering, which are carried out in a supervised or unsupervised way to separate data into groups based on similar characteristics. Both hierarchical and partitional clustering have advantages and drawbacks; efficiency and accuracy in particular are the primary challenges that cluster analysis has to face. For example, the most efficient hierarchical algorithms, such as complete-linkage clustering in some special cases, have a complexity of O(n²). Hierarchical clustering is therefore usually too slow for large data …

First, the idea of using PSO for k-means clustering was presented by van der Merwe and Engelbrecht. Unlike traditional k-means clustering, which starts from a single candidate solution, the new approach maintains a swarm of a fixed number of particles, each of which represents the centroids of all clusters. Over successive iterations, a new set of improved solutions is generated from the previous ones. As the first proposed PSO-based clustering method, it does not improve execution time, but it provides a new way to optimize clustering.

Second, Omran proposes the Dynamic Clustering PSO (DCPSO) algorithm, which combines binary PSO with k-means clustering. In this approach, PSO is used to cluster the data while k-means is used to refine the clustering solution. The number of clusters is determined automatically, and the data set is clustered with minimal user intervention. To reduce the effect of initial conditions, a relatively large number of clusters is generated first. The number of clusters is then optimized by binary particle swarm optimization, while the k-means algorithm is used to select the centroids. The approach was tested on both synthetic and natural images, and the results show that the optimum number of clusters is generally found for the tested images.
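
The first approach, in which every particle encodes the centroids of all clusters, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`fitness`, `pso_cluster`), the fitness measure (mean distance from each point to its nearest centroid), the parameter values (`w`, `c1`, `c2`, swarm size, iteration count) and the toy data are all assumptions chosen for illustration.

```python
import math
import random


def fitness(centroids, data):
    """Assumed fitness: mean distance from each point to its nearest centroid."""
    return sum(min(math.dist(p, c) for c in centroids) for p in data) / len(data)


def pso_cluster(data, k, n_particles=10, iters=50, w=0.72, c1=1.49, c2=1.49, seed=0):
    """Sketch of PSO clustering: each particle holds k candidate centroids."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise every particle with k centroids drawn from the data points.
    swarm = [[list(p) for p in rng.sample(data, k)] for _ in range(n_particles)]
    velocity = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    # Personal bests start as copies of the initial positions.
    pbest = [[c[:] for c in particle] for particle in swarm]
    pbest_fit = [fitness(particle, data) for particle in swarm]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]

    for _ in range(iters):
        for i, particle in enumerate(swarm):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard velocity update: inertia + cognitive + social terms.
                    velocity[i][j][d] = (
                        w * velocity[i][j][d]
                        + c1 * r1 * (pbest[i][j][d] - particle[j][d])
                        + c2 * r2 * (gbest[j][d] - particle[j][d])
                    )
                    particle[j][d] += velocity[i][j][d]
            fit = fitness(particle, data)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in particle], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in particle], fit
    return gbest, gbest_fit


# Toy data (hypothetical): two small grids of points around (0, 0) and (5, 5).
data = [(x * 0.1, y * 0.1) for x in range(5) for y in range(5)]
data += [(5 + x * 0.1, 5 + y * 0.1) for x in range(5) for y in range(5)]
centroids, error = pso_cluster(data, k=2, n_particles=20)
```

Because the swarm keeps many candidate centroid sets at once, the result depends less on a single initial guess than plain k-means does, at the cost of evaluating the fitness for every particle in every iteration. This is consistent with the observation that the method does not improve execution time.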
