preview

The Clustering Is A Data Mining Technique

Better Essays

Abstract: The Clustering is a data mining technique used to place data elements into related groups without advance knowledge of the group description, which is a division of data into groups of similar objects. The data representing by fewer clusters necessarily loses certain fine details, but achieves generalization. It models data by its clusters. The data modeling puts clustering in a historical perspective rooted in statistics, numerical analysis and mathematics. In this paper represents the performance of three clustering algorithms such as EM, DBSCAN and SimpleKMeans are evaluated. The Diabetes dataset is used for estimating and evaluating the time factor for predicting the performance of the algorithms by using clustering …show more content…

This paper presents comparison is made to find out which analysis option is the best for clustering algorithm called EM, DBSCAN and SimpleKMeans. The test option there are four kinds of parameter like supplied test set, training set, percentage spilt and class to clusters evaluation. The training set is used to calculate the data set values. This paper uses the Diabetes dataset for comparison of those algorithms.. The section 2 describes the literature review, Section 3 describes the methodology for the Diabetes dataset and Section 4 describes the experimental result. Finally Section 5 gives the Conclusion and Future work. 2. Literature Review: J.M. Pena et al., proposed to perform the optimization of the BN parameters using an alternative approach to the EM technique. We provide experimental results to show that our proposal results in a more effective and efficient version of the Bayesian Structural EM algorithm for learning BNs for clustering [2]. C. Ambroise et al., choosing a clustering algorithm that is well-suited for dealing with spatial data. In this algorithm, derivative from the EM algorithm has been designed for penalized likelihood estimation in situations with unobserved class labels and very satisfactory empirical results lead us to believe that this algorithm converges [3]. Miin-Shen Yang et al., proposed

Get Access