Data Mining: A Method of Extracting Data from Large Databases

Abstract: Data mining is the process of extracting useful data from large databases. Common data mining techniques include clustering, classification, association analysis, regression, summarization, time series analysis, and sequence analysis. Clustering is one of the important tasks in data mining and is often described as unsupervised classification. It is a technique used to group similar objects or processes. In this work, four clustering algorithms (K-Means, Farthest First, EM, and Hierarchical) are analyzed to cluster the data and to find outliers based on the number of clusters. The WEKA (Waikato Environment for Knowledge Analysis) tool is used to analyze the clustering techniques. Here the time, clustered and un-clustered…
Clustering plays an important role in the data mining process. Clustering is the approach of grouping data into classes or clusters so that the objects within each cluster are highly similar to one another [12]. A common approach among clustering techniques is to find the cluster centroids first and then assign the data to clusters around them. Clustering techniques include partitioning methods, hierarchical methods, density-based methods, grid-based methods, model-based methods, and constraint-based clustering. Clustering is a challenging field of research in which each potential application poses its own requirements [4]. Clustering is also called data segmentation, because a clustering method partitions a large data set into smaller groups according to their similarities. The main objective of cluster analysis is to increase intra-group similarity and inter-group dissimilarity.
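The centroid-based approach described above can be illustrated with a minimal K-Means sketch in plain Python. This is only an illustration under stated assumptions, not the WEKA procedure used in this work: the function name `k_means`, its parameters, the use of Euclidean distance, and the random choice of initial centroids from the data are all assumptions made for the example.

```python
import random
from math import dist  # Euclidean distance between two points (Python 3.8+)


def k_means(points, k, iterations=100, seed=0):
    """Hypothetical helper: group `points` (tuples of floats) into k clusters."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct points sampled from the data.
    centroids = rng.sample(points, k)

    def assign(current_centroids):
        # Put every point into the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, current_centroids[i]))
            clusters[nearest].append(p)
        return clusters

    for _ in range(iterations):
        clusters = assign(centroids)
        # Recompute each centroid as the mean of its cluster's points;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # centroids stopped moving: converged
            break
        centroids = new_centroids

    # Final assignment so the returned clusters match the returned centroids.
    return centroids, assign(centroids)


# Illustrative data (made up for this example): two well-separated groups.
sample_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
                 (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
sample_centroids, sample_clusters = k_means(sample_points, k=2)
```

Each pass increases intra-group similarity by moving every centroid to the mean of its members and then reassigning points to the nearest centroid. Points that end up far from every centroid are natural candidates for outliers.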
Detecting outliers is one of the important tasks. A failure to detect outliers, or handling them ineffectively, can have serious ramifications for the strength of the inferences drawn from the exercise [4]. Outlier detection has direct applications in a wide variety of domains, such as mining for anomalies to detect network intrusions, fraud detection in the mobile phone industry, and, more recently, detecting terrorism-related activities [5]. Outliers are found using the filters offered by data mining tools. Liver disorder is also referred to as