A Frame Work For Clustering Concept Drifting Categorical Data

Authors: Raja Vaghicharla, Ravi Vemuri, Ramakrishna Rama
Under guidance: Dr. Victor Shengli Sheng
Computer Science Department, University of Central Arkansas

Abstract: Data clustering is an important technique in data analysis and is studied in several research domains. Sampling has been important for improving the efficiency of clustering. However, once sampling is applied, the points that are not selected in the sample do not have labels after the normal process. Although many straightforward approaches exist in the numeric domain, we still have the problem of allocating these unlabeled…
To detect drifting concepts, we use the sliding window technique. The sliding window is one of the most important techniques in data mining; it removes obsolete transactions from the current window. With this technique, we can test whether the characteristics of the latest data points in the present window are similar to the last clustering result.

1.2 Node Importance Representative (NIR)

Data volumes are now large, so finding clusters in huge data sets is a difficult task. We therefore use a practical categorical cluster representative named Node Importance Representative (NIR). It represents clusters by measuring the importance of each attribute value in the clusters. Based on this representative, we propose Drift Concept Detection (DCD).

1.3 Drift Concept Detection (DCD)

In DCD, the incoming categorical data points in the present sliding window are first allocated to the corresponding proper cluster in the last clustering result, and the number of outliers that cannot be assigned to any cluster is counted. After that, the distributions of clusters and outliers in the last clustering result and in the current temporal clustering result are compared with each other. If the distribution has changed (exceeding some criteria), the concepts are said to drift. Otherwise the NIR will be
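
The DCD steps described above (allocate each point of the current window to a cluster of the last result, count the outliers, then compare the distributions) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The importance of an attribute value is assumed here to be simply its relative frequency inside the cluster, which stands in for the real NIR weighting, and the function names and the thresholds `min_resemblance`, `max_outlier_ratio` and `max_share_change` are hypothetical values chosen for the example.

```python
from collections import Counter


def build_representative(cluster_points):
    """Map each (attribute index, value) node to its relative frequency.

    Assumption: relative frequency is used as a simple stand-in for the
    importance of an attribute value; the real NIR weighting may differ.
    """
    counts = Counter()
    for point in cluster_points:
        for attr, value in enumerate(point):
            counts[(attr, value)] += 1
    size = len(cluster_points)
    return {node: count / size for node, count in counts.items()}


def resemblance(point, representative):
    """Sum the importance of every attribute value of the point."""
    return sum(representative.get((attr, value), 0.0)
               for attr, value in enumerate(point))


def assign(point, representatives, min_resemblance):
    """Return the best matching cluster label, or None for an outlier."""
    best_label, best_score = None, 0.0
    for label, rep in representatives.items():
        score = resemblance(point, rep)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= min_resemblance else None


def detect_drift(window, last_clusters, min_resemblance=1.5,
                 max_outlier_ratio=0.3, max_share_change=0.3):
    """Return True if the current window appears to drift.

    All three thresholds are hypothetical defaults for this sketch.
    """
    if not window:
        return False
    representatives = {label: build_representative(points)
                       for label, points in last_clusters.items()}
    labels = [assign(p, representatives, min_resemblance) for p in window]

    # Criterion 1: too many points could not be assigned to any cluster.
    if labels.count(None) / len(window) > max_outlier_ratio:
        return True

    # Criterion 2: the share of some cluster changed too much.
    last_total = sum(len(points) for points in last_clusters.values())
    assigned = Counter(label for label in labels if label is not None)
    for label, points in last_clusters.items():
        old_share = len(points) / last_total
        new_share = assigned[label] / len(window)
        if abs(new_share - old_share) > max_share_change:
            return True
    return False


# Toy data: two clusters from the last clustering result.
last_clusters = {
    "c1": [("a", "x", "m"), ("a", "x", "n"), ("a", "y", "m")],
    "c2": [("b", "z", "p"), ("b", "z", "q"), ("b", "w", "p")],
}
stable_window = [("a", "x", "m"), ("b", "z", "p"),
                 ("a", "y", "n"), ("b", "w", "q")]
drifted_window = [("c", "k", "r"), ("c", "k", "s"),
                  ("c", "j", "r"), ("a", "x", "m")]

print(detect_drift(stable_window, last_clusters))
print(detect_drift(drifted_window, last_clusters))
```

In this toy data the second window consists mostly of attribute values that never occur in the last clustering result, so most of its points are counted as outliers. A window in which every point still fits one old cluster can also be flagged, because the cluster shares are compared as well.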
