Essay On Splitup Function

The splitup function addresses a known weakness of DBSCAN: a highly uniform dataset may be incorrectly identified as a single cluster when DBSCAN is called with a reasonably large ε-reachability value. (A density-based algorithm usually forms relatively few clusters, even when those clusters need to be divided further.) The splitup function is also useful for datasets that pose specific challenges for density-based algorithms, such as those with large discrepancies in density. When it is unclear which clusters to split, the function takes into account the strength of the cluster as well as the distance between the new clusters …

The initial centroid for the “top” cluster is the data point farthest from the original cluster’s centroid. To find the initial centroid of the “bottom” cluster, we draw a vector from the original cluster’s centroid to the “top” centroid and then mirror it, so that it points in the opposite direction from the original centroid. The endpoint of this mirrored vector is used as the initial centroid for the “bottom” cluster. As a result, the initial centroids for “top” and “bottom” are as far away from each other as possible.

Figure 3. Pseudocode of the mirroring function used in the weighted split function of our CluDataSE.

All data points of the original cluster are then assigned to either the “top” or the “bottom” cluster, depending on which centroid is closer. Once every data point has been assigned, the “top” and “bottom” clusters are checked to confirm that each contains at least the minimum number of data points. If either cluster contains fewer than the minimum number of data points, the split is considered invalid. In this case, the original cluster is kept unchanged, and the “top” and “bottom” clusters formed by the splitup function are discarded. If a split has occurred, the two new clusters, “top” and “bottom” are stored in a list of new
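
The mirroring and validity check described above can be sketched in Python with NumPy. This is only an illustrative sketch, not the pseudocode from Figure 3: the function name `split_cluster`, the parameter `min_points`, the choice of the farthest point as the “top” seed, and the tie-breaking rule are all assumptions, and the weighting by cluster strength is not modelled.

```python
import numpy as np

def split_cluster(points, min_points):
    """Try to split one cluster into "top" and "bottom" parts.

    Hypothetical helper: returns (top, bottom) as arrays, or None when
    the split is invalid and the original cluster should be kept.
    """
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)

    # "Top" seed: the point farthest from the original centroid
    # (assumed reading of "far away" in the text).
    dists = np.linalg.norm(points - centroid, axis=1)
    top_seed = points[np.argmax(dists)]

    # "Bottom" seed: mirror the centroid-to-top vector through the centroid.
    bottom_seed = centroid - (top_seed - centroid)

    # Assign each point to the nearer seed; ties go to "top" (assumption).
    d_top = np.linalg.norm(points - top_seed, axis=1)
    d_bottom = np.linalg.norm(points - bottom_seed, axis=1)
    in_top = d_top <= d_bottom
    top, bottom = points[in_top], points[~in_top]

    # Reject the split if either part is too small.
    if len(top) < min_points or len(bottom) < min_points:
        return None
    return top, bottom
```

A caller would keep the original cluster whenever `split_cluster` returns `None`, and otherwise append the two returned arrays to its own list of new clusters.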
