Several methods are available to improve decision tree performance in terms of accuracy and modelling time. Since experimenting with every available method is impossible, a subset of methods that have been shown to improve decision tree performance was selected. The selected improvement methods and their experimental setups are presented in this chapter.
4.1 Correlation-Based Feature Selection
Feature selection is a method for reducing the number of dimensions of a dataset by removing irrelevant and redundant attributes. Given a set of attributes F and a target class C, the goal of feature selection is to find a minimal subset of F that yields the highest classification accuracy for C. Although
Also, a method that performs well for the C4.5 algorithm is likely to perform well for the ID3 algorithm. Previous studies show that the CFS method also increases accuracy for the CART algorithm, although not as much as for the C4.5 algorithm (Doraisamy et al., 2008).
CFS combines a search algorithm with a feature-subset evaluation function that uses a heuristic measuring the "goodness" of attribute subsets. Hall and Smith (1998) define this goodness heuristic as follows: "Good feature subsets contain features highly correlated with the class, yet uncorrelated with each other." Equation 1 below shows the heuristic formula:

G_x = \frac{k\,\overline{r_{ci}}}{\sqrt{k + k(k-1)\,\overline{r_{ii'}}}}
where G_x is the goodness heuristic of an attribute subset x containing k features, \overline{r_{ci}} is the average attribute-class correlation, which indicates the predictive power of the attribute subset with respect to the class, and \overline{r_{ii'}} is the average attribute inter-correlation, which indicates the redundancy among the attributes.
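As a minimal illustration (not part of the original experimental setup), the heuristic in Equation 1 can be computed directly from the two averaged correlations; the function name and the example values below are assumptions made purely for demonstration.

```python
import numpy as np

def cfs_merit(avg_feature_class_corr, avg_feature_feature_corr, k):
    """Compute the CFS goodness heuristic G_x for a subset of k features.

    avg_feature_class_corr   -- mean correlation between the features and the class
    avg_feature_feature_corr -- mean pairwise correlation among the features
    """
    numerator = k * avg_feature_class_corr
    denominator = np.sqrt(k + k * (k - 1) * avg_feature_feature_corr)
    return numerator / denominator

# Example: 5 features with strong class correlation and low redundancy ...
print(cfs_merit(0.6, 0.2, 5))  # merit = 1.0
# ... versus the same relevance but a highly redundant subset -> lower merit
print(cfs_merit(0.6, 0.8, 5))  # merit ~ 0.65
```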
The version of correlation-based attribute selection included in the experimental setup is called Fast Correlation-Based Feature Selection (FCBF), initially developed by Yu and Liu (2004). This algorithm is preferred over other available correlation-based attribute selection algorithms since, while other implementations of CFS use forward-sequential or greedy search methods (e.g. MRMR/CFS developed by Schoewe,
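For orientation, a sketch of the relevance-filtering stage of FCBF is shown below, using symmetrical uncertainty as the correlation measure described by Yu and Liu (2004); the function names, the threshold value, and the omission of the redundancy-removal stage are simplifications for illustration, not the exact implementation used in the experiments.

```python
import numpy as np
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a discrete sequence."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), the measure FCBF ranks features by."""
    h_x, h_y = entropy(x), entropy(y)
    h_xy = entropy(list(zip(x, y)))      # joint entropy of the pair
    info_gain = h_x + h_y - h_xy         # mutual information
    return 2.0 * info_gain / (h_x + h_y) if (h_x + h_y) > 0 else 0.0

def fcbf_relevance_filter(X, y, threshold=0.1):
    """First FCBF stage: keep features whose SU with the class exceeds a
    threshold, sorted by decreasing SU (redundancy removal would follow)."""
    scores = [(j, symmetrical_uncertainty(X[:, j], y)) for j in range(X.shape[1])]
    return sorted([(j, su) for j, su in scores if su > threshold],
                  key=lambda item: -item[1])
```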
Feature selection is a widely used dimensionality reduction technique; it allows the elimination of irrelevant and redundant features while retaining the underlying discriminatory information, and feature selection implies less data
AdaBoost and the C4.5 decision tree. The first classifier used is Naïve Bayes, which is a probabilistic classifier
Decision Trees are useful tools for helping you to choose between several courses of action.
Replacing this manual process with machine learning tools offers automated optimization that saves tremendous amounts of time and labor while providing a more accurate ranking.
For classification trees, the phi coefficient, the Gini index, and "twoing" are the most commonly used splitting criteria.
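As an illustration of one of these criteria, the sketch below computes the Gini index of a candidate binary split; the helper names and toy labels are assumptions for demonstration only.

```python
import numpy as np

def gini_index(labels):
    """Gini impurity of a set of class labels: 1 - sum_k p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gini_of_split(left_labels, right_labels):
    """Weighted Gini impurity of a binary split; lower means a better split."""
    n_left, n_right = len(left_labels), len(right_labels)
    n = n_left + n_right
    return (n_left / n) * gini_index(left_labels) + (n_right / n) * gini_index(right_labels)

# A split that separates the classes perfectly has zero weighted impurity
print(gini_of_split(["a", "a", "a"], ["b", "b"]))   # 0.0
print(gini_of_split(["a", "b", "a"], ["b", "a"]))   # ~0.47, a worse split
```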
Data warehouses, in contrast, are targeted for decision support. Historical, summarized and consolidated data is more important than detailed, individual records. Since data warehouses contain consolidated data, perhaps from several operational databases, over potentially long periods of time, they tend to be orders of magnitude larger than operational databases; enterprise data warehouses are projected to be hundreds of gigabytes to terabytes in size. The workloads are query intensive with mostly ad hoc, complex queries that can access millions of records and perform a lot of scans, joins, and aggregates. Query throughput and response times are more important than transaction throughput.
Feature selection is a step that finds a subset of the original feature set according to some criterion of feature importance. In this paper the concept of group feature selection is reviewed, covering the different approaches in this area. This literature review examines recent work on feature selection where the features possess a certain group structure, and the methods found for group feature selection are discussed. The general approaches to group feature selection, such as the group lasso and the sparse group lasso, are described. The group lasso is an extension of the standard lasso that performs group selection: if a group of features is selected, then all features in the group are selected. The sparse group lasso builds on the group lasso and produces an efficient solution with simultaneous between-group and within-group sparsity. Group feature selection tries to minimize the redundant and irrelevant features within groups in order to decrease computational time. The above categorization of group feature selection algorithms gives a view of future challenges and research
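For concreteness, the penalties discussed above can be written down directly; the sketch below is a purely illustrative NumPy formulation of the group lasso and sparse group lasso penalty terms (not an optimizer, and not code from the reviewed papers).

```python
import numpy as np

def group_lasso_penalty(beta, groups, lam):
    """Group lasso penalty: lam * sum_g sqrt(p_g) * ||beta_g||_2.

    beta   -- coefficient vector
    groups -- list of index arrays, one per feature group
    lam    -- regularization strength
    """
    return lam * sum(np.sqrt(len(g)) * np.linalg.norm(beta[g]) for g in groups)

def sparse_group_lasso_penalty(beta, groups, lam, alpha):
    """Sparse group lasso: mixes an L1 term (within-group sparsity)
    with the group lasso term (between-group sparsity)."""
    l1 = alpha * lam * np.abs(beta).sum()
    gl = (1 - alpha) * group_lasso_penalty(beta, groups, lam)
    return l1 + gl

# Toy example: two feature groups, the first entirely zeroed out
beta = np.array([0.0, 0.0, 1.5, -0.3, 0.0])
groups = [np.array([0, 1]), np.array([2, 3, 4])]
print(group_lasso_penalty(beta, groups, lam=0.5))
print(sparse_group_lasso_penalty(beta, groups, lam=0.5, alpha=0.3))
```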
For the smallest set, containing only ten samples, 19 of the 23 possible feature selection algorithms completed processing (4 feature selection algorithms could not be completed due to the 10-fold cross-validation used). For those 19 feature selection algorithms, 585 classification models were generated (a few of the ARFF files were empty for the lower feature thresholds due to the small number of samples). The 50-sample dataset completed 20 of the 23 possible feature selection algorithms, thereby generating 665 classification models. When using 100 samples, 20 of the 23 possible feature selection algorithms were completed and subsequently utilized to generate 665 classification models. The 200-sample dataset provided 20 of the 23 possible
In this thesis, a machine learning algorithm called Boosted Decision Tree (BDT) is used as a particle identification (PID) classifier.
The problem of variable selection has always been at the center of statistical research. Classical methods such as best subset selection, forward selection, and backward elimination were proposed to handle massive data sets. These methods are essentially based on the idea of grid search. For example, best subset selection searches for the optimal model over all possible combinations of the predictors. Forward selection finds the best model by adding one predictor at a time, while backward elimination finds the best model by removing one predictor at a time. Although these methods are powerful and accurate, they are also costly. For instance, suppose our model has 10 predictors; if we run best subset selection to obtain the optimal model, we need to compare 1024 combinations of the predictors. Performing such a method is often unrealistic in practice because it is time-consuming. As a result, more efficient methods for variable selection are urgently needed.
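The combinatorial cost is easy to verify: enumerating every candidate model for 10 predictors yields 2^10 = 1024 subsets, as the hypothetical sketch below shows (the predictor names are placeholders).

```python
from itertools import combinations

def all_subsets(predictors):
    """Enumerate every candidate model considered by best subset selection
    (including the empty model), i.e. 2^p subsets for p predictors."""
    for k in range(len(predictors) + 1):
        for subset in combinations(predictors, k):
            yield subset

predictors = [f"x{i}" for i in range(1, 11)]    # 10 hypothetical predictors
print(sum(1 for _ in all_subsets(predictors)))  # 1024 candidate models
```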
Primarily, data mining deals with the analysis of data sets for the identification of hidden patterns, trends and data values. Data mining in any line of business … correlations among complex, structured and unstructured, historical and potential future data sets for the purpose of predicting future events and assessing the attractiveness of the various courses of action. It is
The forward selection procedure used helped identify which variables were good predictors in the models. Linear regression has certain assumptions, so in order to not violate those assumptions, it is crucial to pick the best variables for the model. The best model we found using this procedure met all the assumptions and gave good prediction accuracies.
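As a hedged sketch of how such a forward selection procedure might be run (assuming scikit-learn, which is not necessarily the tool used in this work), a linear model can be wrapped in a sequential selector; the dataset and the number of features to select below are placeholders.

```python
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Illustrative only: forward selection of predictors for a linear model,
# not the exact procedure or data used in the study.
X, y = load_diabetes(return_X_y=True, as_frame=True)

selector = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=4,   # assumed target size; any stopping rule could be used
    direction="forward",
    cv=5,
)
selector.fit(X, y)
print(list(X.columns[selector.get_support()]))  # predictors kept by forward selection
```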
As we can see, the accuracy of both classifiers is intact even after removing many attributes. We can observe that the performance of both classifiers remains the same; it neither improves nor degrades. This indicates that the removed attributes did not contribute to the performance of either classifier and did not help them classify the instances. Thus, such dimensions should be
Ensemble techniques have become more popular than single models [1]. In this technique, more than one classifier is used for classification with higher efficiency, and each classifier in the classification model is trained on a different data chunk. With the help of advanced data streaming technologies [2], we are now able to collect large volumes of data for different application domains, for example credit card transactions and network traffic monitoring. The presence of irrelevant and redundant data slows down learning algorithms [3][4]; by removing or ignoring irrelevant and redundant features, prediction performance and computational efficiency can be improved. The multiclass miner works with a dynamic feature vector and detects novel classes. It is a combination of the OLINDDA and FAE approaches, which are used to detect novel classes and to classify data chunks
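OLINDDA and FAE themselves are not reproduced here, but the underlying chunk-based ensemble idea can be sketched as follows, with decision trees as assumed base classifiers and majority voting as an assumed combination rule.

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def train_chunk_ensemble(chunks):
    """Train one decision tree per (X, y) data chunk taken from a stream."""
    return [DecisionTreeClassifier().fit(X, y) for X, y in chunks]

def ensemble_predict(models, X):
    """Combine the per-chunk models by simple majority voting."""
    votes = np.array([model.predict(X) for model in models])
    return np.array([Counter(col).most_common(1)[0][0] for col in votes.T])

# Hypothetical usage: `stream_chunks` would be an iterable of (X, y) arrays
# models = train_chunk_ensemble(stream_chunks)
# predictions = ensemble_predict(models, X_new)
```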
The J48 classification algorithm considers all possible tests that can split the data set and selects the test that gives the best information gain. Whenever it encounters a training set, it identifies the attribute that discriminates among the instances most clearly. Among the possible values, if an attribute value corresponds to a single target value, that branch is ended
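A minimal sketch of the information gain computation that drives this choice is given below; the entropy helper and toy example are illustrative assumptions, not J48's exact implementation.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a set of class labels."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, attribute_values):
    """Information gain of splitting `labels` on a discrete attribute."""
    labels = np.asarray(labels)
    attribute_values = np.asarray(attribute_values)
    total = entropy(labels)
    n = len(labels)
    remainder = 0.0
    for v in np.unique(attribute_values):
        subset = labels[attribute_values == v]
        remainder += (len(subset) / n) * entropy(subset)
    return total - remainder

# A perfectly discriminating attribute recovers the full entropy of the labels
labels = ["yes", "yes", "no", "no"]
attr = ["sunny", "sunny", "rain", "rain"]
print(information_gain(labels, attr))   # 1.0 bit
```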