Knowledge Discovery and Data Mining pp 107-121
Comparative Study
Abstract
- Dimensionality Reduction is measured by the portion of candidate-input attributes removed by the algorithm (excluded from the network) and by the size of the information-theoretic network vs. other predictive models.
- Prediction Accuracy is the average accuracy of the network on validation cases vs. published accuracy of other classifiers.
- Stability represents the ability of the algorithms to provide similar results from different random samples of the same dataset.

The benchmark classification methods used for the comparison include:
- Naive Bayes Classifier. This is a probabilistic method that assumes conditional independence of all input attributes. See the details of the algorithm in (Mitchell, 1997).
- C4.5. This is a state-of-the-art decision tree algorithm presented in (Quinlan, 1993). Today, most commercial tools for constructing decision trees are based on C4.5 or one of its modified versions.
- CART™. This is an earlier decision tree method (Breiman et al., 1984). It is the engine of a commercial tool of the same name, available from Salford Systems (http://www.salford-systems.com/).
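The three evaluation criteria named in this abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the function names are hypothetical, and the abstract does not say how stability is quantified, so the mean pairwise Jaccard similarity of the selected attribute sets is used here purely as an assumed proxy.

```python
from itertools import combinations
from statistics import mean


def dimensionality_reduction(candidate_attributes, selected_attributes):
    """Fraction of candidate input attributes excluded from the model."""
    candidates = set(candidate_attributes)
    removed = candidates - set(selected_attributes)
    return len(removed) / len(candidates)


def prediction_accuracy(predicted, actual):
    """Share of validation cases whose predicted class matches the actual class."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def stability(selected_sets):
    """Assumed proxy: mean pairwise Jaccard similarity of the attribute
    sets selected on different random samples (needs at least two sets)."""
    scores = []
    for a, b in combinations(selected_sets, 2):
        union = a | b
        # Two empty selections are treated as identical.
        scores.append(len(a & b) / len(union) if union else 1.0)
    return mean(scores)


if __name__ == "__main__":
    # Hypothetical attribute names and labels, for illustration only.
    print(dimensionality_reduction(["a", "b", "c", "d"], ["a"]))
    print(prediction_accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
    print(stability([{"a", "b"}, {"a", "c"}, {"a", "b"}]))
```

Averaging `prediction_accuracy` over several validation runs would give the "average accuracy" the abstract refers to; the chapter itself may aggregate differently.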
Keywords
Dimensionality Reduction, Terminal Node, Training Error, Conditional Entropy, Input Attribute