Scalable Decision Tree Construction

Scalable classification tree construction; Scalable top-down decision tree construction; Tree-structured classifier


Decision trees are popular classification models that are usually constructed greedily, top-down, from a training dataset. In many modern applications the training dataset is very large, so decision tree construction algorithms that scale with the size of the training dataset are needed.
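
As an illustration of greedy top-down construction, the following is a minimal sketch and not any particular published algorithm. It assumes numeric attributes, binary threshold splits, and Gini impurity as the split criterion; none of these choices come from this entry, and the names gini, best_split, build_tree, and predict are hypothetical. The sketch keeps the whole training set in memory and rescans it at every node, which is the assumption that stops holding once the training dataset becomes very large.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature, threshold) pair with the lowest weighted Gini.

    Returns None when no split strictly improves on the unsplit node.
    """
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send records to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels):
    """Greedy top-down construction: pick the locally best split, then recurse."""
    split = best_split(rows, labels)
    if split is None:
        # Leaf node: predict the majority class of the records that reached it.
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left]),
        "right": build_tree([r for r, _ in right], [y for _, y in right]),
    }


def predict(tree, row):
    """Follow the split predicates from the root down to a leaf label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Toy training set: one numeric attribute, two classes.
rows = [(1.0,), (2.0,), (3.0,), (4.0,)]
labels = ["a", "a", "b", "b"]
tree = build_tree(rows, labels)
```

On this toy input the root splits on the single attribute at threshold 2.0, and both children are pure leaves. The choice at each node is never revisited, which is what makes the construction greedy.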

Historical Background

Decision trees, in particular classification trees, have a long history in both the statistics [4] and the machine learning [12, 13] communities. Scalability was not much of a concern until the advent of data mining brought training datasets that were orders of magnitude larger than those in traditional machine learning and statistics applications.

Scalability concerns in classification started with the work by Agrawal et al., who presented an interval classifier that generated classification functions that distinguish the different groups...

Recommended Reading

