Splitting Criteria Based on McDiarmid’s Theorem
Since Hoeffding’s inequality proved inapplicable for establishing splitting criteria based on the information gain and the Gini gain, a different statistical tool is required. This chapter introduces McDiarmid’s inequality, which generalizes Hoeffding’s inequality from sums of random variables to nonlinear functions that satisfy a bounded-differences condition. Further extensions and analysis of McDiarmid’s inequality can be found in . Based on McDiarmid’s inequality, two theorems are presented in this book: one for the information gain and one for the Gini index. These theorems were first published in . The obtained bounds were improved in [4, 5]. In the case of the Gini index, the corresponding bound was tightened further in . Hence, this book considers the bound for the information gain taken from  and the bound for the Gini index published in .
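For orientation, the standard textbook form of McDiarmid’s inequality can be sketched as follows. The notation ($X_1,\dots,X_n$, $f$, $c_i$, $\epsilon$, $a$, $b$) is assumed here for illustration and is not necessarily the notation used in the chapter itself; the last line shows how Hoeffding’s bound for the sample mean is recovered as a special case.

```latex
% Assumption: X_1, ..., X_n are independent random variables and f satisfies
% the bounded-differences condition with constants c_1, ..., c_n:
\sup_{x_1,\dots,x_n,\,x_i'}
  \left| f(x_1,\dots,x_i,\dots,x_n) - f(x_1,\dots,x_i',\dots,x_n) \right|
  \le c_i, \qquad i = 1,\dots,n.

% Then, for every \epsilon > 0 (McDiarmid's inequality):
\Pr\bigl( f(X_1,\dots,X_n) - \mathbb{E}[f(X_1,\dots,X_n)] \ge \epsilon \bigr)
  \le \exp\!\left( -\frac{2\epsilon^2}{\sum_{i=1}^{n} c_i^2} \right).

% Special case: f is the sample mean \bar{X} of variables X_i \in [a, b],
% so c_i = (b - a)/n and \sum_i c_i^2 = (b - a)^2 / n, which gives Hoeffding's bound:
\Pr\bigl( \bar{X} - \mathbb{E}[\bar{X}] \ge \epsilon \bigr)
  \le \exp\!\left( -\frac{2 n \epsilon^2}{(b-a)^2} \right).
```

Split measures such as the information gain and the Gini gain are nonlinear in the data, which is why a bound of the first kind, rather than the special case for the mean, is the natural starting point.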
- 1. McDiarmid, C.: On the method of bounded differences. Surveys in Combinatorics, pp. 148–188 (1989)
- 2. Combes, R.: An extension of McDiarmid’s inequality. CoRR (2015). arXiv:1511.05240
- 4. De Rosa, R., Cesa-Bianchi, N.: Splitting with confidence in decision trees with application to stream mining. In: 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2015)
- 7. Domingos, P., Hulten, G.: Mining high-speed data streams. In: Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 71–80 (2000)
- 8. Duda, P., Jaworski, M., Pietruczuk, L., Rutkowski, L.: A novel application of Hoeffding’s inequality to decision trees construction for data streams. In: 2014 International Joint Conference on Neural Networks (IJCNN), pp. 3324–3330 (2014)
- 9. Hulten, G., Spencer, L., Domingos, P.: Mining time-changing data streams. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 97–106 (2001)