Training Model Trees on Data Streams with Missing Values

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 584)


Model trees combine the interpretability of decision trees with the efficiency of multiple linear regression, making them useful for predictive analysis on data streams. However, missing values within the data streams are a problem during the training phase of a model tree. In this article, we compare different approaches for dealing with incomplete streams in order to measure their impact on the accuracy of the resulting model tree. Moreover, we propose an online method to estimate and adjust the missing values during stream processing. To demonstrate the results, a prototype has been developed and tested on several benchmarks.
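The online imputation idea described in the abstract can be illustrated with a minimal sketch: maintain per-feature running statistics and substitute the current estimate whenever a value is missing, updating the statistics as new observations arrive. The class name and the choice of a running mean as the estimator are illustrative assumptions for this sketch, not necessarily the method proposed in the paper.

```python
class StreamingMeanImputer:
    """Imputes missing stream values with per-feature running means."""

    def __init__(self, n_features):
        self.counts = [0] * n_features
        self.sums = [0.0] * n_features

    def process(self, record):
        """Return the record with None entries replaced by the current
        running mean; observed entries update the statistics."""
        imputed = []
        for i, value in enumerate(record):
            if value is None:
                # Before any observation for this feature, fall back to 0.0.
                mean = self.sums[i] / self.counts[i] if self.counts[i] else 0.0
                imputed.append(mean)
            else:
                self.counts[i] += 1
                self.sums[i] += value
                imputed.append(value)
        return imputed


imputer = StreamingMeanImputer(n_features=2)
print(imputer.process([1.0, 10.0]))   # [1.0, 10.0]
print(imputer.process([3.0, None]))   # [3.0, 10.0]
print(imputer.process([None, 20.0]))  # [2.0, 20.0]
```

Because each completed record can be fed directly to an incremental model-tree learner, the imputer adds only O(number of features) work per instance, which matches the single-pass constraint of stream processing.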


Keywords: Data streams · Model trees · Missing values imputation



The project is supported by a grant from the Ministry of Economy and External Trade, Grand-Duchy of Luxembourg, under the RDI Law. Moreover, this work has been realized in partnership with the infinAIt Solutions S.A. company; we would like to thank Gero Vierke and Helmut Rieder for their help.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Luxembourg Institute of Science and Technology (LIST), Belvaux, Luxembourg
