Data Wrangling: A Decisive Step for Compact Regression Trees

  • Olivier Parisot
  • Yoanne Didry
  • Thomas Tamisier
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8683)


Modern visualization and decision support platforms provide advanced, interactive tools for data wrangling in order to facilitate data analysis. Nonetheless, data wrangling remains a tedious process that requires extensive experience in data transformation. In this paper, we propose an automated data wrangling method, based on a genetic algorithm, that helps to obtain simpler regression trees.
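The abstract does not specify the chromosome encoding, the fitness function, or the tree learner used. Purely as an illustration of the general idea, the sketch below evolves a bitmask that decides which columns to keep before a toy regression tree is grown. Everything in it is an assumption made for this example and not the authors' method: the column-selection encoding, the greedy tree in `grow`, the fitness (leaf count plus a weighted mean squared error), the `penalty`, `min_leaf`, `min_gain`, and `p_mut` values, and the synthetic data.

```python
import random

def sse(ys):
    """Sum of squared errors of ys around their mean."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def grow(rows, ys, min_leaf=5, min_gain=0.05):
    """Greedily grow a toy regression tree; return (leaf_count, total_sse)."""
    base, best = sse(ys), None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows})[1:]:
            left = [y for r, y in zip(rows, ys) if r[j] < t]
            right = [y for r, y in zip(rows, ys) if r[j] >= t]
            if min(len(left), len(right)) < min_leaf:
                continue
            gain = base - sse(left) - sse(right)
            if gain > min_gain and (best is None or gain > best[0]):
                best = (gain, j, t)
    if best is None:  # no worthwhile split: this node is a leaf
        return 1, base
    _, j, t = best
    lo = [(r, y) for r, y in zip(rows, ys) if r[j] < t]
    hi = [(r, y) for r, y in zip(rows, ys) if r[j] >= t]
    l_leaves, l_err = grow([r for r, _ in lo], [y for _, y in lo], min_leaf, min_gain)
    r_leaves, r_err = grow([r for r, _ in hi], [y for _, y in hi], min_leaf, min_gain)
    return l_leaves + r_leaves, l_err + r_err

def make_fitness(rows, ys, penalty=10.0):
    """Build a cached fitness: leaf count + penalty * MSE (lower is better)."""
    cache = {}
    def fitness(mask):
        if mask not in cache:
            # "Wrangling" here is only column selection (an assumption).
            kept = [tuple(v for v, keep in zip(r, mask) if keep) for r in rows]
            leaves, err = grow(kept, ys)
            cache[mask] = leaves + penalty * err / len(ys)
        return cache[mask]
    return fitness

def evolve(fitness, n_genes, generations=10, pop_size=8, p_mut=0.1, seed=0):
    """Minimal genetic algorithm over bitmasks; requires n_genes >= 2."""
    rng = random.Random(seed)
    baseline = (1,) * n_genes  # the unwrangled data is always a candidate
    pop = [baseline] + [tuple(rng.randint(0, 1) for _ in range(n_genes))
                        for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness)
        nxt = pop[:2]  # elitism: carry over the two best masks
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 3), key=fitness)  # tournament selection
            b = min(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_genes)           # one-point crossover
            child = a[:cut] + b[cut:]
            nxt.append(tuple(1 - g if rng.random() < p_mut else g for g in child))
        pop = nxt
    return min(pop, key=fitness)

# Synthetic data: column 0 drives the target, columns 1-3 are noise.
data_rng = random.Random(42)
rows = [tuple(data_rng.random() for _ in range(4)) for _ in range(60)]
ys = [(1.0 if r[0] < 0.5 else 3.0) + data_rng.gauss(0, 0.2) for r in rows]

fitness = make_fitness(rows, ys)
best = evolve(fitness, n_genes=4)
print("kept columns:", best, "fitness:", round(fitness(best), 3))
print("baseline fitness:", round(fitness((1, 1, 1, 1)), 3))
```

Because the unwrangled baseline is seeded into the first population and the best masks are carried over each generation, the returned mask never scores worse than the baseline under this fitness. A realistic setup would search over a richer set of transformations and would measure error on held-out data.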


Keywords: data wrangling, genetic algorithms, decision support





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Olivier Parisot (1)
  • Yoanne Didry (1)
  • Thomas Tamisier (1)
  1. Public Research Centre Gabriel Lippmann, Belvaux, Luxembourg
