Preprocessing for Data Mining and Decision Support

  • Olga Štěpánková
  • Petr Aubrecht
  • Zdeněk Kouba
  • Petr Mikšovský
Part of The Springer International Series in Engineering and Computer Science book series (SECS, volume 745)


The goal of this chapter is to identify data preprocessing tasks that can benefit from software support and to describe the basic requirements for a tool that serves this purpose. These requirements are implemented in the data transformation tool SumatraTT, whose design principles and basic functionality are explained. The chapter concludes with a brief evaluation of the experience gained from using SumatraTT in different tasks and with a summary of plans for its further development.


Keywords: Data Mining, Data Transformation, Data Preparation, Data Mining Algorithm, Data Mining Process
These keywords were added by machine and not by the authors.





Copyright information

© Springer Science+Business Media New York 2003
