Data Mining with Multilayer Perceptrons and Support Vector Machines

  • Paulo Cortez
Part of the Intelligent Systems Reference Library book series (ISRL, volume 24)


Multilayer perceptrons (MLPs) and support vector machines (SVMs) are flexible machine learning techniques that can fit complex nonlinear mappings. MLPs are the most popular neural network type, consisting of a feedforward network of processing neurons that are grouped into layers and connected by weighted links. An SVM, on the other hand, transforms the input variables into a high-dimensional feature space and then finds the best hyperplane that models the data in that feature space. Both MLPs and SVMs are gaining increasing attention within the data mining (DM) field and are particularly useful when simpler DM models fail to provide satisfactory predictive performance. This tutorial chapter describes basic MLP and SVM concepts, under the CRISP-DM methodology, and shows how such learning tools can be applied to real-world classification and regression DM applications.
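
The two model families described above can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function names, the logistic activation, the Gaussian kernel, and all weights are assumptions chosen for illustration, and no training step is shown. The first function computes the output of a one-hidden-layer MLP; the last one evaluates an SVM-style decision function in which the kernel plays the role of the high-dimensional feature space.

```python
import math


def logistic(z):
    """Logistic activation, assumed here for the hidden neurons."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, hidden_weights, output_weights):
    """Forward pass of a feedforward MLP with one hidden layer.

    Each row of hidden_weights is [bias, w_1, ..., w_n] for one hidden neuron;
    output_weights is [bias, v_1, ..., v_h] for a single linear output neuron.
    """
    hidden = [
        logistic(row[0] + sum(w * xi for w, xi in zip(row[1:], x)))
        for row in hidden_weights
    ]
    return output_weights[0] + sum(
        v * h for v, h in zip(output_weights[1:], hidden)
    )


def gaussian_kernel(a, b, gamma=1.0):
    """Gaussian kernel: an inner product in an implicit feature space."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def svm_decision(x, support_vectors, coefficients, bias, gamma=1.0):
    """Hyperplane in feature space, written in terms of kernel evaluations."""
    return bias + sum(
        c * gaussian_kernel(sv, x, gamma)
        for sv, c in zip(support_vectors, coefficients)
    )


if __name__ == "__main__":
    # Hypothetical, hand-picked parameters; a real model would learn them from data.
    print(mlp_forward([0.0, 0.0], [[0.0, 1.0, 1.0]], [0.5, 2.0]))
    print(svm_decision([0.0, 0.0], [[0.0, 0.0], [1.0, 0.0]], [1.0, -1.0], 0.0))
```

In practice the weights, support vectors, and coefficients would be fitted by a learning algorithm rather than set by hand; the sketch only shows how a fitted model maps an input to a prediction.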


Keywords (added by machine, not by the authors): Support Vector Machine, Root Mean Square Error, Data Mining, True Positive Rate, Wine Quality





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Paulo Cortez
    Centro Algoritmi, Departamento de Sistemas de Informação, Universidade do Minho, Guimarães, Portugal
