Advances in Predictive Data Mining Methods

  • Se June Hong
  • Sholom M. Weiss
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1715)


Predictive models were widely used long before the development of the new field that we call data mining. Growing application demand for mining ever-larger data warehouses, together with the need for predictive models that are both understandable and more accurate, has fueled recent advances in automated predictive methods. We first examine a few successful application areas and the technical challenges they present. We then discuss theoretical developments in PAC learning and statistical learning theory that led to the emergence of support vector machines. Finally, we examine technical advances that improve model performance, both in accuracy (boosting, bagging, stacking) and in the scalability of modeling through distributed model generation.


Support Vector Machine, Text Mining, Fraud Detection, Optimal Hyperplane, Multiple Decision Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Se June Hong (1)
  • Sholom M. Weiss (1)
  1. IBM T.J. Watson Research Center, Yorktown Heights, USA
