
Parallel Predictor Generation

  • D. B. Skillicorn
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1759)

Abstract

Classification and regression are fundamental data mining techniques. The goal of such techniques is to build predictors based on a training dataset and use them to predict the properties of new data. For a wide range of techniques, combining predictors built on samples from the training dataset provides lower error rates, faster construction, or both, than a predictor built from the entire training dataset. This provides a natural parallelization strategy in which predictors based on samples are built independently and hence concurrently. We discuss the performance implications for two subclasses: those in which predictors are independent, and those in which knowing a set of predictors reduces the difficulty of finding a new one.
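As a rough illustration of the strategy the abstract describes (a sketch, not the paper's algorithm), the code below builds several simple predictors on independent samples of a toy training dataset concurrently and combines them by unweighted voting; the threshold-stump learner and all function names are invented for the example.

    # Minimal sketch of sample-based parallel predictor generation.
    # Not the paper's implementation; the stump learner is a placeholder.
    import random
    from concurrent.futures import ProcessPoolExecutor

    def train_stump(sample):
        """Fit a crude one-threshold classifier to a sample of (x, y) pairs."""
        xs = sorted(x for x, _ in sample)
        threshold = xs[len(xs) // 2]                     # split at the sample median
        above = [y for x, y in sample if x >= threshold]
        label = round(sum(above) / max(len(above), 1))   # majority label above the split
        return threshold, label

    def predict(model, x):
        threshold, label = model
        return label if x >= threshold else 1 - label

    def combined_predict(models, x):
        """Combine the independently built predictors by unweighted voting."""
        votes = [predict(m, x) for m in models]
        return round(sum(votes) / len(votes))

    if __name__ == "__main__":
        # Toy training dataset: the label is 1 exactly when x > 0.5.
        data = [(x, int(x > 0.5)) for x in (random.random() for _ in range(1000))]

        # Each predictor sees only its own sample, so the fits are independent
        # of one another and can be built concurrently on separate processes.
        samples = [random.sample(data, 100) for _ in range(8)]
        with ProcessPoolExecutor() as pool:
            models = list(pool.map(train_stump, samples))

        print(combined_predict(models, 0.8))   # expected 1
        print(combined_predict(models, 0.1))   # expected 0

In the abstract's terms this is the fully independent subclass; for the second subclass, where knowing a set of predictors reduces the difficulty of finding the next one, the single parallel map would be replaced by an outer sequential loop over rounds.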

Keywords

Training Dataset, Linear Speedup, Sequential Algorithm, Inductive Logic, Inductive Logic Programming

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • D. B. Skillicorn
  1. Department of Computing and Information Science, Queen's University, Kingston, Canada