
Parallel Predictor Generation

  • Conference paper
  • In: Large-Scale Parallel Data Mining

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1759)

Abstract

Classification and regression are fundamental data mining techniques. The goal of such techniques is to build predictors based on a training dataset and use them to predict the properties of new data. For a wide range of techniques, combining predictors built on samples from the training dataset provides lower error rates, faster construction, or both, than a predictor built from the entire training dataset. This provides a natural parallelization strategy in which predictors based on samples are built independently and hence concurrently. We discuss the performance implications for two subclasses: those in which predictors are independent, and those in which knowing a set of predictors reduces the difficulty of finding a new one.
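As an illustration of the parallelization strategy sketched in the abstract, the following Python snippet (not taken from the paper) builds predictors on independent bootstrap samples concurrently and combines them by majority vote, in the spirit of bagging. The choice of scikit-learn decision trees as the base predictor and ProcessPoolExecutor for concurrency are assumptions made for the example; any learner with fit/predict methods and any parallel framework would serve equally well.

    # Hedged sketch: predictors built on independent samples can be trained
    # concurrently, then combined by voting (the "independent" subclass).
    from concurrent.futures import ProcessPoolExecutor

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier


    def fit_on_sample(args):
        """Train one predictor on a bootstrap sample of the training data."""
        X, y, seed = args
        rng = np.random.default_rng(seed)
        idx = rng.integers(0, len(X), size=len(X))   # sample with replacement
        return DecisionTreeClassifier(random_state=seed).fit(X[idx], y[idx])


    def majority_vote(predictors, X):
        """Combine the independent predictors by majority vote."""
        votes = np.stack([p.predict(X) for p in predictors])
        return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)


    if __name__ == "__main__":
        X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
        # Each sample-based predictor is independent of the others, so all of
        # them can be constructed concurrently in separate processes.
        with ProcessPoolExecutor() as pool:
            predictors = list(pool.map(fit_on_sample, [(X, y, s) for s in range(8)]))
        accuracy = (majority_vote(predictors, X) == y).mean()
        print(f"training accuracy of combined predictor: {accuracy:.3f}")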







Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Skillicorn, D.B. (2002). Parallel Predictor Generation. In: Zaki, M.J., Ho, C.T. (eds) Large-Scale Parallel Data Mining. Lecture Notes in Computer Science (LNAI, volume 1759). Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46502-2_9


  • DOI: https://doi.org/10.1007/3-540-46502-2_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67194-7

  • Online ISBN: 978-3-540-46502-7

  • eBook Packages: Springer Book Archive
