Abstract
Classification and regression are fundamental data mining techniques. The goal of such techniques is to build predictors from a training dataset and use them to predict the properties of new data. For a wide range of techniques, combining predictors built on samples of the training dataset yields lower error rates, faster construction, or both, compared with a single predictor built from the entire training dataset. This suggests a natural parallelization strategy in which the predictors based on samples are built independently and hence concurrently. We discuss the performance implications for two subclasses of techniques: those in which the predictors are independent of one another, and those in which knowing a set of predictors reduces the difficulty of finding a new one.
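The following is a minimal sketch of the sample-based parallel strategy the abstract describes: each worker independently fits a predictor on a bootstrap sample of the training data, and the predictors are combined by majority vote. It assumes scikit-learn-style estimators and an arbitrary choice of base classifier and sample counts; it illustrates the general idea rather than the paper's own implementation or cost model.

```python
# Sketch: independent predictors built concurrently from samples, then combined.
from concurrent.futures import ProcessPoolExecutor

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def fit_on_sample(args):
    """Fit one predictor on a bootstrap sample drawn with the given seed."""
    X, y, seed = args
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X), size=len(X))  # bootstrap sample of the training set
    return DecisionTreeClassifier(random_state=seed).fit(X[idx], y[idx])


def combine(predictors, X):
    """Majority vote over the independently built predictors."""
    votes = np.stack([p.predict(X) for p in predictors])
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)


if __name__ == "__main__":
    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    # The predictors do not depend on one another, so they can be built in parallel.
    with ProcessPoolExecutor() as pool:
        predictors = list(pool.map(fit_on_sample, [(X, y, s) for s in range(8)]))
    print("training accuracy:", (combine(predictors, X) == y).mean())
```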
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Skillicorn, D.B. (2002). Parallel Predictor Generation. In: Zaki, M.J., Ho, C.-T. (eds) Large-Scale Parallel Data Mining. Lecture Notes in Computer Science, vol 1759. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46502-2_9
Print ISBN: 978-3-540-67194-7
Online ISBN: 978-3-540-46502-7