A Fast Data Preprocessing Procedure for Support Vector Regression

  • Conference paper
Intelligent Data Engineering and Automated Learning – IDEAL 2006 (IDEAL 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4224)

Abstract

A fast data preprocessing procedure (FDPP) for support vector regression (SVR) is proposed in this paper. In the presented method, the dataset is first divided into several subsets, and K-means clustering is then applied within each subset. The resulting clusters are classified by group size: centroids of small groups are eliminated, and the remaining centroids are used for SVR training. The relationship between group size and noisy clusters is discussed, and simulations are given. Results show that FDPP removes most of the noise, preserves the useful statistical information, and reduces the number of training samples. Most importantly, FDPP runs very fast and maintains the good regression performance of SVR.
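The steps named in the abstract (split, cluster each subset, drop small clusters, keep the remaining centroids) can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function names `kmeans` and `fdpp`, the contiguous equal-size split, clustering in the joint (input, target) space, and the `min_size` threshold are all assumptions, since the abstract does not specify them.

```python
import random
from math import dist  # Python 3.8+


def kmeans(points, k, iters=20, rng=None):
    """Plain Lloyd's K-means on tuples; returns (centroids, group sizes)."""
    rng = rng or random.Random(0)
    k = min(k, len(points))
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    groups = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's group.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centroids[j]))
            groups[nearest].append(p)
        # Update step: mean of each group (an empty group keeps its centroid).
        new_centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, [len(g) for g in groups]


def fdpp(points, n_subsets=2, k=3, min_size=2, seed=0):
    """Hypothetical FDPP-style filter: returns the centroids that are kept.

    points    -- list of tuples (x_1, ..., x_d, y); the target is clustered
                 together with the inputs (an assumption)
    n_subsets -- number of contiguous chunks the data is split into (assumption)
    min_size  -- clusters with fewer members are treated as noise (assumption)
    """
    rng = random.Random(seed)
    chunk = max(1, -(-len(points) // n_subsets))  # ceiling division
    kept = []
    for start in range(0, len(points), chunk):
        centroids, sizes = kmeans(points[start:start + chunk], k, rng=rng)
        kept.extend(c for c, s in zip(centroids, sizes) if s >= min_size)
    return kept


# Three nearby samples and one far-away sample in a single subset:
# the single-member cluster is discarded, the other centroid is kept.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (100.0, 100.0)]
reduced = fdpp(samples, n_subsets=1, k=2, min_size=2)
```

Under this reading, the kept centroids would replace the original samples as the training set for any SVR implementation (for example, scikit-learn's `sklearn.svm.SVR`), with the last coordinate of each centroid serving as the regression target.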


Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hao, Z., Wen, W., Yang, X., Lu, J., Zhang, G. (2006). A Fast Data Preprocessing Procedure for Support Vector Regression. In: Corchado, E., Yin, H., Botti, V., Fyfe, C. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2006. IDEAL 2006. Lecture Notes in Computer Science, vol 4224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875581_6

  • DOI: https://doi.org/10.1007/11875581_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45485-4

  • Online ISBN: 978-3-540-45487-8

  • eBook Packages: Computer Science, Computer Science (R0)
