Reduced Large Datasets by Fuzzy C-Mean Clustering Using Minimal Enclosing Ball

  • Conference paper
Management Intelligent Systems

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 171)

Abstract

A Minimal Enclosing Ball (MEB) is a spherical boundary around a set of normal data that separates it from abnormal data. The MEB scales poorly to large datasets because its computational load increases drastically with the size of the training set. To handle this problem for the huge datasets used in different domains, we propose two approaches based on the Fuzzy C-means clustering method. These approaches find concentric balls of minimum volume that contain most of the training samples, which reduces the chance of accepting abnormal data. Our method uses a divide-and-conquer strategy: it trains on each decomposed sub-problem to obtain support vectors, then retrains on those support vectors to find a global data description of the whole target class. We evaluate the method on speech data, with the aim of eliminating noisy data and reducing training time. For this, the training data, learned by Support Vector Machines (SVMs), is partitioned among several data sources. Computation of such SVMs can be achieved by finding a core-set for the image of the data. Numerical experiments on some real-world datasets verify the usefulness of our approaches for data mining.
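
The divide-and-conquer idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: it works in input space rather than in a kernel feature space, it assigns each point to the cluster with its highest fuzzy membership, and it uses the core-set points visited by the simple Bădoiu-Clarkson iteration as a stand-in for support vectors. The function names (`fuzzy_c_means`, `approx_meb`, `reduce_dataset`) and all parameter defaults are hypothetical.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fuzzy_c_means(points, c, m=2.0, iters=50, seed=0):
    """Plain fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    centers = rng.sample(points, c)  # assumption: random points as initial centers
    dim = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        u = []
        for p in points:
            w = [max(dist(p, ctr), 1e-12) ** (-2.0 / (m - 1.0)) for ctr in centers]
            s = sum(w)
            u.append([wj / s for wj in w])
        # Center update: mean of the points weighted by u_ij^m.
        centers = []
        for j in range(c):
            wts = [row[j] ** m for row in u]
            tot = sum(wts)
            centers.append([sum(w * p[k] for w, p in zip(wts, points)) / tot
                            for k in range(dim)])
    return centers, u

def approx_meb(points, iters=100):
    """Bădoiu-Clarkson iteration: step the center toward the farthest point."""
    center = list(points[0])
    core = set()  # indices of the farthest points visited (the core-set)
    for t in range(1, iters + 1):
        far = max(range(len(points)), key=lambda i: dist(center, points[i]))
        core.add(far)
        center = [ci + (pi - ci) / (t + 1) for ci, pi in zip(center, points[far])]
    # The radius is taken as the largest distance, so the ball encloses all points.
    radius = max(dist(center, p) for p in points)
    return center, radius, sorted(core)

def reduce_dataset(points, c=3, iters=100):
    """Cluster, fit one ball per cluster, keep only each cluster's core-set points."""
    _, u = fuzzy_c_means(points, c)
    groups = [[] for _ in range(c)]
    for p, row in zip(points, u):
        groups[row.index(max(row))].append(p)  # hard assignment (an assumption)
    reduced = []
    for g in groups:
        if g:  # a cluster may end up empty
            _, _, core = approx_meb(g, iters)
            reduced.extend(g[i] for i in core)
    return reduced
```

Calling `approx_meb(reduce_dataset(points))` then fits a global ball on the reduced set only. In this sketch that ball is not guaranteed to enclose every original point, so it should be read as an illustration of the data-reduction step rather than as a complete data description.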

Author information

Corresponding author

Correspondence to Lachachi Nour-Eddine.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nour-Eddine, L., Abdelkader, A. (2012). Reduced Large Datasets by Fuzzy C-Mean Clustering Using Minimal Enclosing Ball. In: Casillas, J., Martínez-López, F., Corchado Rodríguez, J. (eds) Management Intelligent Systems. Advances in Intelligent Systems and Computing, vol 171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30864-2_29

  • DOI: https://doi.org/10.1007/978-3-642-30864-2_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30863-5

  • Online ISBN: 978-3-642-30864-2

  • eBook Packages: Engineering, Engineering (R0)
