Semi-supervised Clustering Ensemble Evolved by Genetic Algorithm for Web Video Categorization

  • Conference paper
Advanced Data Mining and Applications (ADMA 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8347)

Abstract

Genetic Algorithms (GAs) are widely used in optimization problems because they can find good, acceptable solutions within limited time. Clustering ensembles have emerged as another approach to optimization, generating a more stable and robust partition from existing clusters. GAs have contributed substantially to finding consensus cluster partitions in clustering ensembles. Meanwhile, web video categorization has become an increasingly challenging research area as the social web has grown in popularity. In this paper, we propose a framework for web video categorization that uses the videos' textual features, video relations and web support. This work makes three contributions. First, we extend the traditional Vector Space Model (VSM) into a more generic Semantic VSM (S-VSM) by including the semantic similarity between feature terms. This new model improves clustering quality in terms of compactness (high intra-cluster similarity) and clearness (low inter-cluster similarity). Second, we optimize the clustering ensemble process with a GA that uses a novel fitness function: we define a new measure, Pre-Paired Percentage (PPP), and use it as the fitness function during the genetic cycle. Third, because defining the genetic operators, crossover and mutation, is the most crucial step of a GA, we express these operators through an intelligent clustering ensemble mechanism, which produces more logical offspring solutions. All three contributions show remarkable results in their respective areas. Experiments on real-world social-web data validate the proposed contributions.
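
The abstract names S-VSM and PPP without giving their formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' method. It assumes that S-VSM can be approximated by a cosine similarity weighted with a term-to-term similarity matrix, and that PPP can be read as the fraction of known pre-paired (must-link) items that a candidate partition places in the same cluster. The function names `semantic_cosine` and `paired_fraction`, the three-term vocabulary and the 0.9 similarity value are all hypothetical.

```python
from math import sqrt


def semantic_cosine(a, b, term_sim):
    """Cosine similarity generalised with a term-term similarity matrix.

    a, b: term-weight vectors of equal length.
    term_sim: square matrix with term_sim[i][j] in [0, 1] and 1.0 on the
    diagonal. With the identity matrix this is the ordinary VSM cosine.
    (Assumed form; the abstract does not give the S-VSM formula.)
    """
    def bilinear(x, y):
        return sum(x[i] * term_sim[i][j] * y[j]
                   for i in range(len(x)) for j in range(len(y)))

    denom = sqrt(bilinear(a, a)) * sqrt(bilinear(b, b))
    return bilinear(a, b) / denom if denom else 0.0


def paired_fraction(labels, must_link):
    """Fraction of known must-link pairs placed in the same cluster.

    labels: cluster label per item (one candidate partition).
    must_link: list of (i, j) index pairs known to belong together.
    (Assumed reading of PPP; the abstract does not define it.)
    """
    if not must_link:
        return 0.0
    hits = sum(1 for i, j in must_link if labels[i] == labels[j])
    return hits / len(must_link)


# Hypothetical vocabulary: ["car", "automobile", "music"].
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
semantic = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]]
doc_car = [1.0, 0.0, 0.0]
doc_automobile = [0.0, 1.0, 0.0]

plain_score = semantic_cosine(doc_car, doc_automobile, identity)
semantic_score = semantic_cosine(doc_car, doc_automobile, semantic)

# One candidate partition of four items and two known pairs.
fitness = paired_fraction([0, 0, 1, 1], [(0, 1), (1, 2)])
```

In this toy setup the two documents share no terms, so the plain VSM treats them as unrelated, while the semantic variant gives them credit for the related terms. In a GA, a score such as `fitness` could be used to rank candidate partitions.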

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mahmood, A., Li, T., Yang, Y., Wang, H. (2013). Semi-supervised Clustering Ensemble Evolved by Genetic Algorithm for Web Video Categorization. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8347. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53917-6_1

  • DOI: https://doi.org/10.1007/978-3-642-53917-6_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-53916-9

  • Online ISBN: 978-3-642-53917-6

  • eBook Packages: Computer Science, Computer Science (R0)
