Socially-Enriched Multimedia Data Co-clustering

Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

Abstract

Heterogeneous data co-clustering is a commonly used technique for tapping the rich meta-information of multimedia web documents, including category, annotation, and description, for associative discovery. However, most co-clustering methods proposed for heterogeneous data do not address the problem of representing short and noisy text, and their performance is limited by the empirical weighting of the multimodal features. This chapter explains how to use Generalized Heterogeneous Fusion Adaptive Resonance Theory (GHF-ART) to cluster large-scale web multimedia documents. Specifically, GHF-ART is designed to handle multimedia data with an arbitrarily rich level of meta-information. To handle short and noisy text, GHF-ART employs the representation and learning methods of PF-ART described in Sect. 3.5, which identify key tags for cluster prototype modeling by learning the probabilistic distribution of tag occurrences in clusters. More importantly, GHF-ART incorporates an adaptive method for fusing the multimodal features, which weights the features of multiple data sources by incrementally measuring the importance of each feature modality through its intra-cluster scatter. Extensive experiments on two web image datasets and one text document set show that GHF-ART achieves significantly better clustering performance and is much faster than many existing state-of-the-art algorithms. The content of this chapter is summarized and extended from [12] (©2014 IEEE. Reprinted, with permission, from [12]), and the Python code of GHF-ART is available at https://github.com/Lei-Meng/GHF-ART.
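The idea of weighting feature modalities by their intra-cluster scatter can be illustrated with a small sketch. This is not the GHF-ART implementation from the repository above. It is a simplified batch version written under stated assumptions: the function name `modality_weights`, the use of the cluster mean as prototype, the normalised L1 scatter, and the `exp(-scatter)` mapping are all illustrative choices, whereas the chapter's method updates the weights incrementally during clustering.

```python
import numpy as np


def modality_weights(features, labels):
    """Weight each feature modality by how tightly it groups the clusters.

    features: dict mapping a modality name to an (n_samples, dim) array.
    labels:   (n_samples,) array of cluster ids.
    Returns a dict of weights that sum to 1 (hypothetical helper).
    """
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    robustness = {}
    for name, X in features.items():
        X = np.asarray(X, dtype=float)
        scatters = []
        for c in clusters:
            members = X[labels == c]
            # Assumption: the cluster prototype is the mean of its members.
            prototype = members.mean(axis=0)
            # Mean L1 distance to the prototype, normalised by the
            # prototype's L1 norm (epsilon avoids division by zero).
            denom = np.abs(prototype).sum() + 1e-12
            dist = np.abs(members - prototype).sum(axis=1).mean()
            scatters.append(dist / denom)
        # Assumption: low average scatter maps to high robustness.
        robustness[name] = np.exp(-np.mean(scatters))
    total = sum(robustness.values())
    return {name: r / total for name, r in robustness.items()}


# Toy data: the "visual" modality separates the two clusters cleanly,
# while the "text" modality is mixed within each cluster.
toy_features = {
    "visual": [[1, 0], [1, 0], [0, 1], [0, 1]],
    "text": [[1, 0], [0, 1], [1, 0], [0, 1]],
}
toy_labels = [0, 0, 1, 1]
weights = modality_weights(toy_features, toy_labels)
```

In this toy setting the modality with the smaller intra-cluster scatter receives the larger weight, which is the behaviour the abstract describes for the adaptive fusion step.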

Notes

  1. http://qwone.com/~jason/20Newsgroups/

References

  1. Bekkerman R, Jeon J (2007) Multi-modal clustering for multimedia collections. In: CVPR, pp 1–8

  2. Carpenter GA, Grossberg S, Reynolds J (1991) ARTMAP: supervised real-time learning and classification of nonstationary data by a self-organizing neural network. Neural Netw 4(5):565–588

  3. Chen Y, Wang L, Dong M (2010) Non-negative matrix factorization for semisupervised heterogeneous data coclustering. IEEE Trans Knowl Data Eng 22(10):1459–1474

  4. Chua T, Tang J, Hong R, Li H, Luo Z, Zheng Y (2009) NUS-WIDE: a real-world web image database from National University of Singapore. In: CIVR, pp 1–9

  5. Gao B, Liu TY, Zheng X, Cheng QS, Ma WY (2005) Consistent bipartite graph co-partitioning for star-structured high-order heterogeneous data co-clustering. In: Proceedings of international conference on knowledge discovery and data mining, pp 41–50

  6. He J, Tan AH, Tan CL, Sung SY (2003) On quantitative evaluation of clustering systems. In: Clustering and information retrieval. Kluwer Academic Publishers, Netherlands, pp 105–133

  7. Hu X, Sun N, Zhang C, Chua TS (2009) Exploiting internal and external semantics for the clustering of short texts using world knowledge. In: Proceedings of ACM conference on information and knowledge management, pp 919–928

  8. Lang K (2005) Newsweeder: learning to filter netnews. In: Proceedings of international conference on machine learning, pp 331–339

  9. Li X, Snoek CGM, Worring M (2008) Learning tag relevance by neighbor voting for social image retrieval. In: Proceedings of ACM multimedia, pp 180–187

  10. Liu D, Hua X, Yang L, Wang M, Zhang H (2009) Tag ranking. In: Proceedings of international conference on World Wide Web, pp 351–360

  11. Long B, Wu X, Zhang Z, Yu PS (2006) Spectral clustering for multi-type relational data. In: ICML, pp 585–592

  12. Meng L, Tan AH, Xu D (2014) Semi-supervised heterogeneous fusion for multimedia data co-clustering. IEEE Trans Knowl Data Eng 26(9):2293–2306

  13. Rege M, Dong M, Hua J (2008) Graph theoretical framework for simultaneously integrating visual and textual features for efficient web image clustering. In: Proceedings of international conference on World Wide Web, pp 317–326

  14. Tan AH (1995) Adaptive resonance associative map. Neural Netw 8(3):437–446

  15. Xu R, Wunsch DC II (2011) BARTMAP: a viable structure for biclustering. Neural Netw 709–716

  16. Zhao Y, Karypis G (2001) Criterion functions for document clustering: experiments and analysis. Technical report, Department of Computer Science, University of Minnesota

Author information

Corresponding author

Correspondence to Lei Meng.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Meng, L., Tan, AH., Wunsch II, D.C. (2019). Socially-Enriched Multimedia Data Co-clustering. In: Adaptive Resonance Theory in Social Media Data Clustering. Advanced Information and Knowledge Processing. Springer, Cham. https://doi.org/10.1007/978-3-030-02985-2_5

  • DOI: https://doi.org/10.1007/978-3-030-02985-2_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02984-5

  • Online ISBN: 978-3-030-02985-2

  • eBook Packages: Computer Science, Computer Science (R0)
