Automatic Document Topic Identification Using Social Knowledge Network

  • Living reference work entry

Synonyms

Automatic document topic identification; Clustering; Ontology; Social knowledge network; Wikipedia

Glossary

ADTI:

Stands for automatic document topic identification

Ontology:

“A model for describing the world, that consists of a set of types (concepts), properties, and relationship types” (Garshol 2004)

SKN:

Stands for social knowledge network

WHO:

Stands for Wikipedia Hierarchical Ontology

TF-IDF:

A term weighting methodology that is commonly used in text mining and in information retrieval. It stands for term frequency-inverse document frequency

hi5:

An online social networking website

RDF:

Stands for Resource Description Framework. It is a method of representing information to facilitate data interchange on the Web

ASR:

Stands for automatic speech recognition

NMI:

Stands for normalized mutual information. It is a well-known document clustering performance measure

NMF:

Stands for nonnegative matrix factorization. Nonnegative matrix factorization is a family of algorithms...

References

  • Auer S, Lehmann J (2007) What have Innsbruck and Leipzig in common? Extracting semantics from wiki content. In: Franconi E, Kifer M, May W (eds) The semantic web: research and applications. Springer, Berlin/New York, pp 503–517

  • Coursey K, Mihalcea R (2009) Topic identification using Wikipedia graph centrality. In: Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the Association for Computational Linguistics, companion volume: short papers, Association for Computational Linguistics, Boulder, pp 117–120

  • Coursey K, Mihalcea R, Moen W (2009) Using encyclopedic knowledge for automatic topic identification. In: Proceedings of the thirteenth conference on computational natural language learning, Association for Computational Linguistics, Boulder, pp 210–218

  • European Travel Commission (2013) Social networking and UGC. http://www.newmediatrendwatch.com/world-overview/137-social-networking-and-ugc, June 2013. Online; Accessed 25 Oct 2013

  • Garshol L (2004) Metadata? Thesauri? Taxonomies? Topic maps! Making sense of it all. J Inf Sci 30(4):378

  • Giles J (2005) Internet encyclopaedias go head to head. Nature 438(7070):900–901

  • Hassan M (2013) Automatic document topic identification using hierarchical ontology extracted from human background knowledge. PhD dissertation, University of Waterloo

  • Huynh D, Cao T, Pham P, Hoang T (2009) Using hyperlink texts to improve quality of identifying document topics based on Wikipedia. In: International conference on knowledge and systems engineering, 2009 (KSE’09), IEEE, Hanoi, pp 249–254

  • Janik M, Kochut K (2008a) Training-less ontology-based text categorization. In: Workshop on exploiting semantic annotations in information retrieval (ESAIR 2008) at the 30th European conference on information retrieval, ECIR

  • Janik M, Kochut K (2008b) Wikipedia in action: ontological knowledge in text categorization. In: IEEE international conference on semantic computing, 2008, IEEE, Santa Clara, pp 268–275

  • Korfiatis NT, Poulos M, Bokos G (2006) Evaluating authoritative sources using social networks: an insight from Wikipedia. Online Inf Rev 30(3):252–262

  • Kuhn HW (2005) The Hungarian method for the assignment problem. Nav Res Logist 52(1):7–21

  • Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–137

  • Medelyan O (2009) Human-competitive automatic topic indexing. PhD dissertation, The University of Waikato

  • Medelyan O, Witten I, Milne D (2008) Topic indexing with Wikipedia. In: Proceedings of AAAI workshop on Wikipedia and artificial intelligence: an evolving synergy, AAAI, Chicago, pp 19–24

  • Ng A, Jordan M, Weiss Y (2002) On spectral clustering: analysis and an algorithm. Adv Neural Inf Process Syst 2:849–856

  • Popescul A, Ungar LH (2000) Automatic labeling of document clusters. http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.33.141&rep=rep1&type=pdf

  • Schönhofen P (2009) Identifying document topics using the Wikipedia category network. Web Intell Agent Syst 7(2):195–207

  • Xu W, Gong Y (2004) Document clustering by concept factorization. In: Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval, ACM, Sheffield, pp 202–209

  • Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, ACM, Toronto, pp 267–273

  • Zhao Y, Karypis G, Fayyad U (2005) Hierarchical clustering algorithms for document datasets. Data Min Knowl Discov 10(2):141–168

Author information

Correspondence to Mostafa M. Hassan.

Copyright information

© 2017 Springer Science+Business Media LLC

About this entry

Cite this entry

Hassan, M.M., Karray, F., Kamel, M.S. (2017). Automatic Document Topic Identification Using Social Knowledge Network. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7163-9_352-1

  • DOI: https://doi.org/10.1007/978-1-4614-7163-9_352-1

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-7163-9

  • Online ISBN: 978-1-4614-7163-9

  • eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
