Automatic Document Topic Identification Using Social Knowledge Network

Hassan, Mostafa M.; Karray, Fakhreddine; Kamel, Mohamed S.

doi:10.1007/978-1-4614-7163-9_352-1

Living reference work entry
First Online: 16 May 2017

Synonyms

Automatic document topic identification; Clustering; Ontology; Social knowledge network; Wikipedia

Glossary

ADTI:: Stands for automatic document topic identification
Ontology:: “A model for describing the world, that consists of a set of types (concepts), properties, and relationship types” (Garshol 2004)
SKN:: Stands for social knowledge network
WHO:: Stands for Wikipedia Hierarchical Ontology
TF-IDF:: Stands for term frequency-inverse document frequency. It is a term-weighting method commonly used in text mining and information retrieval
hi5:: An online social networking website
RDF:: Stands for Resource Description Framework. It is a method of representing information that facilitates data interchange on the Web
ASR:: Stands for automatic speech recognition
NMI:: Stands for normalized mutual information. It is a well-known document clustering performance measure
NMF:: Stands for nonnegative matrix factorization. Nonnegative matrix factorization is a family of algorithms...
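
The TF-IDF weighting named in the glossary can be illustrated with a minimal sketch. It assumes the plain textbook variant (relative term frequency multiplied by the natural log of the inverse document frequency); the entry does not state which variant the authors use, and the function name tf_idf and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus, already tokenized by whitespace.
docs = [
    "wikipedia is a social knowledge network".split(),
    "clustering groups similar documents".split(),
    "wikipedia categories label documents".split(),
]
weights = tf_idf(docs)
```

In this sketch a term that occurs in every document receives weight zero, and a term confined to one document outweighs an equally frequent term shared by several documents.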

References

Auer S, Lehmann J (2007) What have Innsbruck and Leipzig in common? Extracting semantics from wiki content. In: Franconi E, Kifer M, May W (eds) The semantic web: research and applications. Springer, Berlin/New York, pp 503–517
Coursey K, Mihalcea R (2009) Topic identification using Wikipedia graph centrality. In: Proceedings of human language technologies: the 2009 annual conference of the North American chapter of the Association for Computational Linguistics, companion volume: short papers, Association for Computational Linguistics, Boulder, pp 117–120
Coursey K, Mihalcea R, Moen W (2009) Using encyclopedic knowledge for automatic topic identification. In: Proceedings of the thirteenth conference on computational natural language learning, Association for Computational Linguistics, Boulder, pp 210–218
European Travel Commission (2013) Social networking and UGC. http://www.newmediatrendwatch.com/world-overview/137-social-networking-and-ugc, June 2013. Online; Accessed 25 Oct 2013
Garshol L (2004) Metadata? Thesauri? Taxonomies? Topic maps! Making sense of it all. J Inf Sci 30(4):378
Giles J (2005) Internet encyclopaedias go head to head. Nature 438(7070):900–901
Hassan M (2013) Automatic document topic identification using hierarchical ontology extracted from human background knowledge. PhD dissertation, University of Waterloo
Huynh D, Cao T, Pham P, Hoang T (2009) Using hyperlink texts to improve quality of identifying document topics based on Wikipedia. In: International conference on knowledge and systems engineering, 2009 (KSE’09), IEEE, Hanoi, pp 249–254
Janik M, Kochut K (2008a) Training-less ontology-based text categorization. In: Workshop on exploiting semantic annotations in information retrieval (ESAIR 2008) at the 30th European conference on information retrieval, ECIR
Janik M, Kochut K (2008b) Wikipedia in action: ontological knowledge in text categorization. In: IEEE international conference on semantic computing, 2008, IEEE, Santa Clara, pp 268–275
Korfiatis NT, Poulos M, Bokos G (2006) Evaluating authoritative sources using social networks: an insight from Wikipedia. Online Inf Rev 30(3):252–262
Kuhn HW (2005) The Hungarian method for the assignment problem. Nav Res Logist 52(1):7–21
Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–137
Medelyan O (2009) Human-competitive automatic topic indexing. PhD dissertation, The University of Waikato
Medelyan O, Witten I, Milne D (2008) Topic indexing with Wikipedia. In: Proceedings of AAAI workshop on Wikipedia and artificial intelligence: an evolving synergy, AAAI, Chicago, pp 19–24
Ng A, Jordan M, Weiss Y et al (2002) On spectral clustering: analysis and an algorithm. Adv Neural Inf Process Syst 2:849–856
Popescul A, Ungar LH (2000) Automatic labeling of document clusters. http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.33.141&rep=rep1&type=pdf
Schönhofen P (2009) Identifying document topics using the Wikipedia category network. Web Intell Agent Syst 7(2):195–207
Xu W, Gong Y (2004) Document clustering by concept factorization. In: Proceedings of the 27th annual international ACM SIGIR conference on research and development in information retrieval, ACM, Sheffield, pp 202–209
Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, ACM, Toronto, pp 267–273
Zhao Y, Karypis G, Fayyad U (2005) Hierarchical clustering algorithms for document datasets. Data Min Knowl Discov 10(2):141–168

Author information

Authors and Affiliations

Sandvine Inc., Waterloo, ON, Canada
Mostafa M. Hassan
Department of Electrical and Computer Engineering, Centre for Pattern Analysis and Machine Intelligence (CPAMI), University of Waterloo, Waterloo, ON, Canada
Fakhreddine Karray & Mohamed S. Kamel

Corresponding author

Correspondence to Mostafa M. Hassan.

Editor information

Editors and Affiliations

Computer Science, University of Calgary, Calgary, Alberta, Canada
Reda Alhajj
University of Calgary Computer Science, Calgary, Canada
Jon Rokne

Section Editor information

Department of Electrical and Computer Engineering, Centre for Pattern Analysis and Machine Intelligence (CPAMI), University of Waterloo, Waterloo, ON, Canada
Fakhreddine Karray

About this entry

Cite this entry

Hassan, M.M., Karray, F., Kamel, M.S. (2017). Automatic Document Topic Identification Using Social Knowledge Network. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7163-9_352-1

DOI: https://doi.org/10.1007/978-1-4614-7163-9_352-1
Received: 27 March 2017
Accepted: 24 April 2017
Published: 16 May 2017
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7163-9
Online ISBN: 978-1-4614-7163-9
eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
