Semantic Search Using Computer Science Ontology Based on Edge Counting and N-Grams

  • Thanyaporn Boonyoung
  • Anirach Mingkhwan
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 265)

Abstract

Traditional information retrieval systems (keyword-based search) suffer from several problems. For instance, synonyms and hyponyms are not taken into consideration when retrieving documents that are relevant to a user’s query. This study adopts a computer science ontology and proposes an ontology indexing weight based on Wu and Palmer’s edge counting measure to address this problem. The N-grams method is used to compute similarity among a family of words. The study also compares the subsumption weight with Hliaoutakis and Nicola’s weight for the query keywords Decision Making, Genetic Algorithm, Machine Learning, and Heuristic. The p-value from the t-test (p = 0.105) is higher than 0.05, indicating no significant difference between the two weighting methods. The experimental results, based on the document similarity score between a user’s query and each paper, suggest that the new measure ranks documents effectively.
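
The two similarity ingredients named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy taxonomy, the function names, the convention that the root has depth 1, and the use of a Dice coefficient over character bigrams as one simple member of the N-gram similarity family. The paper's own ontology indexing weight is not reproduced.

```python
# Hypothetical toy taxonomy (child -> parent); not the ontology used in the paper.
PARENT = {
    "artificial intelligence": "computer science",
    "machine learning": "artificial intelligence",
    "heuristic": "artificial intelligence",
    "genetic algorithm": "heuristic",
}


def path_to_root(concept):
    """Return the chain [concept, parent, ..., root]."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1 (an assumed convention)."""
    return len(path_to_root(concept))


def wu_palmer(c1, c2):
    """Wu-Palmer edge-counting similarity: 2*depth(lcs) / (depth(c1) + depth(c2))."""
    ancestors_of_c2 = set(path_to_root(c2))
    # The first shared node on the way up from c1 is the least common subsumer.
    lcs = next(c for c in path_to_root(c1) if c in ancestors_of_c2)
    return 2 * depth(lcs) / (depth(c1) + depth(c2))


def char_ngrams(text, n=2):
    """Set of character n-grams of the lower-cased text."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_dice(a, b, n=2):
    """Dice coefficient over character n-gram sets (an assumed N-gram measure)."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


if __name__ == "__main__":
    print(wu_palmer("machine learning", "genetic algorithm"))
    print(ngram_dice("heuristic", "heuristics"))
```

In this sketch, concepts that share a deep common ancestor score close to 1 under the edge-counting measure, while the bigram measure lets surface variants of the same word (such as singular and plural forms) match a query term even though they are different strings.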

Keywords

Computer Science Ontology, Semantic Search, Ontology Indexing

References

  1. Lai, L.-F., Wu, C.-C., Lin, P.-Y.: Developing a Fuzzy Search Engine Based on Fuzzy Ontology and Semantic Search. In: IEEE International Conference on Fuzzy, pp. 2684–2689. IEEE Press, Taipei (2011)
  2. Hliaoutakis, A., Varelas, G., Voutsakis, E., Petrakis, E.G.M., Milios, E.: Information Retrieval by Semantic Similarity. International Journal on Semantic Web and Information Systems (IJSWIS) 2(3) (2006)
  3. Varelas, G., Voutsakis, E., Raftopoulou, P., Petrakis, E.G.M., Milios, E.: Semantic Similarity Methods in Wordnet and their Application to Information Retrieval on the web. In: ACM International Workshop on Web Information and Data Management, pp. 10–130. ACM, Bremen (2005)
  4. Shenoy, K.M., Shet, K.C., Acharya, U.D.: A New Similarity Measure for Taxonomy based on Edge Counting. International Journal of Web & Semantic Technology (IJWesT) 3(4), 23–30 (2012)
  5. Schwering, A., Kuhn, W.: A Hybrid Semantic Similarity Measure for Spatial Information Retrieval. An Interdisciplinary Journal of Spatial Cognition & Computation 9(1), 30–63 (2009)
  6. Fernandez, M., Cantador, I., Lopez, V., Vallet, D., Castells, P., Motta, E.: Semantically enhanced Information Retrieval: An ontology-based approach. Journal of Web Semantics: Science, Services and Agents on the World Wide Web 9, 434–452 (2010)
  7. Weng, S.-S., Tsai, H.-J., Hsu, C.-H.: Ontology construction for information classification. Journal of Expert Systems with Applications 31(1), 1–12 (2006)
  8. John, T.: What is Semantic Search and how it works with Google search, http://www.techulator.com/resources/5933-What-Semantic-Search.aspx
  9. Batet, M., Sanchez, D., Valls, A.: An Ontology-based measure to compute semantic similarity in biomedicine. Journal of Biomedical Informatics 44, 118–125 (2011)
  10. Wu, Z., Palmer, M.: Verb semantics and lexical selection. In: Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, vol. 13, pp. 133–138 (1994)
  11. Kondrak, G.: N-Gram Similarity and Distance. In: Consens, M.P., Navarro, G. (eds.) SPIRE 2005. LNCS, vol. 3772, pp. 115–126. Springer, Heidelberg (2005)
  12. Sembok, T.M., Bakar, Z.A.: Effectiveness of Stemming and N-grams String Similarity Matching on Malay Documents. International Journal of Applied Mathematics and Informatics 5(3), 208–215 (2011)
  13. Stoke, N.: Applications of Lexical Cohesion Analysis in the Topic Detection and Tracking Domain. A thesis submitted for the degree of Doctor of Philosophy in Computer Science, Department of Computer Science, Faculty of Science, National University of Ireland, Dublin (2004)
  14. Watthananon, J., Mingkhwan, A.: A Comparative Efficiency of Correlation Plot Data Classification. The Journal of KMUTNB 22(1) (2012)
  15. Lertmahakrit, W., Mingkhoan, A.: The Innovation of Multiple Relations Information Retrieval. The Journal of KMUTNB 20(3) (2010)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Information Technology, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
  2. Industrial and Technology Management, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
