An Ontology-Based Latent Semantic Indexing Approach Using Long Short-Term Memory Networks

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10366)

Abstract

The volume of online data is growing rapidly, and semantic indexing remains an open problem. Ontologies and knowledge bases have been widely used to improve performance. However, researchers place increasing emphasis on the internal relations of ontologies while neglecting the latent semantic relations between ontologies and documents; they generally only annotate the instances mentioned in documents that relate to concepts in ontologies. In this paper, we propose an Ontology-based Latent Semantic Indexing approach using Long Short-Term Memory networks (LSTM-OLSI). We use an importance-aware topic model to extract document-level semantic features and leverage ontologies to extract word-level contextual features. We then encode these two levels of features and match their embedding vectors using LSTM networks. Experimental results show that LSTM-OLSI outperforms existing techniques and demonstrates a deep comprehension of instances and articles.

Acknowledgments

This research is supported by the National Natural Science Foundation of China (Grant No. 61375054), the Natural Science Foundation of Guangdong Province (Grant No. 2014A030313745), the Basic Scientific Research Program of Shenzhen City (Grant No. JCYJ20160331184440545), and the Cross Fund of the Graduate School at Shenzhen, Tsinghua University (Grant No. JC20140001).

Corresponding author

Correspondence to Hai-Tao Zheng.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Ma, N., Zheng, H.-T., Xiao, X. (2017). An Ontology-Based Latent Semantic Indexing Approach Using Long Short-Term Memory Networks. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science, vol 10366. Springer, Cham. https://doi.org/10.1007/978-3-319-63579-8_15

  • DOI: https://doi.org/10.1007/978-3-319-63579-8_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63578-1

  • Online ISBN: 978-3-319-63579-8

  • eBook Packages: Computer Science, Computer Science (R0)
