Similarity Evaluation with Wikipedia Features

  • Conference paper

Part of the book series: IFIP Advances in Information and Communication Technology ((IFIPAICT,volume 581))

Abstract

Wikipedia provides rich semantic features, e.g., text, links, and category structure. These features can be used to compute semantic similarity (SS) between words or concepts. However, some existing Wikipedia-based SS methods either rely on a single feature or do not incorporate the underlying statistics of different features. We propose novel vector representations of Wikipedia concepts that integrate multiple semantic features. We use the statistics of these features available in Wikipedia to compute their weights, so that each feature contributes to the similarity evaluation according to its level of importance. The experimental evaluation shows that our new methods obtain better results on SS datasets than state-of-the-art SS methods.
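
As a rough illustration of the general idea of weighted multi-feature concept vectors, the sketch below represents each concept as a sparse vector over typed features (links and categories) and compares concepts with cosine similarity. It is a minimal sketch under stated assumptions, not the method proposed in the paper: the IDF-style weighting, the toy feature sets, and all function names (`feature_weights`, `to_vector`, `cosine`) are hypothetical choices made for this example.

```python
import math
from collections import Counter

# Toy data (hypothetical): each concept maps to typed features,
# written as (feature_type, feature_name) pairs.
concepts = {
    "Car": [("link", "Engine"), ("link", "Wheel"), ("category", "Vehicles")],
    "Truck": [("link", "Engine"), ("link", "Cargo"), ("category", "Vehicles")],
    "Banana": [("link", "Fruit"), ("category", "Plants")],
}


def feature_weights(concepts):
    """Assumed weighting: IDF-style, so rarer features weigh more."""
    n = len(concepts)
    df = Counter(f for feats in concepts.values() for f in set(feats))
    return {f: math.log(n / count) + 1.0 for f, count in df.items()}


def to_vector(features, weights):
    """Sparse vector: one weighted dimension per distinct feature."""
    return {f: weights[f] for f in set(features)}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[f] for f, w in u.items() if f in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


weights = feature_weights(concepts)
vectors = {name: to_vector(feats, weights) for name, feats in concepts.items()}

sim_car_truck = cosine(vectors["Car"], vectors["Truck"])
sim_car_banana = cosine(vectors["Car"], vectors["Banana"])
```

In this toy setup, "Car" and "Truck" share two features and receive a positive similarity, while "Car" and "Banana" share none and score zero. Weights derived from real Wikipedia statistics would replace the assumed IDF-style weights.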

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant Nos. 61772210 and U1911201; the Guangdong Province Universities Pearl River Scholar Funded Scheme (2018); and the Project of Science and Technology in Guangzhou in China under Grant No. 201807010043.

Author information

Corresponding author

Correspondence to Yuncheng Jiang.

Copyright information

© 2020 IFIP International Federation for Information Processing

About this paper

Cite this paper

Wasti, S., Hussain, J., Huang, G., Jiang, Y. (2020). Similarity Evaluation with Wikipedia Features. In: Shi, Z., Vadera, S., Chang, E. (eds) Intelligent Information Processing X. IIP 2020. IFIP Advances in Information and Communication Technology, vol 581. Springer, Cham. https://doi.org/10.1007/978-3-030-46931-3_10

  • DOI: https://doi.org/10.1007/978-3-030-46931-3_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-46930-6

  • Online ISBN: 978-3-030-46931-3

  • eBook Packages: Computer Science, Computer Science (R0)
