Semantic similarity aggregators for very short textual expressions: a case study on landmarks and points of interest

  • Jorge Martinez-Gil
Article

Abstract

Semantic similarity measurement aims to automatically compute the degree of similarity between two textual expressions that use different representations for naming the same concepts. However, very short textual expressions do not always follow the syntax of a written language and, in general, do not provide enough information to support proper analysis. As a consequence, in some fields, such as the processing of landmarks and points of interest, the results of existing methods are not entirely satisfactory. To overcome this situation, we explore the idea of aggregating existing methods by means of two novel aggregation operators that aim to model an appropriate interaction between the similarity measures. In this way, we have been able to improve on the results of existing techniques on GeReSiD and SDTS, two of the most popular benchmark datasets for dealing with geographical information.
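
To make the idea of aggregating similarity measures concrete, the following is a minimal sketch of a generic ordered weighted averaging (OWA) operator applied to the scores of several measures. It is not one of the two operators proposed in the article, which are not described in this abstract; the function name `owa`, the scores, and the weights are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average of similarity scores.

    The weights are attached to rank positions (largest score first),
    not to particular similarity measures. This is a standard operator
    used here for illustration, not the article's own method.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    ordered = sorted(scores, reverse=True)  # rank scores from high to low
    return sum(w * s for w, s in zip(weights, ordered)) / total


# Hypothetical scores from three measures for one pair of short expressions.
scores = [0.9, 0.4, 0.7]
# Hypothetical rank weights that favour the most optimistic measure.
weights = [0.5, 0.3, 0.2]
aggregated = owa(scores, weights)  # 0.5*0.9 + 0.3*0.7 + 0.2*0.4
print(round(aggregated, 2))
```

Because the weights follow the ranking of the scores rather than the identity of each measure, shifting weight toward the first positions makes the aggregate more optimistic, and shifting it toward the last positions makes it more conservative.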

Keywords

Knowledge engineering; Data integration; Semantic similarity measurement

Notes

Acknowledgments

We would like to thank the anonymous reviewers for their helpful and constructive comments, which greatly contributed to improving this work. The research reported in this paper has been supported by the Austrian Ministry for Transport, Innovation and Technology, the Federal Ministry of Science, Research and Economy, and the Province of Upper Austria in the frame of the COMET Center SCCH [FFG: 844597].

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Software Competence Center Hagenberg GmbH, Hagenberg, Austria