Multimedia Tools and Applications, Volume 78, Issue 14, pp 19807–19837

Semantic similarity measures for formal concept analysis using linked data and WordNet

  • Yuncheng Jiang (corresponding author)
  • Mingxuan Yang
  • Rong Qu

Abstract

Formal Concept Analysis (FCA) is a field of applied mathematics with its roots in order theory, in particular the theory of complete lattices. It is not only a method for data analysis and knowledge representation, but also a formal formulation for concept formation and learning. Over the past 20 years, FCA has been widely studied. In this paper, we analyze current research progress on similarity measures in FCA and the problems that remain. To address the drawbacks of the existing methods, we propose novel semantic similarity measures for FCA that use Linked Data and WordNet. We aim to develop a method that is fully automatic, requires no predefined domain ontologies, and can be used independently of the domain in applications requiring semantic similarity measures in FCA. To realize semantic similarity estimation for FCA, we first extend the similarity assessment methods for resources (or entities) in Linked Data to the semantic case by using WordNet. Furthermore, we propose two kinds of semantic similarity measures (i.e., a context-free method and a context-aware method) for FCA concepts and concept lattices, respectively. Compared with the existing similarity measures in FCA, the proposed approach uses concepts from possibility theory to determine the lower and upper bounds of similarity intervals. Finally, we evaluate the proposed similarity assessment approaches by applying them to real-world datasets.
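
For readers new to FCA, the following minimal Python sketch may help make the objects of study concrete. It builds a toy formal context, enumerates its formal concepts (extent, intent pairs) with the two standard derivation operators, and compares two concepts by plain set overlap. The animal data, the function names, and the weighted Jaccard similarity are all assumptions chosen for illustration; this is a purely syntactic baseline and not the Linked Data and WordNet based measure proposed in the paper.

```python
from itertools import combinations

# Toy formal context (hypothetical data): object -> set of attributes.
CONTEXT = {
    "duck": {"flies", "swims", "lays_eggs"},
    "eagle": {"flies", "lays_eggs"},
    "frog": {"swims", "lays_eggs"},
    "dog": {"swims"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def common_attributes(objects):
    """Derivation operator A -> A': attributes shared by every object in A."""
    result = set(ATTRIBUTES)
    for obj in objects:
        result &= CONTEXT[obj]
    return frozenset(result)


def common_objects(attributes):
    """Derivation operator B -> B': objects that have every attribute in B."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every object subset.

    Brute force, exponential in the number of objects; fine for a toy context.
    """
    concepts = set()
    objects = list(CONTEXT)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset)
            extent = common_objects(intent)
            concepts.add((extent, intent))
    return concepts


def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def concept_similarity(c1, c2, weight=0.5):
    """Assumed baseline: weighted mix of extent overlap and intent overlap."""
    (e1, i1), (e2, i2) = c1, c2
    return weight * jaccard(e1, e2) + (1 - weight) * jaccard(i1, i2)


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

A set-overlap score like this treats attribute names as opaque strings, so two attributes with related meanings contribute nothing to the similarity. That limitation is the kind of gap a semantic measure is meant to close.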

Keywords

Semantic similarity; Linked data; WordNet; Possibility theory; Formal concept analysis

Notes

Acknowledgements

The authors would like to thank the anonymous referees for their valuable comments and suggestions, which greatly improved the exposition of the paper. The work described in this paper is supported by the National Natural Science Foundation of China under Grant Nos. 61772210 and 61272066; the Guangdong Province Universities Pearl River Scholar Funded Scheme (2018); the Project of Science and Technology in Guangzhou in China under Grant No. 201807010043; and the key project in universities in Guangdong Province of China under Grant No. 2016KZDXM024.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science, South China Normal University, Guangzhou, China
