Towards a Term Clustering Framework for Modular Ontology Learning

  • Ziwei Xu
  • Mounira Harzallah
  • Fabrice Guillet
  • Ryutaro Ichise
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1222)


This paper analyzes and adopts term clustering methods for building a modular ontology from its core ontology. Semantic knowledge is acquired from noun phrases that appear in the same syntactic role relative to a verb, or to a verb-preposition combination, in a sentence. A co-occurrence matrix built from these contexts defines the feature space of the noun phrases, which is then transformed into several encoded representations through feature selection and dimensionality reduction. Word embedding techniques are also presented as a feature representation. Each representation is clustered with K-Means, K-Medoids, Affinity Propagation, DBSCAN and co-clustering algorithms. Feature representation and clustering method are the two main components of a term clustering framework. Because of the randomness of the clustering approaches, runs are repeated to find the optimal parameters and to provide convincing values for evaluation. DBSCAN and Affinity Propagation prove particularly effective for term clustering, while the NMF encoding technique and the word embedding representation stand out for their promising feature compression.
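As a minimal sketch of the co-occurrence step described above, the following Python builds a noun phrase by context matrix, where a context is a (syntactic role, verb) pair, and compares terms by cosine similarity. The triples, the role labels `dobj` and `nsubj`, and the helper names `cooccurrence_matrix` and `cosine` are illustrative assumptions, not the authors' data or implementation.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy input: (noun phrase, syntactic role, verb) triples such as
# a dependency parser might produce. Illustrative only, not from the paper.
triples = [
    ("ontology", "dobj", "build"),
    ("taxonomy", "dobj", "build"),
    ("ontology", "dobj", "evaluate"),
    ("taxonomy", "dobj", "evaluate"),
    ("engineer", "nsubj", "build"),
    ("expert", "nsubj", "build"),
    ("engineer", "nsubj", "evaluate"),
    ("expert", "nsubj", "evaluate"),
]


def cooccurrence_matrix(triples):
    """Return (terms, contexts, matrix); matrix[i][j] counts how often
    term i occurs in context j, a context being a (role, verb) pair."""
    counts = Counter((term, (role, verb)) for term, role, verb in triples)
    terms = sorted({term for term, _ in counts})
    contexts = sorted({context for _, context in counts})
    # Counter returns 0 for unseen (term, context) pairs.
    matrix = [[counts[(term, context)] for context in contexts] for term in terms]
    return terms, contexts, matrix


def cosine(u, v):
    """Cosine similarity of two equal-length count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


terms, contexts, matrix = cooccurrence_matrix(triples)
row = dict(zip(terms, matrix))  # term -> feature vector over contexts

# Terms sharing syntactic contexts score high; terms with disjoint contexts score 0.
print(cosine(row["ontology"], row["taxonomy"]))
print(cosine(row["ontology"], row["engineer"]))
```

In this toy setting, noun phrases used as objects of the same verbs end up with identical rows, which is the signal a clustering algorithm would then group on. Encodings such as NMF or a learned embedding would replace the raw count rows before clustering.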


Keywords: Text mining · Feature extraction · Ontology learning · Term clustering



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. LS2N, Ecole Polytechnique de l’Université de Nantes, Nantes, France
  2. National Institute of Informatics, Tokyo, Japan
