Abstract
This paper analyzes and adopts term clustering methods for building a modular ontology guided by its core ontology. The acquisition of semantic knowledge focuses on noun phrases that appear in the same syntactic role in relation to a verb, or to a verb-preposition combination, in a sentence. Constructing a co-occurrence matrix from these contexts yields a feature space for the noun phrases, which is then transformed into several encoded representations through feature selection and dimensionality reduction. In addition, word embedding techniques are presented as an alternative feature representation. Each representation is clustered with the K-Means, K-Medoids, Affinity Propagation, DBSCAN and co-clustering algorithms. Feature representation and clustering method together constitute the two major components of a term clustering framework. Because clustering approaches involve randomness, each configuration is run repeatedly to find the optimal parameters and to provide reliable values for evaluation. DBSCAN and Affinity Propagation prove the most effective for term clustering, while the NMF encoding and the word embedding representation stand out for their promising capacity for feature compression.
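The co-occurrence step described in the abstract can be illustrated with a minimal sketch. It assumes that a parser has already produced (noun phrase, syntactic role, verb) triples; the triples, the function names `build_cooccurrence` and `cosine`, and the choice of cosine similarity are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical (noun phrase, syntactic role, verb) triples, standing in
# for the output of a dependency parser. They are not data from the paper.
triples = [
    ("court", "subj", "rule"),
    ("judge", "subj", "rule"),
    ("judge", "subj", "sentence"),
    ("court", "subj", "sentence"),
    ("contract", "obj", "sign"),
    ("agreement", "obj", "sign"),
    ("agreement", "obj", "terminate"),
    ("contract", "obj", "terminate"),
    ("contract", "obj", "rule"),
]


def build_cooccurrence(triples):
    """Return a sparse matrix: noun phrase -> Counter over (role, verb) contexts."""
    rows = defaultdict(Counter)
    for noun_phrase, role, verb in triples:
        rows[noun_phrase][(role, verb)] += 1
    return rows


def cosine(a, b):
    """Cosine similarity between two sparse context-count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


matrix = build_cooccurrence(triples)
print(cosine(matrix["court"], matrix["judge"]))     # identical contexts
print(cosine(matrix["court"], matrix["contract"]))  # no shared contexts
```

Each row of `matrix` is the feature vector of one noun phrase. In a fuller pipeline these rows would be passed to a dimensionality reduction step and then to a clustering algorithm; the sketch stops at pairwise similarity to stay self-contained.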
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Xu, Z., Harzallah, M., Guillet, F., Ichise, R. (2020). Towards a Term Clustering Framework for Modular Ontology Learning. In: Fred, A., Salgado, A., Aveiro, D., Dietz, J., Bernardino, J., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2018. Communications in Computer and Information Science, vol 1222. Springer, Cham. https://doi.org/10.1007/978-3-030-49559-6_9
DOI: https://doi.org/10.1007/978-3-030-49559-6_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49558-9
Online ISBN: 978-3-030-49559-6
eBook Packages: Computer Science, Computer Science (R0)