Abstract
The main claim of this paper is that machine learning can help integrate the construction of ontologies and extraction grammars, bringing us closer to the Semantic Web vision. The proposed approach is a bootstrapping process that combines ontology learning and grammar learning in order to semi-automate knowledge acquisition. After a survey of the most relevant work towards this goal, the paper presents recent research of the Software and Knowledge Engineering Laboratory (SKEL) of NCSR “Demokritos” in the areas of Web information integration, information extraction, grammar induction and ontology enrichment. The paper concludes with a number of issues that need to be addressed in order to realize the advocated bootstrapping process.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Paliouras, G. (2005). On the Need to Bootstrap Ontology Learning with Extraction Grammar Learning. In: Dau, F., Mugnier, ML., Stumme, G. (eds) Conceptual Structures: Common Semantics for Sharing Knowledge. ICCS 2005. Lecture Notes in Computer Science, vol 3596. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11524564_8
DOI: https://doi.org/10.1007/11524564_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27783-5
Online ISBN: 978-3-540-31885-9
eBook Packages: Computer Science (R0)