On the Need to Bootstrap Ontology Learning with Extraction Grammar Learning

  • Conference paper
Conceptual Structures: Common Semantics for Sharing Knowledge (ICCS 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3596)

Abstract

The main claim of this paper is that machine learning can help integrate the construction of ontologies and extraction grammars, bringing us closer to the Semantic Web vision. The proposed approach is a bootstrapping process that combines ontology learning and grammar learning to semi-automate knowledge acquisition. After a survey of the most relevant work towards this goal, the paper presents recent research of the Software and Knowledge Engineering Laboratory (SKEL) of NCSR “Demokritos” in the areas of Web information integration, information extraction, grammar induction and ontology enrichment. It concludes with a number of open issues that need to be addressed in order to realize the advocated bootstrapping process.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Paliouras, G. (2005). On the Need to Bootstrap Ontology Learning with Extraction Grammar Learning. In: Dau, F., Mugnier, ML., Stumme, G. (eds) Conceptual Structures: Common Semantics for Sharing Knowledge. ICCS 2005. Lecture Notes in Computer Science, vol 3596. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11524564_8

  • DOI: https://doi.org/10.1007/11524564_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27783-5

  • Online ISBN: 978-3-540-31885-9

  • eBook Packages: Computer Science (R0)
