Extraction of informations from highly heterogeneous source of textual data

  • Invited Papers
  • Conference paper
Cooperative Information Agents (CIA 1997)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1202)

Abstract

Extracting information from multiple, highly heterogeneous sources of textual data and integrating it in order to provide true information is a challenging research topic in the database area. To illustrate the problems and their solutions, TSIMMIS, one of the most interesting projects addressing this problem, is presented. Furthermore, a Description Logics approach that offers interesting solutions for both data integration and data querying is introduced.

This research has been partially funded by the MURST 40% Italian Project "Basi di dati Evolute: Modelli, metodi e sistemi".

Editor information

Peter Kandzia, Matthias Klusch

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bergamaschi, S. (1997). Extraction of informations from highly heterogeneous source of textual data. In: Kandzia, P., Klusch, M. (eds) Cooperative Information Agents. CIA 1997. Lecture Notes in Computer Science, vol 1202. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62591-7_23

  • DOI: https://doi.org/10.1007/3-540-62591-7_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-62591-9

  • Online ISBN: 978-3-540-68321-6

  • eBook Packages: Springer Book Archive
