Integration of Biological Data and Quality-Driven Source Negotiation

  • Conference paper
Conceptual Modeling — ER 2001 (ER 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2224)

Abstract

Evaluating data non-quality in database or data warehouse systems is a preliminary stage before any data usage and analysis, particularly in data integration, where several sources provide more or less redundant or contradictory information items whose quality is often unknown, imprecise, and very heterogeneous. Our application domain is bioinformatics, where more than five hundred semi-structured databanks offer biological information without any quality information (i.e., metadata and statistics describing the production and management of the biological data). To facilitate multi-source data integration in the context of distributed biological databanks, we propose a technique based on the concepts of quality contract and data source negotiation for a standard wrapper-mediator architecture. A quality source contract specifies the quality dimensions the mediator needs in order to extract data from several distributed resources. Source selection is computed dynamically through contract negotiation, which we propose to include in mediation and global query processing before data acquisition. The integration of the multi-source biological data is deferred to the restitution and combination of the results of the user's global query, using data recommendation techniques that take source quality requirements into account.
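
The abstract does not spell out the contract format or the negotiation protocol, so the following is only a minimal sketch of how quality-driven source selection might look. It assumes that a contract lists a minimum acceptable score and a weight per quality dimension, and that each wrapper reports scores normalized to [0, 1]. Every name here (`QualityContract`, `SourceOffer`, `negotiate`, the dimensions `freshness` and `completeness`, and the databank names) is hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityContract:
    """Hypothetical contract: a minimum score and a weight per quality dimension."""
    minimums: dict[str, float]
    weights: dict[str, float]


@dataclass(frozen=True)
class SourceOffer:
    """Hypothetical quality metadata that a wrapper reports for its databank."""
    name: str
    scores: dict[str, float]  # assumed normalized to [0, 1]


def satisfies(offer: SourceOffer, contract: QualityContract) -> bool:
    """True if the offer meets every minimum; a missing dimension counts as 0."""
    return all(
        offer.scores.get(dim, 0.0) >= floor
        for dim, floor in contract.minimums.items()
    )


def weighted_score(offer: SourceOffer, contract: QualityContract) -> float:
    """Weighted average of the offer's scores over the contract's dimensions."""
    total = sum(contract.weights.values())
    if total == 0:
        return 0.0
    weighted = sum(
        weight * offer.scores.get(dim, 0.0)
        for dim, weight in contract.weights.items()
    )
    return weighted / total


def negotiate(offers: list[SourceOffer], contract: QualityContract) -> list[SourceOffer]:
    """Keep the sources that satisfy the contract, best weighted score first."""
    accepted = [o for o in offers if satisfies(o, contract)]
    return sorted(accepted, key=lambda o: weighted_score(o, contract), reverse=True)


# Illustrative data only: three made-up databanks and one made-up contract.
contract = QualityContract(
    minimums={"freshness": 0.5, "completeness": 0.6},
    weights={"freshness": 1.0, "completeness": 2.0},
)
offers = [
    SourceOffer("bank_a", {"freshness": 0.9, "completeness": 0.7}),
    SourceOffer("bank_b", {"freshness": 0.4, "completeness": 0.95}),
    SourceOffer("bank_c", {"freshness": 0.6, "completeness": 0.9}),
]
selected = [offer.name for offer in negotiate(offers, contract)]
```

In this sketch the mediator would run `negotiate` before sending any sub-query, so that only the accepted sources are contacted for data acquisition. A real negotiation could also relax or counter-propose thresholds, which this one-shot filter does not attempt.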

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Berti-Equille, L. (2001). Integration of Biological Data and Quality-Driven Source Negotiation. In: Kunii, H.S., Jajodia, S., Sølvberg, A. (eds) Conceptual Modeling — ER 2001. ER 2001. Lecture Notes in Computer Science, vol 2224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45581-7_20

  • DOI: https://doi.org/10.1007/3-540-45581-7_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42866-4

  • Online ISBN: 978-3-540-45581-3

  • eBook Packages: Springer Book Archive
