Data Integration Architectures and Methodology for the Life Sciences
Given a set of data sources, data integration is the process of creating an integrated resource that combines data from those sources, in order to allow queries and analyses that could not be supported by the individual data sources alone. Biological data sources are characterized by a high degree of heterogeneity in their data models, query interfaces and query processing capabilities, the data types they use, and the nomenclature adopted for actual data values. Coupled with the variety, complexity and volume of the biological data available, this heterogeneity makes the integration of biological data sources challenging, and a number of methodologies, architectures and systems have been developed to support it.
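As a minimal sketch of this definition, the Python fragment below combines two small sources that differ in field names and in the nomenclature of their values, then answers a query that neither source could answer alone. Everything in it is an assumption made for illustration: the records, the identifiers, the field names and the helper functions (`normalise_symbol`, `integrate`, `pathways_for_function`) are hypothetical and do not come from any real database or from the systems discussed in this entry.

```python
# Hypothetical source A: keyed by gene symbol, lower-case field names.
SOURCE_A = [
    {"gene": "geneA", "organism": "H. sapiens", "function": "kinase"},
    {"gene": "geneB", "organism": "H. sapiens", "function": "transporter"},
]

# Hypothetical source B: different field names and upper-case symbols.
SOURCE_B = [
    {"ACC": "X0001", "Symbol": "GENEA", "Pathway": "signalling"},
    {"ACC": "X0002", "Symbol": "GENEC", "Pathway": "metabolism"},
]


def normalise_symbol(symbol):
    """Reconcile nomenclature differences (here only case and whitespace)."""
    return symbol.strip().upper()


def integrate(source_a, source_b):
    """Build one integrated resource keyed by the normalised gene symbol."""
    integrated = {}
    for rec in source_a:
        key = normalise_symbol(rec["gene"])
        integrated.setdefault(key, {"symbol": key}).update(
            organism=rec["organism"], function=rec["function"]
        )
    for rec in source_b:
        key = normalise_symbol(rec["Symbol"])
        integrated.setdefault(key, {"symbol": key}).update(
            accession=rec["ACC"], pathway=rec["Pathway"]
        )
    return integrated


def pathways_for_function(integrated, function):
    """Cross-source query: pathways of genes with the given function."""
    return sorted(
        rec["pathway"]
        for rec in integrated.values()
        if rec.get("function") == function and "pathway" in rec
    )


integrated = integrate(SOURCE_A, SOURCE_B)
print(pathways_for_function(integrated, "kinase"))  # ['signalling']
```

The query needs the function annotation from one source and the pathway annotation from the other, so it can only be answered over the integrated resource. Real biological sources would need far richer reconciliation of identifiers and data models than the case-folding used here.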
If an application requires data from different data sources to be integrated in order to support users' queries and analyses, one possible solution is for the required data transformation and aggregation functionality to be encoded into the application's...