Creating and Querying an Integrated Ontology for Molecular and Phenotypic Cereals Data

  • Sonia Bergamaschi
  • Antonio Sala

In this paper we describe the development of an ontology of molecular and phenotypic cereals data, realized by integrating existing public web databases with the database developed by the research group of the CEREALAB project (www.cerealab.org). This integration is obtained using the MOMIS system (Mediator envirOnment for Multiple Information Sources), a mediator-based data integration system developed by the Database Group of the University of Modena and Reggio Emilia (www.dbgroup.unimo.it). MOMIS performs information extraction and integration from both structured and semi-structured data sources in a semi-automatic way. Integration exploits the knowledge in a Common Thesaurus (defined by the framework) and the descriptions of the source schemas, using a combination of clustering and Description Logics techniques. The result of the integration process is a Global Virtual Schema (GVV) of the underlying data sources, for which mapping rules and integrity constraints are specified to handle heterogeneity. Each GVV element is annotated with respect to the WordNet lexical database (wordnet.princeton.edu). The GVV can be queried through an easy-to-use graphical interface, transparently with respect to the integrated data sources and regardless of the specific query languages of the source databases.
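The idea of querying a global schema through mapping rules can be illustrated with a minimal Python sketch. It is not the MOMIS implementation: the source names, attribute names, sample records, and the `query_global` helper are all hypothetical, and the mapping rules are reduced to simple attribute renamings.

```python
# Illustrative sketch only: two hypothetical sources that store the same
# kind of cereal data under different local attribute names.
SOURCE_A = [
    {"cultivar": "cv1", "trait": "plant height", "marker": "m1"},
    {"cultivar": "cv2", "trait": "grain yield", "marker": "m2"},
]
SOURCE_B = [
    {"variety_name": "cv3", "phenotype": "plant height", "locus": "m3"},
]
SOURCES = {"source_a": SOURCE_A, "source_b": SOURCE_B}

# Mapping rules (assumed form): global attribute -> local attribute, per source.
MAPPINGS = {
    "source_a": {"variety": "cultivar", "trait": "trait", "marker": "marker"},
    "source_b": {"variety": "variety_name", "trait": "phenotype", "marker": "locus"},
}


def query_global(select, where=None):
    """Answer a query phrased in global attribute names.

    The query is rewritten for each source via its mapping, evaluated
    locally, and the partial results are merged into one list.
    """
    where = where or {}
    results = []
    for name, rows in SOURCES.items():
        mapping = MAPPINGS[name]
        for row in rows:
            # Translate each global condition to the local attribute name.
            if all(row.get(mapping[g]) == v for g, v in where.items()):
                record = {g: row.get(mapping[g]) for g in select}
                record["source"] = name  # keep provenance of each answer
                results.append(record)
    return results


rows = query_global(["variety", "marker"], {"trait": "plant height"})
```

The caller names only global attributes (`variety`, `trait`, `marker`) and never needs to know that one source calls them `cultivar` or `locus`; this is the sense in which the global schema is queried transparently with respect to the sources.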

Keywords

Phenotypic Data, Global Schema, Integrity Constraint, Data Integration System, Source Schema
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Science+Business Media, LLC 2009
