Uncertainty in Data Integration Systems: Automatic Generation of Probabilistic Relationships

  • Sonia Bergamaschi
  • Laura Po
  • Serena Sorrentino
  • Alberto Corni
Conference paper


This paper proposes a method for the automatic discovery of probabilistic relationships in data integration systems. Dynamic data integration systems extend the architecture of current data integration systems by modeling uncertainty at their core. Our method is based on probabilistic word sense disambiguation (PWSD), which automatically performs lexical annotation (i.e., annotation with respect to a thesaurus or lexical resource) of the schemata of a given set of data sources to be integrated. From the annotated schemata and the relationships defined in the thesaurus, we derive the probabilistic lexical relationships among schema elements. Lexical relationships, together with structural relationships, are collected in the Probabilistic Common Thesaurus (PCT).
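The derivation step described in the abstract can be illustrated with a minimal sketch. Everything in it is hypothetical: the schema elements, the sense identifiers, the probabilities, the relationship labels, and in particular the rule that a derived relationship inherits the product of the two annotation probabilities, which is an assumption made for illustration and not necessarily the formula used in the paper.

```python
from itertools import product

# Hypothetical probabilistic annotations: each schema element maps to
# candidate senses with a probability (as a PWSD step might output).
ANNOTATIONS = {
    "Hotel.address": {"address#location": 0.7, "address#speech": 0.3},
    "Inn.location": {"location#place": 0.9, "location#film_set": 0.1},
}

# Hypothetical thesaurus: relationships defined between pairs of senses.
# The label "BT" (broader term) is illustrative only.
THESAURUS = {
    ("address#location", "location#place"): "BT",
}


def derive_relationships(annotations, thesaurus):
    """Derive probabilistic relationships among schema elements.

    For every pair of elements and every pair of their candidate senses,
    if the thesaurus relates the two senses, emit a relationship between
    the elements. ASSUMPTION: its probability is the product of the two
    annotation probabilities (the annotations are treated as independent).
    Only the (first element sense, second element sense) direction is
    looked up in the thesaurus.
    """
    derived = []
    elements = sorted(annotations)
    for i, first in enumerate(elements):
        for second in elements[i + 1:]:
            sense_pairs = product(
                annotations[first].items(), annotations[second].items()
            )
            for (sense_a, prob_a), (sense_b, prob_b) in sense_pairs:
                label = thesaurus.get((sense_a, sense_b))
                if label is not None:
                    derived.append((first, label, second, prob_a * prob_b))
    return derived


if __name__ == "__main__":
    for first, label, second, prob in derive_relationships(ANNOTATIONS, THESAURUS):
        print(f"{first} {label} {second} (p = {prob:.2f})")
```

In this sketch the collected tuples play the role of the lexical part of a probabilistic thesaurus; structural relationships would be added from the schemata themselves and are not modeled here.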


Description Logic; Automatic Discovery; Probabilistic Relationship; Lexical Resource; Data Integration System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





The work reported in this paper has been funded by the MUR FIRB Network Peer for Business project ( and by the IST FP6 STREP project 2006 STASIS (



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Sonia Bergamaschi (1)
  • Laura Po (1)
  • Serena Sorrentino (1)
  • Alberto Corni (1)
  1. Information Engineering Department, University of Modena and Reggio Emilia, Modena, Italy
