On Enriching User-Centered Data Integration Schemas in Service Lakes

  • Hiba Alili
  • Khalid Belhajjame
  • Daniela Grigori
  • Rim Drira
  • Henda Hajjami Ben Ghezala
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 288)

Abstract

In the Big Data era, companies are moving away from traditional data-warehouse solutions, which rely on expensive and time-consuming ETL (Extract-Transform-Load) processes, towards data lakes, which can be viewed as storage repositories holding vast amounts of raw data. In this paper, we consider the recurrent situation in which a user has a local dataset that is not sufficient for processing the queries of interest to them. In this context, we show how the data lake, or more specifically the service lake, since we focus on data-providing services, can be leveraged to enrich the local dataset with the concepts needed to process user queries. Furthermore, we present the algorithms we have developed for this purpose and demonstrate how our solution works using a case study.
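
To make the setting concrete, the following is a minimal sketch of the general idea only; it is not the algorithms developed in the paper. It assumes a simplified model in which each data-providing service in the lake is described by the set of concepts it requires as input and the set of concepts it returns. The names `Service` and `enrich_schema`, and the example concepts, are hypothetical. The sketch repeatedly adds any service that can be called with the concepts already available, until the concepts needed by the query are covered or no further service applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical description of a data-providing service in the lake."""
    name: str
    inputs: frozenset   # concepts the service needs in order to be called
    outputs: frozenset  # concepts the service returns


def enrich_schema(local_concepts, query_concepts, lake):
    """Return (selected services, query concepts still missing).

    Assumption: a service is callable once all of its inputs are available,
    either from the local schema or from a previously selected service.
    """
    available = set(local_concepts)
    missing = set(query_concepts) - available
    selected = []
    progress = True
    while missing and progress:
        progress = False
        for svc in lake:
            if not missing:
                break
            if svc in selected:
                continue
            # Callable with what we have, and contributes at least one new concept.
            if svc.inputs <= available and svc.outputs - available:
                selected.append(svc)
                available |= svc.outputs
                missing -= svc.outputs
                progress = True
    return selected, missing


# Illustrative data only: a local schema with one concept and a three-service lake.
lake = [
    Service("weather", frozenset({"coordinates"}), frozenset({"temperature"})),
    Service("geo", frozenset({"city"}), frozenset({"coordinates", "population"})),
    Service("stock", frozenset({"ticker"}), frozenset({"price"})),
]
selected, missing = enrich_schema({"city"}, {"city", "temperature", "population"}, lake)
print([s.name for s in selected], missing)
```

Note that this simple forward-chaining strategy may select services that are callable but irrelevant to the query; a more careful approach would prune them or search backwards from the missing concepts.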

Keywords

User-centric data integration · Data provisioning service lakes · Schema enriching

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Hiba Alili (1, 2)
  • Khalid Belhajjame (1)
  • Daniela Grigori (1)
  • Rim Drira (2)
  • Henda Hajjami Ben Ghezala (2)

  1. Paris-Dauphine University, PSL Research University, CNRS, [UMR 7243], LAMSADE, Paris, France
  2. National School of Computer Sciences, University of Manouba, RIADI, Manouba, Tunisia
