On Enriching User-Centered Data Integration Schemas in Service Lakes

  • Conference paper
Business Information Systems (BIS 2017)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 288)

Abstract

In the Big Data era, companies are moving away from traditional data-warehouse solutions, which rely on expensive and time-consuming ETL (Extract-Transform-Load) processes, towards data lakes, which can be viewed as storage repositories holding vast amounts of raw data. In this paper, we consider the recurrent situation in which a user has a local dataset that is not sufficient for processing the queries of interest to them. In this context, we show how the data lake, or more specifically the service lake, since we focus on data-providing services, can be leveraged to enrich the local dataset with concepts that support the processing of user queries. Furthermore, we present the algorithms we have developed for this purpose and demonstrate how our solution works using a case study.

Corresponding author

Correspondence to Hiba Alili.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Alili, H., Belhajjame, K., Grigori, D., Drira, R., Ghezala, H.H.B. (2017). On Enriching User-Centered Data Integration Schemas in Service Lakes. In: Abramowicz, W. (eds) Business Information Systems. BIS 2017. Lecture Notes in Business Information Processing, vol 288. Springer, Cham. https://doi.org/10.1007/978-3-319-59336-4_1

  • DOI: https://doi.org/10.1007/978-3-319-59336-4_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59335-7

  • Online ISBN: 978-3-319-59336-4

  • eBook Packages: Computer Science, Computer Science (R0)
