Query Adaptation and Privacy for Real-Time Business Intelligence
This paper (extended abstract) discusses several technical challenges and issues that need special attention when dealing with real-time business intelligence (RTBI) systems. While most contributions to previous BIRTE Workshops focused on (database) technology, this extended abstract takes a more holistic view by covering both technical and non-technical aspects. First, we introduce and discuss two real-world applications and derive from them technical and non-technical requirements, which turn out to be quite diverse in the context of real-time business intelligence. Based on those requirements and on our experience in developing the Stratosphere database management system, we outline our existing and planned approaches to query adaptation and statistics building that are about to be implemented in Stratosphere to support RTBI.
In the second part of the extended abstract, we discuss important aspects of privacy when dealing with personal data and outline the requirements that real-time business intelligence systems must meet to protect people's privacy (to some extent). It will become apparent that there is often a trade-off between the level of privacy and the utility expected by those who perform real-time business analytics.
Keywords: Real-time business intelligence, Big data, Map Reduce paradigm, Query optimization, Privacy, k-anonymity, Adversary knowledge
- 1. Stratosphere. http://www.stratosphere.eu. Accessed Dec 2013
- 2. Tazaldoo/tame. http://www.tame.it. Accessed Dec 2013
- 3. Hadoop. http://hadoop.apache.org/. Accessed Dec 2013
- 4. Renew London. http://renewlondon.com. Accessed Dec 2013
- 5. PresenceOrb. http://www.presenceorb.com/. Accessed Dec 2013
- 6. Hueske, F., Peters, M., Sax, M., Rheinländer, A., Bergmann, R., Krettek, A., Tzoumas, K.: Opening the black boxes in data flow optimization. PVLDB 5(11), 1256–1267 (2012)
- 8. Rundensteiner, E.A., Ding, L., Sutherland, T.M., Zhu, Y., Pielech, B., Mehta, N.: CAPE: continuous query engine with heterogeneous-grained adaptivity. In: VLDB, Proceedings of the Thirtieth International Conference on Very Large Data Bases, Toronto, Canada, pp. 1353–1356 (2004)
- 9. Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information (abstract). In: PODS 1998, p. 188 (1998)
- 11. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: OSDI 2004, pp. 137–150 (2004)
- 12. Dölle, L.: Detecting privacy breaches when answering a sequence of queries. Ph.D. thesis (in German), Humboldt-Universität zu Berlin (2014)
- 13. Bergmann, R.: Gathering statistics for query adaptation. Ph.D. thesis (in German), Humboldt-Universität zu Berlin (2014)