Abstract
Big Data is an emerging area concerned with managing datasets whose size is beyond the ability of commonly used software tools to capture, process, and analyze in a timely way. The Big Data software market is growing at a 32% compound annual rate, almost four times faster than the ICT market as a whole, and the quantity of data to be analyzed is expected to double every two years.
Security and privacy are becoming urgent aspects of Big Data that need to be tackled. Indeed, users share more and more personal data and user-generated content through their mobile devices and computers with social networks and cloud services, losing control over their data and content, with a serious impact on their own privacy. Privacy has been the subject of serious debate recently, and many governments require data providers and companies to protect users’ sensitive data. To mitigate these problems, many solutions have been developed to provide data privacy but, unfortunately, they introduce some computational overhead when data is processed.
The goal of this paper is to quantitatively evaluate the performance and cost impact of multiple privacy protection mechanisms. A real industry case study concerning tax fraud detection has been considered. Many experiments have been performed to analyze the performance degradation and additional cost (required to provide a given service level) for running applications in a cloud system.
Acknowledgments
The results of this paper have been partially funded by EUBra-BIGSEA (GA no. 690116) funded by the European Commission under Horizon 2020 and the Ministério de Ciência, Tecnologia e Inovação, RNP/Brazil (grant GA0000000650/04).
Eugenio Gianniti is also partially supported by the DICE H2020 research project (GA no. 644869). Spark experiments have been supported by Microsoft under the Top Compsci University Azure Adoption program.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Kalwar, S., Gianniti, E., Kinouani, J.Y., Ridene, Y., Ardagna, D. (2018). Performance Degradation and Cost Impact Evaluation of Privacy Preserving Mechanisms in Big Data Systems. In: Balsamo, S., Marin, A., Vicario, E. (eds) New Frontiers in Quantitative Methods in Informatics. InfQ 2017. Communications in Computer and Information Science, vol 825. Springer, Cham. https://doi.org/10.1007/978-3-319-91632-3_7
DOI: https://doi.org/10.1007/978-3-319-91632-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91631-6
Online ISBN: 978-3-319-91632-3
eBook Packages: Computer Science, Computer Science (R0)