Abstract
Big data applications usually require flexible and scalable infrastructure for efficient processing. Cloud computing meets these requirements well and has been widely adopted to provide big data services. However, the outsourcing and resource-sharing features of cloud computing raise security concerns when applied to big data applications, e.g., the confidentiality of data and programs and the integrity of the processing procedure. On the other hand, when the cloud owns the data and provides analytic services, data privacy also becomes a challenge. These security concerns, together with the pressing demand for big data technology, motivate the development of a special class of security technologies for safe big data processing in cloud environments. These approaches fall roughly into two categories: designing new algorithms with unique security features, and developing security-enhanced systems to protect big data applications. In this chapter, we review approaches to secure big data processing from both categories, evaluate and compare these technologies from different perspectives, and present a general outlook on the current state of research and development in the field of security theories for big data.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Xu, L., Shi, W. (2016). Security Theories and Practices for Big Data. In: Yu, S., Guo, S. (eds) Big Data Concepts, Theories, and Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-27763-9_4
DOI: https://doi.org/10.1007/978-3-319-27763-9_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27761-5
Online ISBN: 978-3-319-27763-9
eBook Packages: Computer Science, Computer Science (R0)