Abstract
Hadoop is a widely used open source distributed system for data storage and parallel computation. It is essential to ensure the security, authenticity, and integrity of all of Hadoop’s entities. Current secure implementations of Hadoop rely on Kerberos, which suffers from several security and performance issues, including a single point of failure, an online availability requirement, and concentration of authentication credentials. Most importantly, these solutions do not guard against malicious, privileged insiders. In this paper, we design and implement an authentication framework for Hadoop systems based on Trusted Platform Module (TPM) technologies. The proposed protocol not only overcomes the shortcomings of state-of-the-art protocols but also provides additional security guarantees against insider threats. We analyze and compare the security features and overhead of our protocol with those of state-of-the-art protocols, and show that our protocol provides better security guarantees with lower overhead.
Copyright information
© 2015 Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Cite this paper
Khalil, I., Dou, Z., Khreishah, A. (2015). TPM-Based Authentication Mechanism for Apache Hadoop. In: Tian, J., Jing, J., Srivatsa, M. (eds) International Conference on Security and Privacy in Communication Networks. SecureComm 2014. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 152. Springer, Cham. https://doi.org/10.1007/978-3-319-23829-6_8
DOI: https://doi.org/10.1007/978-3-319-23829-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23828-9
Online ISBN: 978-3-319-23829-6
eBook Packages: Computer Science (R0)