Using Deep Learning to Generate Relational HoneyData

Abay, Nazmiye Ceren; Akcora, Cuneyt Gurcan; Zhou, Yan; Kantarcioglu, Murat; Thuraisingham, Bhavani

doi:10.1007/978-3-030-02110-8_1



The original version of this chapter was revised: Chapter authors have been added. The correction to this chapter is available at https://doi.org/10.1007/978-3-030-02110-8_12

Abstract

Although there has been a plethora of work on generating deceptive applications, generating deceptive data that can easily fool attackers has received very little attention. In this book chapter, we discuss our secure deceptive data generation framework, which makes it hard for an attacker to distinguish between real and deceptive data. In particular, we discuss how to generate such deceptive data using deep learning and differential privacy techniques. In addition, we discuss our formal evaluation framework.


Change history

01 February 2020
This book was inadvertently published as an authored work, with the chapter authors mentioned only in the footnotes of the chapter opening pages. This has now been corrected, and the chapter authors are listed on the respective chapter opening pages.

Notes

1. https://www.cnet.com/news/linkedin-confirms-passwords-were-compromised/


Author information

Authors and Affiliations

The University of Texas at Dallas, Richardson, TX, USA
Nazmiye Ceren Abay, Cuneyt Gurcan Akcora, Yan Zhou, Murat Kantarcioglu & Bhavani Thuraisingham


Corresponding author

Correspondence to Murat Kantarcioglu.

Editor information

Editors and Affiliations

Department of Software & Information System, University of North Carolina Charlotte, Charlotte, NC, USA
Ehab Al-Shaer
Department of Software and Information System, University of North Carolina, Charlotte, NC, USA
Jinpeng Wei
Computer Science Department, University of Texas at Dallas, Richardson, TX, USA
Kevin W. Hamlen
Computing and Information Science Division, Army Research Office, Durham, NC, USA
Cliff Wang


About this chapter

Cite this chapter

Abay, N.C., Akcora, C.G., Zhou, Y., Kantarcioglu, M., Thuraisingham, B. (2019). Using Deep Learning to Generate Relational HoneyData. In: Al-Shaer, E., Wei, J., Hamlen, K., Wang, C. (eds) Autonomous Cyber Deception. Springer, Cham. https://doi.org/10.1007/978-3-030-02110-8_1


DOI: https://doi.org/10.1007/978-3-030-02110-8_1
Published: 26 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02109-2
Online ISBN: 978-3-030-02110-8
eBook Packages: Computer Science, Computer Science (R0)
