Abstract
Generation-based fuzzing is a software testing approach that can discover a wide range of bugs and vulnerabilities in software. However, designing and fine-tuning classical fuzzers to achieve acceptable coverage is known to be very time-consuming, even for small-scale software systems. To address this issue, we investigate a machine-learning-based approach to fuzz testing in which we outline a family of test-case generators based on Recurrent Neural Networks (RNNs) and train them on readily available datasets with a minimum of human fine-tuning. In contrast to previous work, the proposed generators do not rely on heuristic sampling strategies but on principled sampling from the predictive distributions. We provide a detailed analysis demonstrating the characteristics and efficacy of the proposed generators in a challenging web-browser testing scenario. The empirical results show that the RNN-based generators achieve better coverage than a mutation-based method and discover paths not found by a classical fuzzer. Our results supplement findings in other domains, suggesting that generation-based fuzzing with RNNs is a viable route to better software quality, conditioned on the use of a suitable model selection/analysis procedure.
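The "principled sampling from the predictive distributions" mentioned in the abstract can be sketched as follows: at each step the next character is drawn from the model's full output distribution (no argmax, no heuristic pruning). Note this is only an illustrative sketch; `predict_next`, `vocab`, and the toy uniform model below are hypothetical stand-ins, not the paper's actual TensorFlow RNNs.

```python
import numpy as np

def sample_testcase(predict_next, vocab, seed, max_len=100, rng=None):
    """Generate one test case by sampling each next character from the
    model's predictive distribution over the vocabulary."""
    rng = rng or np.random.default_rng(0)
    out = list(seed)
    for _ in range(max_len - len(out)):
        probs = predict_next(out)            # distribution over vocab
        idx = rng.choice(len(vocab), p=probs)
        ch = vocab[idx]
        out.append(ch)
        if ch == "\0":                       # model-emitted end marker
            break
    return "".join(out).rstrip("\0")

# Toy stand-in for a trained RNN: uniform over a tiny HTML-like vocabulary.
vocab = list("<>ab/\0")

def toy_predict(prefix):
    return np.full(len(vocab), 1.0 / len(vocab))

case = sample_testcase(toy_predict, vocab, seed="<", max_len=20)
```

In practice `predict_next` would be the softmax output of a trained character-level RNN conditioned on the prefix generated so far.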
Notes
1. Code and data are available from https://github.com/susperius/icisc_rnnfuzz.
Acknowledgements
We gratefully acknowledge the support of NVIDIA Corporation with the provision of the GeForce 1080 Ti and the GeForce TITAN Xp used for this research. We would also like to thank Chris Schneider from NVIDIA for his ongoing interest in our research and his support.
Copyright information
© 2019 Springer Nature Switzerland AG
Sablotny, M., Jensen, B.S., Johnson, C.W. (2019). Recurrent Neural Networks for Fuzz Testing Web Browsers. In: Lee, K. (ed.) Information Security and Cryptology – ICISC 2018. Lecture Notes in Computer Science, vol. 11396. Springer, Cham. https://doi.org/10.1007/978-3-030-12146-4_22
Print ISBN: 978-3-030-12145-7
Online ISBN: 978-3-030-12146-4