Abstract
Generation-based fuzzing is a software testing approach that can discover a wide range of bugs and vulnerabilities in software. However, designing and fine-tuning classical fuzzers to achieve acceptable coverage is known to be very time-consuming, even for small-scale software systems. To address this issue, we investigate a machine-learning-based approach to fuzz testing in which we outline a family of test-case generators based on Recurrent Neural Networks (RNNs) and train them on readily available datasets with a minimum of human fine-tuning. In contrast to previous work, the proposed generators do not rely on heuristic sampling strategies but on principled sampling from the predictive distributions. We provide a detailed analysis demonstrating the characteristics and efficacy of the proposed generators in a challenging web-browser testing scenario. The empirical results show that the RNN-based generators achieve better coverage than a mutation-based method and discover paths not found by a classical fuzzer. Our results supplement findings in other domains, suggesting that generation-based fuzzing with RNNs is a viable route to better software quality, conditioned on the use of a suitable model selection/analysis procedure.
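The "principled sampling from the predictive distributions" mentioned in the abstract can be sketched as follows: at each step the next character is drawn from the model's full output distribution (no argmax, no heuristic pruning). Note this is only an illustrative sketch; `predict_next`, `vocab`, and the toy uniform model below are hypothetical stand-ins, not the paper's actual TensorFlow RNNs.

```python
import numpy as np

def sample_testcase(predict_next, vocab, seed, max_len=100, rng=None):
    """Generate one test case by sampling each next character from the
    model's predictive distribution over the vocabulary."""
    rng = rng or np.random.default_rng(0)
    out = list(seed)
    for _ in range(max_len - len(out)):
        probs = predict_next(out)            # distribution over vocab
        idx = rng.choice(len(vocab), p=probs)
        ch = vocab[idx]
        out.append(ch)
        if ch == "\0":                       # model-emitted end marker
            break
    return "".join(out).rstrip("\0")

# Toy stand-in for a trained RNN: uniform over a tiny HTML-like vocabulary.
vocab = list("<>ab/\0")

def toy_predict(prefix):
    return np.full(len(vocab), 1.0 / len(vocab))

case = sample_testcase(toy_predict, vocab, seed="<", max_len=20)
```

In practice `predict_next` would be the softmax output of a trained character-level RNN conditioned on the prefix generated so far.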
Notes
1. Code and data are available from https://github.com/susperius/icisc_rnnfuzz.
Acknowledgements
We gratefully acknowledge the support of NVIDIA Corporation with the provision of the GeForce 1080 Ti and the GeForce TITAN Xp used for this research. We would also like to thank Chris Schneider from NVIDIA for his ongoing interest in our research and his support.
Copyright information
© 2019 Springer Nature Switzerland AG
Sablotny, M., Jensen, B.S., Johnson, C.W. (2019). Recurrent Neural Networks for Fuzz Testing Web Browsers. In: Lee, K. (ed.) Information Security and Cryptology – ICISC 2018. Lecture Notes in Computer Science, vol. 11396. Springer, Cham. https://doi.org/10.1007/978-3-030-12146-4_22
Print ISBN: 978-3-030-12145-7
Online ISBN: 978-3-030-12146-4