Skip to main content

Recurrent Neural Networks for Fuzz Testing Web Browsers

  • Conference paper
  • First Online:
Information Security and Cryptology – ICISC 2018 (ICISC 2018)

Abstract

Generation-based fuzzing is a software testing approach which is able to discover different types of bugs and vulnerabilities in software. It is, however, known to be very time consuming to design and fine tune classical fuzzers to achieve acceptable coverage, even for small-scale software systems. To address this issue, we investigate a machine learning-based approach to fuzz testing in which we outline a family of test-case generators based on Recurrent Neural Networks (RNNs) and train those on readily available datasets with a minimum of human fine tuning. The proposed generators do, in contrast to previous work, not rely on heuristic sampling strategies but principled sampling from the predictive distributions. We provide a detailed analysis to demonstrate the characteristics and efficacy of the proposed generators in a challenging web browser testing scenario. The empirical results show that the RNN-based generators are able to provide better coverage than a mutation based method and are able to discover paths not discovered by a classical fuzzer. Our results supplement findings in other domains suggesting that generation based fuzzing with RNNs is a viable route to better software quality conditioned on the use of a suitable model selection/analysis procedure.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    Code and data is available from https://github.com/susperius/icisc_rnnfuzz.

References

  1. Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems (2015). http://tensorflow.org/

  2. Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)

  3. Balog, M., Gaunt, A.L., Brockschmidt, M., Nowozin, S., Tarlow, D.: Deepcoder: learning to write programs. arXiv preprint arXiv:1611.01989 (2016)

  4. Bengio, Y., Simard, P., Frasconi, P.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5(2), 157–166 (1994)

    Article  Google Scholar 

  5. Böhme, M., Pham, V., Roychoudhury, A.: Coverage-based Greybox Fuzzing as Markov Chain. IEEE Trans. Softw. Eng., 1 (2018). https://doi.org/10.1109/TSE.2017.2785841. ISSN 0098-5589

  6. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014)

  7. Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555 (2014)

  8. DeMott, J.: The evolving art of fuzzing. DEF CON 14 (2006)

    Google Scholar 

  9. DynamoRIO: Dynamorio, June 2017. http://dynamorio.org/

  10. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of The Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)

    Google Scholar 

  11. Godefroid, P., Peleg, H., Singh, R.: Learn&fuzz: machine learning for input fuzzing. In: Automated Software Engineering (ASE 2017) (2017)

    Google Scholar 

  12. Google: Using clusterfuzz. http://dev.chromium.org/Home/chromium-security/bugs/using-clusterfuzz

  13. Hochreiter, S.: Untersuchungen zu dynamischen neuronalen netzen. Diploma Technische Universität München 91 (1991)

    Google Scholar 

  14. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

    Article  Google Scholar 

  15. Höschele, M., Zeller, A.: Mining input grammars from dynamic taints. In: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, pp. 720–725. ACM (2016)

    Google Scholar 

  16. Postel, J., Reynolds, J.: File transfer protocol. Technical report, October 1985. https://tools.ietf.org/html/rfc959

  17. Jozefowicz, R., Zaremba, W., Sutskever, I.: An empirical exploration of recurrent network architectures. In: International Conference on Machine Learning, pp. 2342–2350 (2015)

    Google Scholar 

  18. Kingma, D., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  19. Mozilla Corporation: Firefox, August 2018. https://www.mozilla.org/en-US/firefox/

  20. Oehlert, P.: Violating assumptions with fuzzing. IEEE Secur. Priv. 3(2), 58–62 (2005)

    Article  Google Scholar 

  21. Pascanu, R., Gulcehre, C., Cho, K., Bengio, Y.: How to construct deep recurrent neural networks. arXiv preprint arXiv:1312.6026 (2013)

  22. Pradel, M., Sen, K.: Deep learning to find bugs (2017)

    Google Scholar 

  23. Rawat, S., Jain, V., Kumar, A., Cojocar, L., Giuffrida, C., Bos, H.: Vuzzer: application-aware evolutionary fuzzing. In: Proceedings of the Network and Distributed System Security Symposium (NDSS) (2017)

    Google Scholar 

  24. Sablotny, M.: Pyfuzz2 - fuzzing framework (2017). https://github.com/susperius/PyFuzz2

  25. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

    MathSciNet  MATH  Google Scholar 

  26. Sutskever, I., Martens, J., Hinton, G.E.: Generating text with recurrent neural networks. In: Proceedings of the 28th International Conference on Machine Learning (ICML 2011), pp. 1017–1024 (2011)

    Google Scholar 

  27. Sutton, M., Greene, A., Amini, P.: Fuzzing: Brute Force Vulnerability Discovery. Pearson Education (2007)

    Google Scholar 

  28. Zalewski, M.: American fuzzy lop (2017). http://lcamtuf.coredump.cx/afl/

Download references

Acknowledgements

We gratefully acknowledge the support of NVIDIA Corporation with the provision of the GeForce 1080 Ti and the GeForce TITAN Xp used for this research. We also like to thank Chris Schneider from NVIDIA for his ongoing interest in our research and his support.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Martin Sablotny .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Sablotny, M., Jensen, B.S., Johnson, C.W. (2019). Recurrent Neural Networks for Fuzz Testing Web Browsers. In: Lee, K. (eds) Information Security and Cryptology – ICISC 2018. ICISC 2018. Lecture Notes in Computer Science(), vol 11396. Springer, Cham. https://doi.org/10.1007/978-3-030-12146-4_22

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-12146-4_22

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-12145-7

  • Online ISBN: 978-3-030-12146-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics