Quantifying the Limitations of Learning-Assisted Grammar-Based Fuzzing

  • Yuma Jitsunari
  • Yoshitaka Arahori (corresponding author)
  • Katsuhiko Gondow
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 926)


Grammar-based fuzzing is effective at finding vulnerabilities in input-parsing programs whose inputs are complex data conforming to a grammar. Traditional grammar-based fuzzing techniques require a manually written grammar to generate valid test inputs. However, writing an input grammar by hand has two major drawbacks: (1) it is costly and error-prone, and (2) it offers no means of steering generation toward interesting inputs that induce high test coverage (and thus expose many vulnerabilities). To address these problems, a state-of-the-art technique, Learn&Fuzz, automatically generates an input grammar via deep neural network-based statistical learning. Even Learn&Fuzz, however, has significant limitations; in particular, it cannot successfully generate (long) sequences of instructions (each consisting of an opcode plus zero or more operands), which contribute to high test coverage of instruction-interpreting code. In this paper, we focus on and quantify the limitations of current learning-assisted grammar-based fuzzing, i.e., how ineffective it is at generating instruction sequences that trigger high test coverage. Through experiments using a re-implementation of Learn&Fuzz and real instruction-interpreting code, we measure the test coverage of the target code when it is tested by Learn&Fuzz. Our experimental results show that the coverage is surprisingly low, and our analysis of the results opens up new research directions for enhancing learning-assisted grammar-based fuzzing.
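To make the notion of an instruction sequence concrete, the following is a minimal illustrative sketch of generating inputs from a hand-written grammar. It is not the authors' implementation and does not reproduce Learn&Fuzz, which learns its grammar with a neural network. The opcode table `TOY_OPCODES`, the operand range, and all function names are hypothetical, and `opcode_coverage` is only a crude stand-in for the test coverage measured in the paper.

```python
import random

# Hypothetical toy grammar: each opcode maps to its operand count.
# These opcodes are illustrative only and are not taken from the paper.
TOY_OPCODES = {"nop": 0, "push": 1, "add": 0, "jump": 1, "store": 2}


def generate_instruction(rng):
    """Derive one instruction: an opcode followed by its operands."""
    opcode = rng.choice(sorted(TOY_OPCODES))
    operands = [str(rng.randint(0, 255)) for _ in range(TOY_OPCODES[opcode])]
    return " ".join([opcode] + operands)


def generate_sequence(rng, length):
    """Derive a sequence of `length` instructions, one per line."""
    return "\n".join(generate_instruction(rng) for _ in range(length))


def is_valid(sequence):
    """Check that every line is a known opcode with the right operand count."""
    for line in sequence.splitlines():
        parts = line.split()
        if not parts or parts[0] not in TOY_OPCODES:
            return False
        if len(parts) - 1 != TOY_OPCODES[parts[0]]:
            return False
        if not all(p.isdigit() for p in parts[1:]):
            return False
    return True


def opcode_coverage(sequences):
    """Fraction of toy opcodes exercised; a crude stand-in for code coverage."""
    seen = {line.split()[0] for seq in sequences for line in seq.splitlines()}
    return len(seen & set(TOY_OPCODES)) / len(TOY_OPCODES)
```

For example, `generate_sequence(random.Random(0), 5)` yields five instructions that are well-formed by construction. A learned generator gives no such guarantee, and a single malformed instruction can invalidate the rest of the sequence, which is one reason long instruction sequences are hard to generate.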


Keywords: Grammar-based fuzzing, Grammar learning



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Yuma Jitsunari (1)
  • Yoshitaka Arahori (1, corresponding author)
  • Katsuhiko Gondow (1)
  1. Tokyo Institute of Technology, Tokyo, Japan
