Confidence Arguments for Evidence of Performance in Machine Learning for Highly Automated Driving Functions

  • Simon Burton
  • Lydia Gauerhof
  • Bibhuti Bhusan Sethy
  • Ibrahim Habli
  • Richard Hawkins
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11699)


Because they can efficiently process unstructured, high-dimensional input data, machine learning algorithms are being applied to perception tasks for highly automated driving functions. The consequences of failures and insufficiencies in such algorithms are severe, so a convincing assurance case that the algorithms meet certain safety requirements is required. However, demonstrating the performance of such algorithms is non-trivial, and no consensus has yet formed on an appropriate set of verification measures. This paper provides a framework for reasoning about the contribution of performance evidence to the assurance case for machine learning in an automated driving context and applies the evaluation criteria to a pedestrian recognition case study.


Highly automated driving · Machine learning · Safety · Assurance



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Simon Burton (1, 3)
  • Lydia Gauerhof (2), corresponding author
  • Bibhuti Bhusan Sethy (2)
  • Ibrahim Habli (3)
  • Richard Hawkins (3)

  1. Systems Engineering Vehicle, Robert Bosch GmbH, Ludwigsburg, Germany
  2. Corporate Research, Robert Bosch GmbH, Renningen, Germany
  3. Assuring Autonomy International Programme, The University of York, York, UK
