Modeling Camera Effects to Improve Visual Learning from Synthetic Data

  • Alexandra Carlson
  • Katherine A. Skinner
  • Ram Vasudevan
  • Matthew Johnson-Roberson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11129)

Abstract

Recent work has focused on generating synthetic imagery to increase the size and variability of training data for learning visual tasks in urban scenes. This includes increasing the occurrence of occlusions or varying environmental and weather effects. However, few efforts have addressed modeling variation in the sensor domain. Sensor effects can degrade real images, limiting how well networks trained on synthetic data generalize when tested in real environments. This paper proposes an efficient, automatic, physically-based augmentation pipeline that varies sensor effects (chromatic aberration, blur, exposure, noise, and color temperature) in synthetic imagery. In particular, this paper shows that augmenting synthetic training datasets with the proposed pipeline reduces the domain gap between synthetic and real data for the task of object detection in urban driving scenes.
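The snippet below is a minimal, illustrative sketch of the kind of sensor-effect augmentation the abstract describes, applied to an RGB image with values in [0, 1]: channel misalignment as a crude stand-in for chromatic aberration, Gaussian blur, a global exposure gain, Poisson-Gaussian style noise, and per-channel gains for color temperature. Function names, parameter ranges, and the specific effect models are assumptions for illustration only, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def chromatic_aberration(img, max_shift=2, rng=np.random):
    """Shift the R and B channels a few pixels relative to G (crude lateral CA)."""
    out = img.copy()
    for c in (0, 2):
        dy, dx = rng.randint(-max_shift, max_shift + 1, size=2)
        out[..., c] = np.roll(np.roll(img[..., c], dy, axis=0), dx, axis=1)
    return out

def apply_sensor_effects(img, rng=np.random):
    """Apply randomized sensor effects to an HxWx3 float image in [0, 1]."""
    img = chromatic_aberration(img, rng=rng)
    # Defocus-style blur: smooth the spatial axes only, leave channels untouched.
    img = gaussian_filter(img, sigma=(rng.uniform(0.0, 1.5),) * 2 + (0.0,))
    # Exposure: a simple global gain (over- or under-exposure).
    img = np.clip(img * rng.uniform(0.7, 1.3), 0.0, 1.0)
    # Poisson-Gaussian style noise: signal-dependent plus constant component.
    noise = rng.normal(0.0, 0.01, img.shape) * np.sqrt(img + 1e-3)
    img = np.clip(img + noise, 0.0, 1.0)
    # Color temperature: warm/cool shift via per-channel gains on R and B.
    gains = np.array([rng.uniform(0.9, 1.1), 1.0, rng.uniform(0.9, 1.1)])
    return np.clip(img * gains, 0.0, 1.0)

# Example: augment a synthetic frame before it is fed to the detector.
frame = np.random.rand(256, 512, 3)   # stand-in for a rendered image
augmented = apply_sensor_effects(frame)
```

In a pipeline like the one described, such transformations would be sampled per image at training time so the detector sees a range of sensor degradations rather than only clean rendered frames.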

Keywords

Deep learning · Image augmentation · Object detection

Acknowledgements

This work was supported by a grant from Ford Motor Company via the Ford-UM Alliance under award N022884, and by the National Science Foundation under Grant No. 1452793.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Alexandra Carlson (1)
  • Katherine A. Skinner (1)
  • Ram Vasudevan (1)
  • Matthew Johnson-Roberson (1)

  1. University of Michigan, Ann Arbor, USA