Face Detection and Object Recognition for a Retinal Prosthesis

Rollend, Derek; Rosendall, Paul; Billings, Seth; Burlina, Philippe; Wolfe, Kevin; Katyal, Kapil

doi:10.1007/978-3-319-54407-6_20

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10116)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

We describe the recent development of assistive computer vision algorithms for use with the Argus II retinal prosthesis system. While users of the prosthetic system can learn and adapt to the limited stimulation resolution, there exists great potential for computer vision algorithms to augment the experience and significantly increase the utility of the system for the user. To this end, our recent work has focused on helping with two different challenges encountered by the visually impaired: face detection and object recognition. In this paper, we describe algorithm implementations in both of these areas that make use of the retinal prosthesis for visual feedback to the user, and discuss the unique challenges faced in this domain.

Notes

1. The speech synthesis and recognition modules run asynchronously from the vision algorithm, and their computational demands are minimal compared to the CNN detection/tracking portion.

Acknowledgement

This work was supported by an Alfred E. Mann collaboration grant. We would also like to thank Arup Roy, Avi Caspi, and Robert Greenberg, our collaborators from Second Sight Medical Products.

Author information

Authors and Affiliations

The Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Road, Laurel, Maryland, USA
Derek Rollend, Paul Rosendall, Seth Billings, Philippe Burlina, Kevin Wolfe & Kapil Katyal

Corresponding author

Correspondence to Kapil Katyal.

Editor information

Editors and Affiliations

Institute of Information Science, Academia Sinica, Taipei, Taiwan
Chu-Song Chen
Tsinghua University, Beijing, China
Jiwen Lu
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
Kai-Kuang Ma

1 Electronic supplementary material

Supplementary material 1 (mp4 23697 KB)

About this paper

Cite this paper

Rollend, D., Rosendall, P., Billings, S., Burlina, P., Wolfe, K., Katyal, K. (2017). Face Detection and Object Recognition for a Retinal Prosthesis. In: Chen, CS., Lu, J., Ma, KK. (eds) Computer Vision – ACCV 2016 Workshops. ACCV 2016. Lecture Notes in Computer Science, vol 10116. Springer, Cham. https://doi.org/10.1007/978-3-319-54407-6_20

DOI: https://doi.org/10.1007/978-3-319-54407-6_20
Published: 15 March 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54406-9
Online ISBN: 978-3-319-54407-6
eBook Packages: Computer Science, Computer Science (R0)
