AUGEN: An Ocular Support for Visually Impaired Using Deep Learning
Among the wide variety of technologies, mobile phone technology has become popular, and the use of mobile phone applications is increasing day by day. Most modern mobile phones can capture photographs. Visually impaired users can therefore capture images of their surroundings, from which descriptive sentences are generated and read aloud, giving them better knowledge of their environment. Because the content of an image is described to them automatically, they can avoid seeking help from people around them. Computer vision is the field concerned with extracting information from images and videos; tasks performed by the human visual system can be replicated with computer vision techniques. Visually impaired people can use these technologies to gain a better understanding of their surroundings.
Keywords: Convolutional neural networks · Recurrent neural network · Caption generation · Text to speech
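The pipeline implied by the keywords (a convolutional encoder producing image features, a recurrent decoder emitting a sentence word by word, and a text-to-speech stage reading it aloud) can be sketched as follows. This is a minimal illustration with random stand-in weights, a toy vocabulary, and a plain RNN cell rather than a trained LSTM; all names and dimensions are assumptions, not the paper's actual model. A real system would pass the generated sentence to a text-to-speech engine (e.g. a library such as pyttsx3, an assumption) to read it out.

```python
import numpy as np

# Toy vocabulary and dimensions -- stand-ins for a real trained model.
VOCAB = ["<start>", "<end>", "a", "person", "crossing", "the", "street"]
EMBED, HIDDEN, FEAT = 8, 16, 32

rng = np.random.default_rng(0)
W_img = rng.standard_normal((HIDDEN, FEAT)) * 0.1          # CNN features -> initial hidden state
W_emb = rng.standard_normal((len(VOCAB), EMBED)) * 0.1     # word embeddings
W_h = rng.standard_normal((HIDDEN, HIDDEN + EMBED)) * 0.1  # recurrent weights
W_out = rng.standard_normal((len(VOCAB), HIDDEN)) * 0.1    # hidden state -> vocabulary logits

def generate_caption(cnn_features, max_len=10):
    """Greedy decoding: condition the decoder on the image feature vector,
    then emit the most probable word at each step until <end>."""
    h = np.tanh(W_img @ cnn_features)      # image feature initializes the decoder
    word = VOCAB.index("<start>")
    caption = []
    for _ in range(max_len):
        x = np.concatenate([h, W_emb[word]])
        h = np.tanh(W_h @ x)               # simple RNN cell (an LSTM in practice)
        word = int(np.argmax(W_out @ h))   # greedy choice of the next word
        if VOCAB[word] == "<end>":
            break
        caption.append(VOCAB[word])
    return " ".join(caption)

features = rng.standard_normal(FEAT)       # stand-in for a CNN's output on a photo
print(generate_caption(features))
```

With trained weights, the same greedy loop would turn a camera frame into a sentence; beam search is a common refinement over the greedy argmax shown here.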