Mobile AR Solution for Deaf People

Correlation Between Face Detection and Speech Recognition

  • Conference paper
Mobile Web and Intelligent Information Systems (MobiWIS 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11673)

Abstract

Over the last two years, the authors’ research has focused on designing a smart solution that compensates for hearing or visual deficiencies using the Google Glass hardware device and their own software architecture. This paper presents a solution aimed at deaf people and people with hearing impairment. It opens with a brief explanation of the architecture of the designed solution and a description of the user interface on Google Glass. Related work on face detection, visual activity detection and speech recognition is then presented, covering a range of possible approaches in these research areas. The principle of the solution lies in combining face detection with the subsequent assignment of recognized speech to the mouth of the correct face in the image. The aim is to capture ambient sound digitally and, based on its evaluation using neural networks, to present the detected speech as text assigned to a face detected in the camera stream. Testing has shown that the solution is beneficial and works as expected: Machine Learning Kit provides good results in face detection, and communication with the Google Cloud Speech API is fast enough for a smooth user experience.
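
The assignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the "correct face" is chosen, so everything below is an assumption made for illustration: the `TrackedFace` type, the per-frame `mouth_openings` feature, the variance-based activity score and the `min_activity` threshold are all hypothetical and are not taken from the paper, ML Kit or the Cloud Speech API. The sketch only shows the general idea of attaching a recognized transcript to the mouth position of the face that shows the most lip movement.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import Dict, Optional, Sequence, Tuple


@dataclass
class TrackedFace:
    """One face followed across recent camera frames (hypothetical type)."""
    face_id: int
    mouth_xy: Tuple[float, float]       # mouth centre in image pixels
    mouth_openings: Sequence[float]     # lip gap in pixels, one value per frame


def pick_speaker(faces: Sequence[TrackedFace],
                 min_activity: float = 1.0) -> Optional[TrackedFace]:
    """Return the face whose lip gap varies most, or None if nobody moves.

    Assumption: variance of the lip gap is used as a stand-in for
    visual speech activity; the paper may use a different criterion.
    """
    best, best_score = None, min_activity
    for face in faces:
        if len(face.mouth_openings) < 2:
            continue  # not enough frames to measure movement
        score = pvariance(face.mouth_openings)
        if score > best_score:
            best, best_score = face, score
    return best


def caption_for(faces: Sequence[TrackedFace],
                transcript: str) -> Optional[Dict[str, object]]:
    """Attach a recognized transcript to the most active mouth, if any."""
    speaker = pick_speaker(faces)
    if speaker is None or not transcript:
        return None
    return {
        "face_id": speaker.face_id,
        "anchor_xy": speaker.mouth_xy,  # where the text would be drawn
        "text": transcript,
    }
```

With two tracked faces, one with a constant lip gap and one whose gap changes from frame to frame, `caption_for` anchors the text at the mouth of the second face. When no face exceeds the threshold, or the transcript is empty, it returns `None` and nothing would be displayed.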

Acknowledgment

This work was supported by a project of the Students Grant Agency, FIM, University of Hradec Kralove, Czech Republic.

Corresponding author

Correspondence to Ales Berger.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Berger, A., Kostak, M., Maly, F. (2019). Mobile AR Solution for Deaf People. In: Awan, I., Younas, M., Ünal, P., Aleksy, M. (eds) Mobile Web and Intelligent Information Systems. MobiWIS 2019. Lecture Notes in Computer Science, vol 11673. Springer, Cham. https://doi.org/10.1007/978-3-030-27192-3_19

  • DOI: https://doi.org/10.1007/978-3-030-27192-3_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27191-6

  • Online ISBN: 978-3-030-27192-3

  • eBook Packages: Computer Science, Computer Science (R0)
