Abstract
Smart healthcare systems built on internet of things (IoT) platforms are cost-efficient and enable continuous remote monitoring of patients, avoiding unnecessary hospital visits and long waiting times to see practitioners. A smart healthcare system for detecting dysphonia can reduce patients' suffering by providing an initial evaluation of the voice. This preliminary feedback could reduce the burden on ENT specialists by referring only genuine cases to them, and it gives patients an early warning of potential voice complications. Delayed treatment or an inaccurate diagnosis caused by the subjective nature of the tools used may have severe consequences, because some types of dysphonia are life-threatening. Therefore, this study proposes and implements an accurate and reliable smart healthcare system for IoT platforms to detect dysphonia. The proposed system uses higher-order directional derivatives to analyze the time–frequency spectrum of signals. Computed along different directions of the spectrum, these derivatives capture the changes caused by malfunctioning vocal folds. The proposed system achieves 99.1% accuracy, with sensitivity and specificity of 99.4% and 98.1%, respectively. The experimental results show that the proposed system provides better classification accuracy than traditional non-directional first-order derivatives. Hence, the system can serve as a reliable tool for detecting dysphonia, and it can be implemented on edge devices to avoid latency issues and protect privacy, unlike cloud processing.
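To make the idea of directional derivatives on a time–frequency spectrum concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Hann-windowed magnitude spectrogram, four directions (0°, 45°, 90°, 135°), and higher orders obtained by repeating a forward difference along the chosen direction. The function names, frame length, hop size, and edge handling are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

# Assumed (row, col) neighbour offsets per direction.
# Rows are frequency bins, columns are time frames.
DIRECTIONS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram (frequency x time) from Hann-windowed frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


def shift(image, d_row, d_col):
    """Sample image at (r + d_row, c + d_col), replicating values at the edges."""
    padded = np.pad(image, 1, mode="edge")
    rows, cols = image.shape
    return padded[1 + d_row : 1 + d_row + rows, 1 + d_col : 1 + d_col + cols]


def directional_derivative(image, angle, order=1):
    """Apply a forward difference along `angle` (degrees) `order` times."""
    d_row, d_col = DIRECTIONS[angle]
    result = np.asarray(image, dtype=float)
    for _ in range(order):
        result = shift(result, d_row, d_col) - result
    return result


if __name__ == "__main__":
    # Synthetic tone standing in for a voice recording (illustrative only).
    t = np.arange(4096) / 16000.0
    spec = spectrogram(np.sin(2 * np.pi * 220.0 * t))
    for angle in DIRECTIONS:
        second = directional_derivative(spec, angle, order=2)
        print(angle, second.shape)
```

In a full pipeline, the derivative maps for each direction would presumably be summarized into features and passed to a classifier; that stage is not described in the abstract and is left out here.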
Acknowledgements
The authors extend their appreciation to the Deputyship for Research & Innovation, “Ministry of Education” in Saudi Arabia for funding this research work through the project number IFKSURG-1435-051. The authors thank the Deanship of Scientific Research and RSSU at King Saud University for their technical support.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
Ali, Z., Imran, M. & Shoaib, M. An IoT-based smart healthcare system to detect dysphonia. Neural Comput & Applic 34, 11255–11265 (2022). https://doi.org/10.1007/s00521-020-05558-3
DOI: https://doi.org/10.1007/s00521-020-05558-3