Cluster Computing, Volume 19, Issue 3, pp 1683–1690

Vocabulary optimization process using similar phoneme recognition and feature extraction

  • Sang Yeob Oh
  • Kyungyong Chung
Article

Abstract

When processing speech recorded in a noisy environment, the noise must be removed to improve the vocabulary recognition rate. Noise elimination and feature extraction for model estimation are used for this purpose. The most important step in these noise-elimination and model-estimation technologies is to estimate the noise mixed into the source signal and remove it. In a vocabulary recognition system, unexpected noise in the signal, or the quantization noise inherently added to digital signals, alters or damages the source signal and lowers the recognition rate. When a source signal has been transformed by mixing with diverse kinds of noise, the hidden Markov model (HMM) is used for effective noise elimination. The HMM builds a model from extracted features so that it can respond flexibly to the diverse vocabulary variations found in voice, text, and similar data. The method is applicable to data that change over time, and it can establish a more effective model as the number of parameters constituting the model grows. It can provide a robust model estimate by using a parameter set for structured models. HMM-based vocabulary recognition yields a discriminating distribution of recognition probabilities across the vocabulary models and has relatively low computational complexity, but it produces a relatively low recognition rate. To solve this problem, a vocabulary recognition-model optimization method is proposed, based on a similar-phoneme recognition process and efficient feature extraction. In vocabulary recognition, the similar-phoneme recognition process is applied to the HMM to recognize models adjacent to the model group. Efficient feature extraction is used to optimize the recognition model and thereby enhance the recognition rate. For vocabulary composition, a Gaussian-mixture feature-extraction model is optimized and used as the vocabulary recognition model, which is then processed with similar-phoneme recognition.
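
As a rough illustration of the HMM-based vocabulary recognition the abstract refers to, the sketch below scores an observation sequence against one HMM per vocabulary word, using Gaussian-mixture emissions and the forward algorithm in the log domain, and picks the best-scoring word. It is not the authors' optimization method: the names (`make_model`, `recognize`, `MODELS`), the two-word toy vocabulary, the one-dimensional features, and all parameter values are assumptions made purely for illustration. A real system would use multi-dimensional acoustic features and trained parameters.

```python
import math


def safe_log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_emission(x, mixture):
    """Log-likelihood of x under a mixture [(weight, mean, var), ...]."""
    return logsumexp([safe_log(w) + log_gauss(x, mu, var) for w, mu, var in mixture])


def forward_log_likelihood(obs, model):
    """Forward algorithm: log P(obs | model), summed over all state paths."""
    n = len(model["start"])
    alpha = [safe_log(model["start"][j]) + log_emission(obs[0], model["emit"][j])
             for j in range(n)]
    for x in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + safe_log(model["trans"][i][j]) for i in range(n)])
            + log_emission(x, model["emit"][j])
            for j in range(n)
        ]
    return logsumexp(alpha)


def make_model(mean_a, mean_b):
    """Hypothetical 2-state left-to-right word model (toy parameters)."""
    return {
        "start": [1.0, 0.0],
        "trans": [[0.6, 0.4], [0.0, 1.0]],
        "emit": [
            [(1.0, mean_a, 1.0)],                                   # single Gaussian
            [(0.5, mean_b - 0.5, 1.0), (0.5, mean_b + 0.5, 1.0)],   # 2-component mixture
        ],
    }


# Assumed toy vocabulary: each word has its own HMM.
MODELS = {"yes": make_model(0.0, 3.0), "no": make_model(5.0, 8.0)}


def recognize(obs, models):
    """Return the word whose model assigns the highest likelihood to obs."""
    return max(models, key=lambda word: forward_log_likelihood(obs, models[word]))
```

For example, `recognize([0.1, -0.2, 2.9, 3.1], MODELS)` selects the word whose state means lie closest to the observed trajectory. Working in the log domain avoids the numerical underflow that plain probability products would cause on longer sequences.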

Keywords

Similar phoneme recognition, Feature extraction, Vocabulary recognition, Recognition model, Model optimization

Notes

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2013R1A1A2059964).

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Computer Engineering, Gachon University, Seongnam-si, Korea
  2. School of Computer Information Engineering, Sangji University, Wonju-si, Korea
