HMM-Based Speaker Gender Recognition for Bodo Language

  • Conference paper
Advances in Communication, Cloud, and Big Data

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 31)

Abstract

Speech is the most natural way for humans to exchange information. Its primary content is the message carried by the spoken words, but it also conveys the speaker’s emotional state, health condition, and gender, as well as the language being spoken. Systems that recognize speaker-related information in speech signals through an extraction and characterization process are called speaker recognition systems. Such systems are becoming common and useful as more modern devices are designed for the convenience of the general public, and they have been developed for many indigenous languages. Hidden Markov models (HMMs) have been applied to speaker recognition with considerable success and have gained much popularity. This paper presents an attempt at developing a speaker gender recognition system. A model built with the Hidden Markov Model Toolkit (HTK 3.4.1) was trained and tested on sample speech from speakers of both genders in the Bodo language, and the results show good recognition.
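
The general idea behind HMM-based classification can be illustrated with a small sketch: one model is kept per class, an utterance is scored against each model with the forward algorithm, and the class whose model gives the highest likelihood is chosen. The Python below is only an illustration of that idea, not the HTK pipeline used in the paper. It assumes a discrete HMM over three hypothetical quantized feature symbols (0, 1, 2), and every probability in `MODELS` is made up for the example rather than estimated from Bodo speech.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (forward algorithm)."""
    n = len(start)
    # Initialisation: start in state s and emit the first symbol.
    alpha = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    # Recursion: sum over predecessor states, then emit the next symbol.
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + _log(trans[p][s]) for p in range(n)])
            + _log(emit[s][symbol])
            for s in range(n)
        ]
    # Termination: sum over all final states.
    return log_sum_exp(alpha)


def classify(obs, models):
    """Score obs against every model; return (best label, all scores)."""
    scores = {
        label: forward_log_likelihood(obs, *params)
        for label, params in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: (start, transition, emission) per class.
# The numbers are illustrative assumptions, not trained parameters.
MODELS = {
    "female": (
        [0.5, 0.5],
        [[0.8, 0.2], [0.2, 0.8]],
        [[0.10, 0.30, 0.60], [0.05, 0.25, 0.70]],
    ),
    "male": (
        [0.5, 0.5],
        [[0.8, 0.2], [0.2, 0.8]],
        [[0.60, 0.30, 0.10], [0.70, 0.25, 0.05]],
    ),
}

if __name__ == "__main__":
    label, scores = classify([2, 2, 1, 2], MODELS)
    print(label, scores)
```

In a real system the observations would be continuous acoustic feature vectors and the model parameters would be estimated from training data; the toy symbols here only keep the example short.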

Author information

Corresponding author

Correspondence to Chandralika Chakraborty.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chakraborty, C., Talukdar, P.H. (2019). HMM-Based Speaker Gender Recognition for Bodo Language. In: Sarma, H., Borah, S., Dutta, N. (eds) Advances in Communication, Cloud, and Big Data. Lecture Notes in Networks and Systems, vol 31. Springer, Singapore. https://doi.org/10.1007/978-981-10-8911-4_15

  • DOI: https://doi.org/10.1007/978-981-10-8911-4_15

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8910-7

  • Online ISBN: 978-981-10-8911-4

  • eBook Packages: Engineering, Engineering (R0)