Abstract

This chapter reviews existing language identification (LID) systems. It highlights the language-specific features that have been used in LID studies, explains why implicit LID systems have attracted interest, and presents the motivation for the present work.


Author information

Corresponding author

Correspondence to K. Sreenivasa Rao.

Copyright information

© 2015 The Author(s)

About this chapter

Cite this chapter

Rao, K.S., Reddy, V.R., Maity, S. (2015). Literature Review. In: Language Identification Using Spectral and Prosodic Features. SpringerBriefs in Electrical and Computer Engineering. Springer, Cham. https://doi.org/10.1007/978-3-319-17163-0_2

  • DOI: https://doi.org/10.1007/978-3-319-17163-0_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-17162-3

  • Online ISBN: 978-3-319-17163-0

  • eBook Packages: Engineering, Engineering (R0)
