Cross-Word Pronunciation Variations

AbuZeina, Dia; Elshafei, Moustafa

doi:10.1007/978-1-4614-1213-7_4

Part of the book series: SpringerBriefs in Electrical and Computer Engineering (BRIEFSSPEECHTECH)

Abstract

This chapter presents the cross-word problem in the Arabic language and describes its main sources: Idgham (merging), Iqlaab (changing), deletion of Hamzat Al-Wasl, and the merging of two consecutive unvoweled letters. Illustrative examples of the cross-word problem are also provided.

Author information

Authors and Affiliations

King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
Dia AbuZeina & Moustafa Elshafei

Corresponding author

Correspondence to Dia AbuZeina.

About this chapter

Cite this chapter

AbuZeina, D., Elshafei, M. (2012). Cross-Word Pronunciation Variations. In: Cross-Word Modeling for Arabic Speech Recognition. SpringerBriefs in Electrical and Computer Engineering. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-1213-7_4

DOI: https://doi.org/10.1007/978-1-4614-1213-7_4
Published: 15 November 2011
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-1212-0
Online ISBN: 978-1-4614-1213-7
eBook Packages: Engineering, Engineering (R0)
