Abstract
Automatic phonetic transcription tools usually derive phonetic transcriptions directly from orthographic representations. Although these approaches often achieve good results, theoretical studies suggest that including morphophonological knowledge can improve their performance. Following this idea, we developed a tool that first obtains an underlying representation of each word, using small lexica and dedicated lemmatizers. For each representation, a phonological derivation generates the phonetic transcription by applying linguistically motivated rules. Since most of these rules are implemented as optional parameters, the system can generate dialect-specific transcriptions. The system is therefore not only a grapheme-to-phone tool: it also obtains phonological representations and evaluates several linguistic processes occurring during the derivation. Preliminary experiments emulating a phonological system of Galician (using as input words spelled in European Portuguese) show that the underlying representation of most words can be obtained using small lexica and that the derivation produces high-quality phonetic transcriptions.
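The two-stage idea described in the abstract (look up an underlying representation, then run an ordered derivation in which some rules are optional) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Rule`, `RULES`, `LEXICON`, and `derive`, the three toy rules, and the lexicon entries are all assumptions made for illustration, and the forms are simplified rather than verified transcriptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One rewrite rule of the derivation (hypothetical structure)."""
    name: str
    pattern: str        # regular expression matched against the current form
    replacement: str
    optional: bool = False  # optional rules apply only when enabled


# Ordered toy rule set; heavily simplified, for illustration only.
RULES = [
    Rule("final_nasal_velarization", r"n$", "ŋ"),
    Rule("seseo", r"θ", "s", optional=True),
    Rule("gheada", r"g", "ħ", optional=True),
]

# Tiny hypothetical lexicon: orthographic word -> underlying representation.
LEXICON = {"gato": "gato", "luz": "luθ", "can": "kan"}


def derive(word, enabled=frozenset()):
    """Return (surface form, names of the rules that changed the form)."""
    form = LEXICON[word]
    trace = []
    for rule in RULES:
        # Skip optional rules that the chosen dialect does not enable.
        if rule.optional and rule.name not in enabled:
            continue
        new_form = re.sub(rule.pattern, rule.replacement, form)
        if new_form != form:
            trace.append(rule.name)
        form = new_form
    return form, trace


if __name__ == "__main__":
    print(derive("can"))
    print(derive("gato", enabled={"gheada"}))
    print(derive("luz", enabled={"seseo"}))
```

In this sketch a dialect is simply the set of optional rule names passed as `enabled`, and the returned trace records which processes changed the form during the derivation.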
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Garcia, M., González, I.J. (2012). Automatic Phonetic Transcription by Phonological Derivation. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_39
DOI: https://doi.org/10.1007/978-3-642-28885-2_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28884-5
Online ISBN: 978-3-642-28885-2
eBook Packages: Computer Science (R0)