Development of a Femininity Estimator for Voice Therapy of Gender Identity Disorder Clients

Minematsu, Nobuaki; Sakuraba, Kyoko

doi:10.1007/978-3-540-74122-0_3

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4441)

Abstract

This work describes the development of an automatic estimator of the perceptual femininity (PF) of an input utterance using speaker verification techniques. The estimator was designed for clinical use, and the target speakers are Gender Identity Disorder (GID) clients, especially MtF (Male to Female) transsexuals. The voice therapy for MtFs, which is conducted by the second author, comprises three kinds of training: 1) raising the baseline F₀ range, 2) changing the baseline voice quality, and 3) enhancing F₀ dynamics to produce an exaggerated intonation pattern. The first two focus on static acoustic properties of speech. Voice quality is mainly controlled by the size and shape of the articulators, which can be acoustically characterized by the spectral envelope. Gaussian Mixture Models (GMMs) of F₀ values and of spectra were built separately for biologically male and biologically female speakers. Using the four resulting models, PF was estimated automatically for each of 142 utterances of 111 MtFs. The estimated values were compared with the PF values obtained through listening tests with 3 female and 6 male novice raters. Results showed a very high correlation (R=0.86) between the two, which is comparable to the intra- and inter-rater correlation.
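The abstract does not state how the four models are combined into a PF value, so the following is only a minimal sketch of the general idea, assuming the average log-likelihood ratio that is common in speaker verification. It covers the F₀ pair of models only; the spectral pair could be scored the same way on spectral feature vectors. All mixture parameters, the log-Hz scale, and the function names are hypothetical and chosen purely for illustration.

```python
import math

def gmm_log_likelihood(x, components):
    """Log-density of a scalar x under a 1-D GMM.

    components is a list of (weight, mean, std) tuples.
    """
    terms = [
        math.log(w) - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        - 0.5 * ((x - m) / s) ** 2
        for w, m, s in components
    ]
    # Log-sum-exp over the components for numerical stability.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def femininity_score(values, female_gmm, male_gmm):
    """Average per-frame log-likelihood ratio, female model vs. male model.

    Positive values mean the frames fit the female model better.
    """
    if not values:
        raise ValueError("need at least one frame")
    total = sum(
        gmm_log_likelihood(v, female_gmm) - gmm_log_likelihood(v, male_gmm)
        for v in values
    )
    return total / len(values)

# Hypothetical F0 models on a log-Hz scale (NOT the parameters from the chapter).
FEMALE_F0 = [(0.6, math.log(220.0), 0.15), (0.4, math.log(260.0), 0.20)]
MALE_F0 = [(0.6, math.log(120.0), 0.15), (0.4, math.log(150.0), 0.20)]

# Two made-up utterances, each given as a few voiced-frame F0 values in Hz.
high = femininity_score([math.log(f) for f in (210.0, 230.0, 250.0)], FEMALE_F0, MALE_F0)
low = femininity_score([math.log(f) for f in (110.0, 125.0, 140.0)], FEMALE_F0, MALE_F0)
print(high > 0.0, low < 0.0)
```

In a setting like the one described, such scores would still have to be mapped onto the listeners' PF scale, for example by a regression fitted to the listening-test results; that mapping is likewise an assumption here, not something the abstract specifies.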

Author information

Authors and Affiliations

Nobuaki Minematsu, The University of Tokyo
Kyoko Sakuraba, Kiyose-shi Welfare Center for the Handicapped

Editor information

Christian Müller

About this chapter

Cite this chapter

Minematsu, N., Sakuraba, K. (2007). Development of a Femininity Estimator for Voice Therapy of Gender Identity Disorder Clients. In: Müller, C. (ed.) Speaker Classification II. Lecture Notes in Computer Science, vol 4441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74122-0_3

DOI: https://doi.org/10.1007/978-3-540-74122-0_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74121-3
Online ISBN: 978-3-540-74122-0
eBook Packages: Computer Science, Computer Science (R0)
