The Influence of the Interlocutor’s Gender on the Speaker’s Role Identification

Lerner, Anat; Miara, Oren; Malayev, Sarit; Silber-Varod, Vered

doi:10.1007/978-3-319-99579-3_34

Anat Lerner (ORCID: orcid.org/0000-0002-9293-3195),
Oren Miara,
Sarit Malayev &
Vered Silber-Varod (ORCID: orcid.org/0000-0002-1564-9350)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11096)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

The objective of this ongoing research is to automatically identify the role a speaker plays in a dialogue, and to explore conditions that might lead to higher role-identification rates. We use an interactive Map Task setup with two roles, follower and leader, in which each speaker participated twice, acting in both roles with the same interlocutor. The paper aims to identify the speaker’s role and to explore the potential influence of the speaker’s gender, the interlocutor’s gender, and the order in which the speaker played the roles. Using deep learning procedures over a set of acoustic features, we automatically trace the footprints of the role in the speech signal. Results show an average role classification rate of 73.3%. We further show that the role classification rate differs significantly depending on the interlocutor’s gender. On average, when the interlocutor is a man, the speaker tends to identify with his or her role more clearly: the classification rate is 77.5%, versus 69.9% when the interlocutor is a woman.
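
The comparison of classification rates by interlocutor gender can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `classification_rates`, the record layout `(true_role, predicted_role, interlocutor_gender)`, and the toy records are all assumptions made for illustration, and the toy values are invented rather than taken from the paper's data.

```python
from collections import defaultdict


def classification_rates(records):
    """Return (overall_rate, per_gender_rates) for role predictions.

    Each record is a hypothetical tuple:
    (true_role, predicted_role, interlocutor_gender).
    """
    if not records:
        raise ValueError("records must not be empty")

    correct = defaultdict(int)  # correct predictions per interlocutor gender
    total = defaultdict(int)    # all predictions per interlocutor gender

    for true_role, predicted_role, gender in records:
        total[gender] += 1
        if true_role == predicted_role:
            correct[gender] += 1

    per_gender = {g: correct[g] / total[g] for g in total}
    overall = sum(correct.values()) / sum(total.values())
    return overall, per_gender


# Invented toy records, for illustration only (not the paper's data).
toy_records = [
    ("leader", "leader", "M"),
    ("follower", "follower", "M"),
    ("leader", "follower", "M"),
    ("follower", "follower", "M"),
    ("leader", "leader", "F"),
    ("follower", "leader", "F"),
    ("leader", "follower", "F"),
    ("follower", "follower", "F"),
]

overall, per_gender = classification_rates(toy_records)
print(f"overall: {overall:.3f}")
for gender, rate in sorted(per_gender.items()):
    print(f"interlocutor {gender}: {rate:.3f}")
```

Whether a difference between two such group rates is statistically significant would need a separate test on the real data; the sketch only shows how the rates themselves are grouped.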

Acknowledgments

This work was supported by the Open Media and Information Lab (OMILab) at The Open University of Israel [Grant Number 20184] and by research grant #507761 from the Research Authority at The Open University of Israel.

Author information

Authors and Affiliations

Mathematics and Computer Science Department, The Open University of Israel, Ra’anana, Israel
Anat Lerner
Open Media and Information Lab (OMILab), The Open University of Israel, Ra’anana, Israel
Oren Miara, Sarit Malayev & Vered Silber-Varod

Corresponding author

Correspondence to Anat Lerner.

Editor information

Editors and Affiliations

SPIIRAS, St. Petersburg, Russia
Alexey Karpov
Leipzig University of Telecommunications, Leipzig, Germany
Oliver Jokisch
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova

About this paper

Cite this paper

Lerner, A., Miara, O., Malayev, S., Silber-Varod, V. (2018). The Influence of the Interlocutor’s Gender on the Speaker’s Role Identification. In: Karpov, A., Jokisch, O., Potapova, R. (eds) Speech and Computer. SPECOM 2018. Lecture Notes in Computer Science, vol 11096. Springer, Cham. https://doi.org/10.1007/978-3-319-99579-3_34

DOI: https://doi.org/10.1007/978-3-319-99579-3_34
Published: 25 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99578-6
Online ISBN: 978-3-319-99579-3
eBook Packages: Computer Science, Computer Science (R0)
