The Influence of the Interlocutor’s Gender on the Speaker’s Role Identification

  • Conference paper
Speech and Computer (SPECOM 2018)

Abstract

The objective of this ongoing research is to automatically identify the role played by a speaker in a dialogue, and to explore conditions that might yield higher speaker role identification rates. We use an interactive Map Task setup with two roles, follower and leader, in which each speaker participated twice, acting in both roles with the same interlocutor. The paper aims to identify the speaker’s role and to explore the potential influence of the speaker’s gender, the interlocutor’s gender, and the order in which the speaker played the roles. By applying deep learning procedures to a set of acoustic features, we automatically trace the footprints of the role in the speech signal. Results show an average role classification rate of 73.3%. We further show that role classification rates differ significantly depending on the interlocutor’s gender. On average, when the interlocutor is male, the speaker tends to identify with his or her role more clearly: 77.5%, versus 69.9% when the interlocutor is female.
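The per-group comparison reported in the abstract can be illustrated with a minimal sketch. It assumes each classified item carries the interlocutor's gender, the true role, and the predicted role; the function name `classification_rate_by_group` and the toy records are hypothetical and are not taken from the paper's data or code.

```python
from collections import defaultdict


def classification_rate_by_group(records):
    """Return {group: fraction of items whose predicted role matches the true role}.

    Each record is a (group, true_role, predicted_role) tuple; here the group
    is assumed to be the interlocutor's gender.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, true_role, predicted_role in records:
        total[group] += 1
        if true_role == predicted_role:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical toy records for illustration only (not results from the paper).
toy_records = [
    ("male", "leader", "leader"),
    ("male", "follower", "follower"),
    ("male", "leader", "follower"),
    ("male", "follower", "follower"),
    ("female", "leader", "leader"),
    ("female", "follower", "leader"),
    ("female", "leader", "follower"),
    ("female", "follower", "follower"),
]

rates = classification_rate_by_group(toy_records)
```

With real classifier output in place of the toy records, the same grouping could be keyed on the speaker's gender or on the order of the roles, the other two conditions the abstract names.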

Acknowledgments

This work was supported by the Open Media and Information Lab (OMILab) at The Open University of Israel [Grant Number 20184] and by research grant #507761 from the Research Authority at The Open University of Israel.

Author information

Corresponding author

Correspondence to Anat Lerner.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Lerner, A., Miara, O., Malayev, S., Silber-Varod, V. (2018). The Influence of the Interlocutor’s Gender on the Speaker’s Role Identification. In: Karpov, A., Jokisch, O., Potapova, R. (eds) Speech and Computer. SPECOM 2018. Lecture Notes in Computer Science, vol 11096. Springer, Cham. https://doi.org/10.1007/978-3-319-99579-3_34

  • DOI: https://doi.org/10.1007/978-3-319-99579-3_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99578-6

  • Online ISBN: 978-3-319-99579-3

  • eBook Packages: Computer Science, Computer Science (R0)
