Talker Quality in Interactive Scenarios

  • Benjamin Weiss
Part of the T-Labs Series in Telecommunication Services book series (TLABS)


In passive scenarios, people just listen to (and watch) stimuli, which allows participants to concentrate on the task and facilitates careful preparation and manipulation of the stimuli. Interaction, in contrast, introduces several issues. First of all, it induces verbal flexibility, as participants should not read text aloud but have to produce spontaneous speech in real conversations. For the acoustic analysis of Talker Quality, these individual differences between utterances in content and duration require robust acoustic parameters. This issue is even more critical for experiments that include pre-defined conditions to be manipulated.


Keywords: Interaction parameters; Feedback/back-channel; Turn-taking; Dialog acts; Entrainment; Strategy; Overlap; Likability; HCI; HHI



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Benjamin Weiss
    Technische Universität Berlin, Berlin, Germany
