Abstract
This chapter describes a first experimental study on the discrimination of speech quality change along separate perceptual quality dimensions (“discontinuity,” “noisiness,” “coloration”), as induced by different types of speech quality impairment (random frame loss, signal-correlated noise, bandpass filtering). On average, participants distinguished between these perceptual dimensions when perceived degradation intensity was held constant. Furthermore, the evidence pointed toward the internal formation of distinct sensory and short-term perceptual quality references. [Preliminary analyses of the empirical data presented in this chapter were conducted in Uhrig et al. (J. Neural Eng. 16(3):036009, 2019).]
Notes
- 1. Still, it could not be entirely ruled out that participants spontaneously derived cognitive quality judgments as their quality awareness was heightened by more conspicuous quality changes (see Sect. 2.2.2), especially by more annoying, “negative” changes from high to low quality.
References
S. Möller, Assessment and Prediction of Speech Quality in Telecommunications (Springer, Boston, 2000)
S. Uhrig, G. Mittag, S. Möller, J.-N. Voigt-Antons, Neural correlates of speech quality dimensions analyzed using electroencephalography (EEG). J. Neural Eng. 16(3), 036009 (2019)
J. Blauert, U. Jekosch, Auditory quality of performance spaces for music – the problem of the references, in Proceedings of the 19th International Congress on Acoustics (ICA 2007) (2007), pp. 1205–1210
S. Möller, A. Raake, M. Wältermann, N. Côté, Towards a universal scale for perceptual value, in 2010 Second International Workshop on Quality of Multimedia Experience (QoMEX) (IEEE, Trondheim, 2010), pp. 142–146
S. Möller, M. Wältermann, A. Raake, About the nature of references – and implications for quality prediction, in Proceedings of the Forum Acusticum 2011 (DK-Aalborg, European Acoustics Assoc., 2011), pp. 1205–1210
A. Raake, J. Blauert, Comprehensive modeling of the formation process of sound-quality, in 2013 Fifth International Workshop on Quality of Multimedia Experience (QoMEX) (IEEE, Klagenfurt am Wörthersee, 2013), pp. 76–81
ITU-T Recommendation P.800, Methods for Subjective Determination of Transmission Quality (International Telecommunication Union (ITU), Geneva, 1996)
J. Polich, Updating P300: an integrative theory of P3a and P3b. Clin. Neurophysiol. 118(10), 2128–2148 (2007)
E.N. Sokolov, J.A. Spinks, R. Näätänen, H. Lyytinen, The Orienting Response in Information Processing (Lawrence Erlbaum Associates Publishers, Mahwah, 2002)
D. Friedman, Y.M. Cycowicz, H. Gaeta, The novelty P3: an event-related brain potential (ERP) sign of the brain’s evaluation of novelty. Neurosci. Biobehav. Rev. 25(4), 355–373 (2001)
A. Kok, On the utility of P3 amplitude as a measure of processing capacity. Psychophysiology 38(3), 557–577 (2001)
R. Verleger, P. Jaśkowski, E. Wascher, Evidence for an integrative role of P3b in linking reaction to perception. J. Psychophysiol. 19(3), 165–181 (2005)
S. Arndt, K. Brunnström, E. Cheng, U. Engelke, S. Möller, J.-N. Antons, Review on using physiology in quality of experience. Electron. Imaging 2016(16), 1–9 (2016)
U. Engelke, D.P. Darcy, G.H. Mulliken, S. Bosse, M.G. Martini, S. Arndt, J.-N. Antons, K.Y. Chan, N. Ramzan, K. Brunnström, Psychophysiology-based QoE assessment: a survey. IEEE J. Select. Topics Signal Process. 11(1), 6–21 (2017)
S. Uhrig, A. Perkis, D.M. Behne, Does P3 reflect speech quality change? Controlling for auditory evoked activity in event-related brain potential (ERP) waveforms, in 2019 IEEE International Symposium on Multimedia (ISM) (IEEE, San Diego, 2019), pp. 152–1523
F. Pulvermüller, Y. Shtyrov, Language outside the focus of attention: the mismatch negativity as a tool for studying higher cognitive processes. Prog. Neurobiol. 79(1), 49–71 (2006)
M. Wältermann, Dimension-Based Quality Modeling of Transmitted Speech. T-Labs Series in Telecommunication Services (Springer, Heidelberg, 2013)
ITU-T Recommendation G.191, Software Tools for Speech and Audio Coding Standardization (International Telecommunication Union (ITU), Geneva, 2010)
ITU-T Recommendation P.810, Modulated Noise Reference Unit (MNRU) (International Telecommunication Union (ITU), Geneva, 1996)
ITU-T Recommendation P.56, Objective Measurement of Active Speech Level (International Telecommunication Union (ITU), Geneva, 2011)
ITU-T Recommendation P.10/G.100, Vocabulary for Performance, Quality of Service and Quality of Experience (International Telecommunication Union (ITU), Geneva, 2017)
S. Olejnik, J. Algina, Generalized eta and omega squared statistics: measures of effect size for some common research designs. Psychol. Methods 8(4), 434–447 (2003)
R. Johnson, On the neural generators of the P300 component of the event-related potential. Psychophysiology 30(1), 90–97 (1993)
C.J. Billings, K.O. Bennett, M.R. Molis, M.R. Leek, Cortical encoding of signals in noise: effects of stimulus type and recording paradigm. Ear Hear., 1 (2011)
M. Sams, K. Alho, R. Näätänen, Short-term habituation and dishabituation of the mismatch negativity of the ERP. Psychophysiology 21(4), 434–441 (1984)
J.M.K. Nousak, D. Deacon, W. Ritter, H.G. Vaughan, Storage of information in transient auditory memory. Cognit. Brain Res. 4(4), 305–317 (1996)
S.J. Luck, An Introduction to the Event-Related Potential Technique, 2nd edn. (The MIT Press, Cambridge, 2014)
S.J. Luck, N. Gaspelin, How to get statistically significant effects in any ERP experiment (and why you shouldn’t). Psychophysiology 54(1), 146–157 (2017)
S.A. Hillyard, M. Kutas, Electrophysiology of cognitive processing. Annu. Rev. Psychol. 34(1), 33–61 (1983)
R. Tibon, D.A. Levy, Striking a balance: analyzing unbalanced event-related potential data. Front. Psychol. 6, 555 (2015)
E. Courchesne, Changes in P3 waves with event repetition: long-term effects on scalp distribution and amplitude. Electroencephalogr. Clin. Neurophysiol. 45(6), 754–766 (1978)
D. Friedman, G.V. Simpson, ERP amplitude and scalp distribution to target and novel events: effects of temporal order in young, middle-aged and older adults. Cognit. Brain Res. 2(1), 49–63 (1994)
J. Katayama, J. Polich, Auditory and visual P300 topography from a 3 stimulus paradigm. Clin. Neurophysiol. 110(3), 463–468 (1999)
A.K. Porbadnigk, J.-N. Antons, B. Blankertz, M.S. Treder, R. Schleicher, S. Möller, G. Curio, Using ERPs for assessing the (sub)conscious perception of noise, in Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (IEEE, Piscataway, 2010), pp. 2690–2693
J.-N. Antons, R. Schleicher, S. Arndt, S. Möller, A.K. Porbadnigk, G. Curio, Analyzing speech quality perception using electroencephalography. IEEE J. Select. Topics Signal Process. 6(6), 721–731 (2012)
S. Arndt, J.-N. Antons, R. Gupta, K. ur Rehman Laghari, R. Schleicher, S. Möller, T.H. Falk, Subjective quality ratings and physiological correlates of synthesized speech, in 2013 Fifth International Workshop on Quality of Multimedia Experience (QoMEX) (IEEE, Klagenfurt am Wörthersee, 2013), pp. 152–157
A.K. Porbadnigk, M.S. Treder, B. Blankertz, J.-N. Antons, R. Schleicher, S. Möller, G. Curio, K.-R. Müller, Single-trial analysis of the neural correlates of speech quality perception. J. Neural Eng. 10(5), 056003 (2013)
S. Uhrig, A. Perkis, D.M. Behne, Effects of speech transmission quality on sensory processing indicated by the cortical auditory evoked potential. J. Neural Eng. 17(4), 046021 (2020)
Appendix
| Electrode | | Contrast | Estimate | SEM | t | p |
|---|---|---|---|---|---|---|
| Fz | T | HQ − LQ-Col | −1.002 | 1.558 | −0.643 | 0.520 |
| Fz | T | HQ − LQ-Noi | 4.213 | 1.555 | 2.710 | 0.013∗ |
| Fz | T | LQ-Col − LQ-Noi | 5.215 | 1.528 | 3.413 | 0.002∗ |
| Fz | D | HQ − LQ-Col | −1.446 | 1.430 | −1.011 | 0.936 |
| Fz | D | HQ − LQ-Noi | −0.046 | 1.411 | −0.032 | 0.974 |
| Fz | D | LQ-Col − LQ-Noi | 1.401 | 1.420 | 0.987 | 0.936 |
| Cz | T | HQ − LQ-Col | −0.607 | 1.558 | −0.390 | 0.697 |
| Cz | T | HQ − LQ-Noi | −2.339 | 1.555 | −1.505 | 0.397 |
| Cz | T | LQ-Col − LQ-Noi | −1.732 | 1.528 | −1.134 | 0.514 |
| Cz | D | HQ − LQ-Col | 2.186 | 1.430 | 1.529 | 0.379 |
| Cz | D | HQ − LQ-Noi | 0.991 | 1.411 | 0.702 | 0.800 |
| Cz | D | LQ-Col − LQ-Noi | −1.195 | 1.419 | −0.842 | 0.800 |
| Pz | T | HQ − LQ-Col | −4.336 | 1.558 | −2.783 | 0.011∗ |
| Pz | T | HQ − LQ-Noi | −7.141 | 1.555 | −4.594 | <0.001∗ |
| Pz | T | LQ-Col − LQ-Noi | −2.805 | 1.528 | −1.836 | 0.066 |
| Pz | D | HQ − LQ-Col | −3.840 | 1.430 | −2.685 | 0.015∗ |
| Pz | D | HQ − LQ-Noi | 0.382 | 1.411 | 0.271 | 0.787 |
| Pz | D | LQ-Col − LQ-Noi | 4.221 | 1.420 | 2.974 | 0.009∗ |
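
The t values in the table above appear to be the ratio of each contrast estimate to its standard error. The following minimal sketch checks this for the three Fz / T rows, assuming that the SEM column holds the standard error of the contrast estimate. The function name `t_from_estimate` and the tolerance are illustrative choices, not part of the original analysis.

```python
# Rows copied from the Fz / T block of the appendix table:
# (contrast label, estimate, SEM, reported t)
rows = [
    ("HQ - LQ-Col", -1.002, 1.558, -0.643),
    ("HQ - LQ-Noi", 4.213, 1.555, 2.710),
    ("LQ-Col - LQ-Noi", 5.215, 1.528, 3.413),
]

# Assumed tolerance: the table values are rounded to three decimals,
# so the recomputed ratio can differ slightly from the reported t.
TOLERANCE = 0.005


def t_from_estimate(estimate: float, sem: float) -> float:
    """Return the ratio of a contrast estimate to its standard error."""
    return estimate / sem


for contrast, estimate, sem, reported_t in rows:
    t = t_from_estimate(estimate, sem)
    matches = abs(t - reported_t) < TOLERANCE
    print(f"{contrast}: t = {t:.3f}, reported = {reported_t:.3f}, match = {matches}")
```

The same check can be applied to the remaining rows. It does not reproduce the p values, which depend on the degrees of freedom and on any multiple-comparison adjustment, neither of which is given here.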
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Uhrig, S. (2022). Discrimination of Speech Quality Change Along Perceptual Dimensions (Study I). In: Human Information Processing in Speech Quality Assessment. T-Labs Series in Telecommunication Services. Springer, Cham. https://doi.org/10.1007/978-3-030-71389-8_5
DOI: https://doi.org/10.1007/978-3-030-71389-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71388-1
Online ISBN: 978-3-030-71389-8
eBook Packages: Engineering, Engineering (R0)