Discrimination of Speech Quality Change Along Perceptual Dimensions (Study I)

Human Information Processing in Speech Quality Assessment

Part of the book series: T-Labs Series in Telecommunication Services ((TLABS))

Abstract

This chapter describes a first experimental study on the discrimination of speech quality change along separate perceptual quality dimensions (“discontinuity,” “noisiness,” “coloration”), as induced by different types of speech quality impairment (random frame loss, signal-correlated noise, bandpass filtering). On average, participants distinguished between these perceptual dimensions when perceived degradation intensity was held constant. Furthermore, the evidence pointed toward the internal formation of distinct sensory and short-term perceptual quality references. [Preliminary analyses of the empirical data presented in this chapter were conducted in Uhrig et al. (J. Neural Eng. 16(3):036009, 2019).]

Notes

  1. Still, it could not be entirely ruled out that participants spontaneously derived cognitive quality judgments as their quality awareness was heightened by more conspicuous quality changes (see Sect. 2.2.2), especially by more annoying, “negative” changes from high to low quality.

  2. http://psychtoolbox.org/

  3. https://cran.r-project.org/web/packages/ez/

  4. https://www.dgps.de/fileadmin/documents/EK/TLInfo_EEG_V2.docx

  5. https://sccn.ucsd.edu/wiki/Makoto%27s_preprocessing_pipeline

  6. https://cran.r-project.org/web/packages/nlme/

  7. https://cran.r-project.org/web/packages/multcomp/

References

  1. S. Möller, Assessment and Prediction of Speech Quality in Telecommunications (Springer, Boston, 2000)

  2. S. Uhrig, G. Mittag, S. Möller, J.-N. Voigt-Antons, Neural correlates of speech quality dimensions analyzed using electroencephalography (EEG). J. Neural Eng. 16(3), 036009 (2019)

  3. J. Blauert and U. Jekosch, Auditory quality of performance spaces for music – the problem of the references, in Proceedings of the 19th International Congress on Acoustics (ICA 2007) (2007), pp. 1205–1210

  4. S. Möller, A. Raake, M. Wältermann, N. Côté, Towards a universal scale for perceptual value, in 2010 Second International Workshop on Quality of Multimedia Experience (QoMEX) (IEEE, Trondheim, 2010), pp. 142–146

  5. S. Möller, M. Wältermann, A. Raake, About the nature of references – and implications for quality prediction, in Proceedings of the Forum Acusticum 2011 (DK-Aalborg, European Acoustics Assoc., 2011), pp. 1205–1210

  6. A. Raake, J. Blauert, Comprehensive modeling of the formation process of sound-quality, in 2013 Fifth International Workshop on Quality of Multimedia Experience (QoMEX) (IEEE, Klagenfurt am Wörthersee, 2013), pp. 76–81

  7. ITU-T Recommendation P.800, Methods for Subjective Determination of Transmission Quality (International Telecommunication Union (ITU), Geneva, 1996)

  8. J. Polich, Updating P300: an integrative theory of P3a and P3b. Clin. Neurophysiol. 118(10), 2128–2148 (2007)

  9. E.N. Sokolov, J.A. Spinks, R. Näätänen, H. Lyytinen, The Orienting Response in Information Processing (Lawrence Erlbaum Associates Publishers, Mahwah, 2002)

  10. D. Friedman, Y.M. Cycowicz, H. Gaeta, The novelty P3: an event-related brain potential (ERP) sign of the brain’s evaluation of novelty. Neurosci. Biobehav. Rev. 25(4), 355–373 (2001)

  11. A. Kok, On the utility of P3 amplitude as a measure of processing capacity. Psychophysiology 38(3), 557–577 (2001)

  12. R. Verleger, P. Jaśkowski, E. Wascher, Evidence for an integrative role of P3b in linking reaction to perception. J. Psychophysiol. 19(3), 165–181 (2005)

  13. S. Arndt, K. Brunnström, E. Cheng, U. Engelke, S. Möller, J.-N. Antons, Review on using physiology in quality of experience. Electron. Imaging 2016(16), 1–9 (2016)

  14. U. Engelke, D.P. Darcy, G.H. Mulliken, S. Bosse, M.G. Martini, S. Arndt, J.-N. Antons, K.Y. Chan, N. Ramzan, K. Brunnström, Psychophysiology-based QoE assessment: a survey. IEEE J. Select. Topics Signal Process. 11(1), 6–21 (2017)

  15. S. Uhrig, A. Perkis, D.M. Behne, Does P3 reflect speech quality change? controlling for auditory evoked activity in event-related brain potential (ERP) waveforms, in 2019 IEEE International Symposium on Multimedia (ISM) (IEEE, San Diego, 2019), pp. 152–1523

  16. F. Pulvermüller, Y. Shtyrov, Language outside the focus of attention: the mismatch negativity as a tool for studying higher cognitive processes. Prog. Neurobiol. 79(1), 49–71 (2006)

  17. M. Wältermann, Dimension-Based Quality Modeling of Transmitted Speech. T-Labs Series in Telecommunication Services (Springer, Heidelberg, 2013)

  18. ITU-T Recommendation G.191, Software Tools for Speech and Audio Coding Standardization (International Telecommunication Union (ITU), Geneva, 2010)

  19. ITU-T Recommendation P.810, Modulated Noise Reference Unit (MNRU) (International Telecommunication Union (ITU), Geneva, 1996)

  20. ITU-T Recommendation P.56, Objective Measurement of Active Speech Level (International Telecommunication Union (ITU), Geneva, 2011)

  21. ITU-T Recommendation P.10/G.100, Vocabulary for Performance, Quality of Service and Quality of Experience (International Telecommunication Union (ITU), Geneva, 2017)

  22. S. Olejnik, J. Algina, Generalized Eta and Omega squared statistics: measures of effect size for some common research designs. Psychol. Methods 8(4), 434–447 (2003)

  23. R. Johnson, On the neural generators of the P300 component of the event-related potential. Psychophysiology 30(1), 90–97 (1993)

  24. C.J. Billings, K.O. Bennett, M.R. Molis, M.R. Leek, Cortical encoding of signals in noise: effects of stimulus type and recording paradigm. Ear Hear., 1 (2011)

  25. M. Sams, K. Alho, R. Näätänen, Short-term habituation and dishabituation of the mismatch negativity of the ERP. Psychophysiology 21(4), 434–441 (1984)

  26. J.M.K. Nousak, D. Deacon, W. Ritter, H.G. Vaughan, Storage of information in transient auditory memory. Cognit. Brain Res. 4(4), 305–317 (1996)

  27. S.J. Luck, An Introduction to the Event-Related Potential Technique, 2nd edn. (The MIT Press, Cambridge, 2014)

  28. S.J. Luck, N. Gaspelin, How to get statistically significant effects in any ERP experiment (and why you shouldn’t). Psychophysiology 54(1), 146–157 (2017)

  29. S.A. Hillyard, M. Kutas, Electrophysiology of cognitive processing. Annu. Rev. Psychol. 34(1), 33–61 (1983)

  30. R. Tibon, D.A. Levy, Striking a balance: analyzing unbalanced event-related potential data. Front. Psychol. 6, 555 (2015)

  31. E. Courchesne, Changes in P3 waves with event repetition: long-term effects on scalp distribution and amplitude. Electroencephalogr. Clin. Neurophysiol. 45(6), 754–766 (1978)

  32. D. Friedman, G.V. Simpson, ERP amplitude and scalp distribution to target and novel events: effects of temporal order in young, middle-aged and older adults. Cognit. Brain Res. 2(1), 49–63 (1994)

  33. J. Katayama, J. Polich, Auditory and visual P300 topography from a 3 stimulus paradigm. Clin. Neurophysiol. 110(3), 463–468 (1999)

  34. A.K. Porbadnigk, J.-N. Antons, B. Blankertz, M.S. Treder, R. Schleicher, S. Möller, G. Curio, Using ERPs for assessing the (sub) conscious perception of noise, in Proceeding of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (IEEE, Piscataway, 2010), pp. 2690–2693

  35. J.-N. Antons, R. Schleicher, S. Arndt, S. Möller, A.K. Porbadnigk, G. Curio, Analyzing speech quality perception using electroencephalography. IEEE J. Select. Topics Signal Process. 6(6), 721–731 (2012)

  36. S. Arndt, J.-N. Antons, R. Gupta, K. ur Rehman Laghari, R. Schleicher, S. Möller, T.H. Falk, Subjective quality ratings and physiological correlates of synthesized speech, in 2013 Fifth International Workshop on Quality of Multimedia Experience (QoMEX) (IEEE, Klagenfurt am Wörthersee, 2013), pp. 152–157

  37. A.K. Porbadnigk, M.S. Treder, B. Blankertz, J.-N. Antons, R. Schleicher, S. Möller, G. Curio, K.-R. Müller, Single-trial analysis of the neural correlates of speech quality perception. J. Neural Eng. 10(5), 056003 (2013)

  38. S. Uhrig, A. Perkis, D.M. Behne, Effects of speech transmission quality on sensory processing indicated by the cortical auditory evoked potential. J. Neural Eng. 17(4), 046021 (2020)

Appendix

Table A.3 Post hoc pairwise comparisons for ERP parameter analysis of oddball quality. Asterisks indicate statistical significance. Subtable numbers point to corresponding effect indices (#) in Table 5.2. T, target; D, distractor

(3)

Channel  Oddball  Contrast          Estimate  SEM    t       p
Fz       T        HQ − LQ-Col       −1.002    1.558  −0.643  0.520
Fz       T        HQ − LQ-Noi        4.213    1.555   2.710  0.013
Fz       T        LQ-Col − LQ-Noi    5.215    1.528   3.413  0.002
Fz       D        HQ − LQ-Col       −1.446    1.430  −1.011  0.936
Fz       D        HQ − LQ-Noi       −0.046    1.411  −0.032  0.974
Fz       D        LQ-Col − LQ-Noi    1.401    1.420   0.987  0.936
Cz       T        HQ − LQ-Col       −0.607    1.558  −0.390  0.697
Cz       T        HQ − LQ-Noi       −2.339    1.555  −1.505  0.397
Cz       T        LQ-Col − LQ-Noi   −1.732    1.528  −1.134  0.514
Cz       D        HQ − LQ-Col        2.186    1.430   1.529  0.379
Cz       D        HQ − LQ-Noi        0.991    1.411   0.702  0.800
Cz       D        LQ-Col − LQ-Noi   −1.195    1.419  −0.842  0.800
Pz       T        HQ − LQ-Col       −4.336    1.558  −2.783  0.011
Pz       T        HQ − LQ-Noi       −7.141    1.555  −4.594  <0.001
Pz       T        LQ-Col − LQ-Noi   −2.805    1.528  −1.836  0.066
Pz       D        HQ − LQ-Col       −3.840    1.430  −2.685  0.015
Pz       D        HQ − LQ-Noi        0.382    1.411   0.271  0.787
Pz       D        LQ-Col − LQ-Noi    4.221    1.420   2.974  0.009
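The t values listed for Table A.3 appear consistent with each estimate divided by its standard error (SEM). The following is a minimal Python sketch of that arithmetic check, using the three Fz/target rows copied from the table. The function name `t_ratio` and the tolerance are illustrative assumptions and are not part of the original analysis; the chapter's notes point to R packages (nlme, multcomp) rather than Python. The p values are not recomputed here, since the adjustment method is not stated in this section.

```python
# Rows copied from Table A.3 (channel Fz, target oddballs):
# contrast label, estimate, SEM, reported t.
rows = [
    ("HQ - LQ-Col", -1.002, 1.558, -0.643),
    ("HQ - LQ-Noi", 4.213, 1.555, 2.710),
    ("LQ-Col - LQ-Noi", 5.215, 1.528, 3.413),
]


def t_ratio(estimate: float, sem: float) -> float:
    """Return the t statistic as the estimate divided by its standard error."""
    return estimate / sem


for contrast, estimate, sem, reported_t in rows:
    recomputed = t_ratio(estimate, sem)
    # Estimates and SEMs are rounded to three decimals in the table,
    # so a small tolerance is allowed (hypothetical choice: 0.005).
    agrees = abs(recomputed - reported_t) < 0.005
    print(f"{contrast:<16} t = {recomputed:7.3f} (reported {reported_t:7.3f}) agrees: {agrees}")
```

The same check can be applied to any other row of the table by appending it to `rows`.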

Table A.4 Post hoc pairwise comparisons for ERP parameter analysis of reference quality. Asterisks indicate statistical significance. Subtable numbers point to corresponding effect indices (#) in Table 5.3. T, target; D, distractor

Fig. A.4 Grand average oddball-standard difference waveforms, split up by channel (Fz, Cz, Pz), oddball type (target, distractor), and oddball quality (LQ-Dis, LQ-Noi, LQ-Col), with HQ assigned to constant reference quality. Lines of dots above time axis indicate at which epoch time points each waveform significantly deviates from baseline. The gray area marks the time window for ERP parameter extraction. Error bands represent 95% confidence intervals

Fig. A.5 Grand average oddball-standard difference waveforms, split up by channel (Fz, Cz, Pz), oddball type (target, distractor), and reference quality (HQ, LQ-Noi, LQ-Col), with LQ-Dis assigned to constant oddball quality. Lines of dots above time axis indicate at which epoch time points each waveform significantly deviates from baseline. The gray area marks the time window for ERP parameter extraction. Error bands represent 95% confidence intervals
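The figure captions refer to grand average oddball-standard difference waveforms with 95% confidence bands. As a rough illustration of what such a quantity could look like computationally, the sketch below subtracts each participant's standard waveform from the oddball waveform, averages across participants, and adds a confidence interval per time point. All numbers are synthetic, the function names are hypothetical, and the normal-approximation interval is an assumption; this section does not state how the chapter's error bands were actually computed, and with few participants a t-based critical value would be the more usual choice.

```python
from statistics import NormalDist, mean, stdev

# Synthetic per-participant ERP amplitudes (microvolts) at three time points.
# These values are made up for illustration; they are not data from the study.
oddball = [[1.0, 4.0, 6.0], [0.5, 3.0, 5.0], [1.5, 5.0, 7.0], [1.0, 4.0, 6.0]]
standard = [[1.0, 2.0, 2.0], [0.5, 1.5, 2.0], [1.0, 2.5, 3.0], [1.5, 2.0, 2.0]]


def difference_waveforms(odd, std):
    """Subtract each participant's standard waveform from the oddball waveform."""
    return [[o - s for o, s in zip(po, ps)] for po, ps in zip(odd, std)]


def grand_average_with_ci(waves, level=0.95):
    """Return (mean, lower, upper) per time point across participants."""
    # Normal approximation to the critical value (assumption, see lead-in).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    n = len(waves)
    result = []
    for samples in zip(*waves):  # one tuple of participant values per time point
        m = mean(samples)
        half_width = z * stdev(samples) / n ** 0.5
        result.append((m, m - half_width, m + half_width))
    return result


diffs = difference_waveforms(oddball, standard)
for t, (m, lower, upper) in enumerate(grand_average_with_ci(diffs)):
    print(f"time point {t}: mean {m:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

In this toy setup, a time point whose interval excludes zero would correspond loosely to the "significantly deviates from baseline" markers described in the captions, although the chapter's actual test procedure may differ.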

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Uhrig, S. (2022). Discrimination of Speech Quality Change Along Perceptual Dimensions (Study I). In: Human Information Processing in Speech Quality Assessment. T-Labs Series in Telecommunication Services. Springer, Cham. https://doi.org/10.1007/978-3-030-71389-8_5

  • DOI: https://doi.org/10.1007/978-3-030-71389-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-71388-1

  • Online ISBN: 978-3-030-71389-8

  • eBook Packages: Engineering, Engineering (R0)
