Audio Acquisition, Representation and Storage

  • Francesco Camastra
  • Alessandro Vinciarelli
Part of the Advanced Information and Knowledge Processing book series (AI&KP)


What the reader should know to understand this chapter

  • Basic notions of physics.
  • Basic notions of calculus (trigonometry, logarithms, exponentials, etc.).


Keywords: Audio Signal, Vocal Tract, Basilar Membrane, Mean Opinion Score, Perceptual Quality



Copyright information

© Springer-Verlag London 2015

Authors and Affiliations

  1. Department of Science and Technology, Parthenope University of Naples, Naples, Italy
  2. School of Computing Science and the Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, UK
