A Compact and Malleable Sines+Transients+Noise Model for Sound

Chapter in: Analysis, Synthesis, and Perception of Musical Sounds

Part of the book series: Modern Acoustics and Signal Processing (MASP)

Abstract

This chapter describes an audio representation that supports time-scale and frequency-scale modifications in the compressed domain. The input audio is separated into three component representations: sinusoids, transients, and noise. Each component can be quantized, time-scaled, or pitch-shifted independently of the others.

Copyright information

© 2007 Springer

About this chapter

Cite this chapter

Levine, S.N., Smith, J.O., III (2007). A Compact and Malleable Sines+Transients+Noise Model for Sound. In: Beauchamp, J.W. (ed.) Analysis, Synthesis, and Perception of Musical Sounds. Modern Acoustics and Signal Processing. Springer, New York, NY. https://doi.org/10.1007/978-0-387-32576-7_4