A Compact and Malleable Sines+Transients+Noise Model for Sound

LEVINE, SCOTT N.; SMITH III, JULIUS O.

doi:10.1007/978-0-387-32576-7_4

Part of the book series: Modern Acoustics and Signal Processing (MASP)

Abstract

This chapter describes an audio representation that supports time- and frequency-scale modifications in a compressed domain. The input audio is segregated into three component representations: sinusoids, transients, and noise. Each component can be individually quantized, time-scaled, and/or pitch-shifted.
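
To make the idea of modifying a parametric component concrete, here is a minimal Python sketch of the sinusoidal part only. It is not the chapter's algorithm: the breakpoint format `(time_s, freq_hz, amp)`, the function names `time_scale`, `pitch_shift`, and `synthesize`, and the 8 kHz sample rate are all assumptions chosen for illustration, and transients and noise are left out entirely. It shows why a parametric track can be time-scaled without changing pitch (only the breakpoint times move) and pitch-shifted without changing duration (only the frequencies move).

```python
import math

# Hypothetical track format (an assumption, not the chapter's):
# a list of (time_s, freq_hz, amp) breakpoints for one sinusoid.

def time_scale(track, factor):
    """Stretch breakpoint times by `factor`; frequencies stay untouched."""
    return [(t * factor, f, a) for (t, f, a) in track]

def pitch_shift(track, ratio):
    """Scale breakpoint frequencies by `ratio`; times stay untouched."""
    return [(t, f * ratio, a) for (t, f, a) in track]

def synthesize(track, sample_rate=8000):
    """Render one track by linearly interpolating frequency and amplitude
    between breakpoints and accumulating phase sample by sample."""
    out = []
    phase = 0.0
    for (t0, f0, a0), (t1, f1, a1) in zip(track, track[1:]):
        n = int(round((t1 - t0) * sample_rate))  # samples in this segment
        for i in range(n):
            x = i / n  # position within the segment, 0 <= x < 1
            freq = f0 + (f1 - f0) * x
            amp = a0 + (a1 - a0) * x
            out.append(amp * math.sin(phase))
            phase += 2.0 * math.pi * freq / sample_rate
    return out

# A 0.5 s, 440 Hz tone with a short attack and a linear decay.
track = [(0.0, 440.0, 0.0), (0.1, 440.0, 1.0), (0.5, 440.0, 0.0)]

slow = time_scale(track, 2.0)    # twice as long, same pitch
high = pitch_shift(track, 1.5)   # same length, a fifth higher

original = synthesize(track)
stretched = synthesize(slow)
shifted = synthesize(high)
```

In this sketch the stretched tone has twice as many samples as the original while every breakpoint frequency is still 440 Hz, and the shifted tone keeps the original length. A full system in the spirit of the abstract would also need separate handling for the transient and noise components, which this example does not attempt.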

Editor information

Editors and Affiliations

James W. Beauchamp
Professor Emeritus, School of Music, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA

About this chapter

Cite this chapter

LEVINE, S.N., SMITH III, J.O. (2007). A Compact and Malleable Sines+Transients+Noise Model for Sound. In: Beauchamp, J.W. (ed.) Analysis, Synthesis, and Perception of Musical Sounds. Modern Acoustics and Signal Processing. Springer, New York, NY. https://doi.org/10.1007/978-0-387-32576-7_4

DOI: https://doi.org/10.1007/978-0-387-32576-7_4
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-32496-8
Online ISBN: 978-0-387-32576-7
eBook Packages: Physics and Astronomy (R0)
