Name: Speech and Audio Processing for Coding, Enhancement and Recognition
ISBN: 978-1-4939-1456-2

Overview

Editors:

Tokunbo Ogunfunmi,
Roberto Togneri,
Madihally (Sim) Narasimha

Tokunbo Ogunfunmi
1. Dept. of Electrical Engineering, Santa Clara University, Santa Clara, USA

Roberto Togneri
1. School of EE&C Engineering, The University of Western Australia, Crawley, Australia

Madihally (Sim) Narasimha
1. Qualcomm Inc., Santa Clara, USA

Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition
Enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research
Discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) Networks
Presents an overview of recent developments in conversational speech coding technologies, important new algorithmic advances, and recent standardization activities in ITU-T, 3GPP, 3GPP2, MPEG and IETF that offer a significantly improved user experience during voice calls on existing and future communication systems
Presents an overview of ensemble learning efforts based on different machine learning techniques that have emerged in automatic speech recognition in recent years
Emphasizes signal processing for efficient time-domain and spectral-domain representations, reduction of noise, channel and session variabilities, extraction of temporal and spectral features for recognition and modeling
Informs readers of the latest research and developments in advanced statistical estimation and deep neural networks for speech recognition
Presents readers with the architectural framework and key approaches involved in the “hot” research areas of emotion recognition and speaker diarization systems
Provides readers with a richer view of state-of-the-art research in speech enhancement arising from novel multi-microphone and time-frequency solutions
Includes supplementary material: sn.pub/extras

20k Accesses
43 Citations
11 Altmetric

Table of contents (10 chapters)

Front Matter

Pages i-x

Overview of Speech and Audio Coding
1. Front Matter
  
  Pages 1-1
2. From “Harmonic Telegraph” to Cellular Phones
  
  Bishnu S. Atal
  
  Pages 3-17
3. Challenges in Speech Coding Research
  
  Jerry D. Gibson
  
  Pages 19-39
4. Scalable and Multi-Rate Speech Coding for Voice-over-Internet Protocol (VoIP) Networks
  
  Tokunbo Ogunfunmi, Koji Seto
  
  Pages 41-74
5. Recent Speech Coding Technologies and Standards
  
  Daniel J. Sinder, Imre Varga, Venkatesh Krishnan, Vivek Rajendran, Stéphane Villette
  
  Pages 75-109
Review and Challenges in Speech, Speaker and Emotion Recognition
1. Front Matter
  
  Pages 111-111
2. Ensemble Learning Approaches in Speech Recognition
  
  Yunxin Zhao, Jian Xue, Xin Chen
  
  Pages 113-152
3. Deep Dynamic Models for Learning Hidden Representations of Speech Features
  
  Li Deng, Roberto Togneri
  
  Pages 153-195
4. Speech Based Emotion Recognition
  
  Vidhyasaharan Sethu, Julien Epps, Eliathamby Ambikairajah
  
  Pages 197-228
5. Speaker Diarization: An Emerging Research
  
  Trung Hieu Nguyen, Eng Siong Chng, Haizhou Li
  
  Pages 229-277
Current Trends in Speech Enhancement
1. Front Matter
  
  Pages 279-279
2. Maximum A Posteriori Spectral Estimation with Source Log-Spectral Priors for Multichannel Speech Enhancement
  
  Yasuaki Iwata, Tomohiro Nakatani, Takuya Yoshioka, Masakiyo Fujimoto, Hirofumi Saito
  
  Pages 281-317
3. Modulation Processing for Speech Enhancement
  
  Kuldip Paliwal, Belinda Schwerin
  
  Pages 319-345

About this book

This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition, with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas.

Editors and Affiliations

Dept. of Electrical Engineering, Santa Clara University, Santa Clara, USA
Tokunbo Ogunfunmi

School of EE&C Engineering, The University of Western Australia, Crawley, Australia
Roberto Togneri

Qualcomm Inc., Santa Clara, USA
Madihally (Sim) Narasimha

About the editors

Tokunbo Ogunfunmi is an Associate Professor of Electrical Engineering and an Associate Dean for Research and Faculty Development at Santa Clara University.

Roberto Togneri is a professor with the School of Electrical, Electronic and Computer Engineering at The University of Western Australia.

Madihally (Sim) Narasimha is a Senior Director of Technology at Qualcomm Inc.

Bibliographic Information

Book Title: Speech and Audio Processing for Coding, Enhancement and Recognition
Editors: Tokunbo Ogunfunmi, Roberto Togneri, Madihally (Sim) Narasimha
DOI: https://doi.org/10.1007/978-1-4939-1456-2
Publisher: Springer New York, NY
eBook Packages: Engineering, Engineering (R0)
Copyright Information: Springer Science+Business Media New York 2015
Hardcover ISBN: 978-1-4939-1455-5
Softcover ISBN: 978-1-4939-4804-8
eBook ISBN: 978-1-4939-1456-2
Edition Number: 1
Number of Pages: X, 345
Number of Illustrations: 47 b/w illustrations, 32 illustrations in colour
Topics: Signal, Image and Speech Processing, User Interfaces and Human Computer Interaction, Multimedia Information Systems
Industry Sectors: Aerospace, Electronics, Engineering, IT & Software, Telecommunications
