© 2019

Deep Learning for NLP and Speech Recognition


  • A comprehensive resource that builds up from elementary deep learning, text, and speech principles to advanced state-of-the-art neural architectures

  • A ready reference for deep learning techniques applicable to common NLP and speech recognition applications

  • A useful resource on successful architectures and algorithms with essential mathematical insights explained in detail

  • An in-depth reference and comparison of the latest end-to-end neural speech processing approaches

  • A panoramic resource on leading-edge transfer learning, domain adaptation, and deep reinforcement learning architectures for text and speech

  • Practical aspects of using these techniques with tips and tricks essential for real-world applications

  • A hands-on approach to using Python-based deep learning libraries such as Keras, TensorFlow, and PyTorch to apply these techniques in the context of real-world case studies

  • Thirteen case studies with code, data, and configurations across different approaches for NLP and Speech recognition tasks such as Embeddings, Classification, Distributed Representation, Summarization, Machine Translation, Sentiment Analysis, Cross Domain Transfer Learning, Multi-Task NLP, End to End Speech, and Question Answering


Table of contents

  1. Front Matter
    Pages i-xxviii
  2. Machine Learning, NLP, and Speech Introduction

    1. Front Matter
      Pages 1-1
    2. Uday Kamath, John Liu, James Whitaker
      Pages 3-38
    3. Uday Kamath, John Liu, James Whitaker
      Pages 39-86
    4. Uday Kamath, John Liu, James Whitaker
      Pages 87-138
  3. Deep Learning Basics

    1. Front Matter
      Pages 139-139
    2. Uday Kamath, John Liu, James Whitaker
      Pages 141-201
    3. Uday Kamath, John Liu, James Whitaker
      Pages 203-261
    4. Uday Kamath, John Liu, James Whitaker
      Pages 263-314
    5. Uday Kamath, John Liu, James Whitaker
      Pages 315-368
    6. Uday Kamath, John Liu, James Whitaker
      Pages 369-404
  4. Advanced Deep Learning Techniques for Text and Speech

    1. Front Matter
      Pages 407-407
    2. Uday Kamath, John Liu, James Whitaker
      Pages 407-462
    3. Uday Kamath, John Liu, James Whitaker
      Pages 463-493
    4. Uday Kamath, John Liu, James Whitaker
      Pages 495-535
    5. Uday Kamath, John Liu, James Whitaker
      Pages 537-574
    6. Uday Kamath, John Liu, James Whitaker
      Pages 575-613
  5. Back Matter
    Pages 615-621

About this book


With the widespread adoption of deep learning, natural language processing (NLP), and speech applications in many areas (including finance, healthcare, and government), there is a growing need for one comprehensive resource that maps deep learning techniques to NLP and speech and provides insights into using the tools and libraries for real-world applications. Deep Learning for NLP and Speech Recognition explains recent deep learning methods applicable to NLP and speech, provides state-of-the-art approaches, and offers real-world case studies with code to provide hands-on experience.

The book is organized into three parts, aligned with different groups of readers and their expertise. The three parts are:

      Machine Learning, NLP, and Speech Introduction

The first part has three chapters that introduce readers to the fields of NLP, speech recognition, deep learning, and machine learning with basic theory and hands-on case studies using Python-based tools and libraries.

      Deep Learning Basics

The five chapters in the second part introduce deep learning and various topics that are crucial for speech and text processing, including word embeddings, convolutional neural networks, recurrent neural networks, and speech recognition basics. These chapters combine theory, practical tips, and state-of-the-art methods with experiments and analysis that apply the methods discussed to real-world tasks.

      Advanced Deep Learning Techniques for Text and Speech

The third part has five chapters that discuss the latest cutting-edge research in the areas of deep learning that intersect with NLP and speech. Topics including attention mechanisms, memory-augmented networks, transfer learning, multi-task learning, domain adaptation, reinforcement learning, and end-to-end deep learning for speech recognition are covered using case studies.


Deep Learning Architecture, Document Classification, Machine Translation, Language Modeling, Speech Recognition, Natural Language Processing

Authors and affiliations

  1. Digital Reasoning Systems Inc., McLean, USA
  2. Intelluron Corporation, Nashville, USA
  3. Digital Reasoning Systems Inc., McLean, USA

About the authors

Uday Kamath has more than 20 years of experience architecting and building analytics-based commercial solutions. He currently works as the Chief Analytics Officer at Digital Reasoning, one of the leading companies in AI for NLP and Speech Recognition, heading the Applied Machine Learning research group. Most recently, Uday served as the Chief Data Scientist at BAE Systems Applied Intelligence, building machine learning products and solutions for the financial industry, focused on fraud, compliance, and cybersecurity. Uday has previously authored many books on machine learning such as Machine Learning: End-to-End guide for Java developers: Data Analysis, Machine Learning, and Neural Networks simplified and Mastering Java Machine Learning: A Java developer's guide to implementing machine learning and big data architectures. Uday has published many academic papers in different machine learning journals and conferences. Uday has a Ph.D. in Big Data Machine Learning and was one of the first in generalized scaling of machine learning algorithms using evolutionary computing.

John Liu spent the past 22 years managing quantitative research, portfolio management and data science teams. He is currently CEO of Intelluron Corporation, an emerging AI-as-a-service solution company. Most recently, John was head of data science and data strategy as VP at Digital Reasoning. Previously, he was CIO of Spartus Capital, a quantitative investment firm in New York. Prior to that, John held senior executive roles at Citigroup, where he oversaw the portfolio solutions group that advised institutional clients on quantitative investment and risk strategies; at the Indiana Public Employees pension, where he managed the $7B public equities portfolio; at Vanderbilt University, where he oversaw the $2B equity and alternative investment portfolios; and at BNP Paribas, where he managed the US index options and MSCI delta-one trading desks. He is known for his expertise in reinforcement learning applied to investment management and has authored numerous papers and book chapters on topics including natural language processing, representation learning, systemic risk, asset allocation, and EM theory. In 2016, John was named Nashville's Data Scientist of the Year. He earned his B.S., M.S., and Ph.D. in electrical engineering from the University of Pennsylvania and is a CFA Charterholder.

James (Jimmy) Whitaker manages Applied Research at Digital Reasoning. He currently leads deep learning developments in speech analytics in the FinTech space, and has spent the last 4 years building machine learning applications for NLP, Speech Recognition, and Computer Vision. He received his master's in Computer Science from the University of Oxford, where he earned a distinction for his application of machine learning in the field of Steganalysis, after completing undergraduate degrees in Electrical Engineering and Computer Science at Christian Brothers University. Prior to his work in deep learning, Jimmy worked as a concept engineer and risk manager for complex transportation initiatives.
