In the previous chapter, we explored how the initialization of the parameters affects the outcome of the model. In this chapter, we will explore ways to address the above issues by
  • Implementing optimization algorithms (mini-batch gradient descent, momentum, RMSprop, and Adam) and checking their convergence.

  • Applying \(\ell_2\)-regularization, dropout regularization, batch normalization, and gradient checking.

  • Adjusting train/dev/test data sets and analyzing bias/variance.

  • Using TensorFlow for deep learning.

