## Abstract

Computers can perform complex calculations in a short amount of time. However, they cannot compete with humans at tasks such as common-sense reasoning, recognizing people, objects, and sounds, understanding natural language, and learning, categorizing, and generalizing.

Why, then, does the human brain prove superior to conventional computers on these kinds of problems? Can the mechanisms underlying the way our brain works be mimicked in order to produce more efficient machines?

In the field of signal analysis, the aim is to characterize real-world signals in terms of *signal models*, which provide the basis for a theoretical description of a signal processing system. Such models can potentially tell us a great deal about the signal source without requiring access to the source itself.

Therefore, this chapter describes two families of modelling techniques: hidden Markov models (HMMs) and deep neural networks (DNNs). After a theoretical description of each, the algorithms used for their parameter estimation are presented, with a focus on the model structures most widely used in the field of Non-Intrusive Load Monitoring (NILM).
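As an illustration of the kind of computation an HMM supports, the following minimal sketch implements the forward recursion for the likelihood of a discrete observation sequence. The two-state model and all probability values are invented toy numbers, not taken from the chapter:

```python
import numpy as np

# Toy 2-state HMM with 2 output symbols (all values illustrative).
A = np.array([[0.7, 0.3],    # state transition probabilities
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],    # emission probabilities per state
              [0.2, 0.8]])
pi = np.array([0.6, 0.4])    # initial state distribution

def forward_likelihood(obs):
    """Return P(obs | model) via the forward recursion."""
    alpha = pi * B[:, obs[0]]             # initialisation
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]     # induction step
    return alpha.sum()                    # termination

print(forward_likelihood([0, 1, 0]))
```

This recursion sums over all hidden-state paths in time linear in the sequence length, and it is the building block of the Baum–Welch parameter-estimation algorithm mentioned below.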

## Keywords

Machine learning · Hidden Markov model · Baum–Welch algorithm · Deep neural network · Stochastic gradient descent