A CNN-Based Method for Infant Cry Detection and Recognition

  • Chuan-Yu Chang
  • Lung-Yu Tsai
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 927)

Abstract

Crying is a baby’s primary means of communicating with the outside world. When a baby cries, a novice parent often cannot immediately tell what the baby needs. If parents can accurately determine the cause of a cry, they can better understand the baby’s emotional and physiological changes and needs. In real-world applications, recording devices may also capture sounds that are not produced by a baby. To reduce the burden on the recognition server and improve the accuracy of the classifier, this study proposes converting the recorded audio signal into a two-dimensional spectrogram. A convolutional neural network then determines whether the input spectrogram represents a baby’s cry. Detected cries are finally classified into four categories (pain, hunger, sleepiness, and wet diaper) by additional one-dimensional convolutional neural networks. Experimental results show that the proposed method achieves high cry detection and recognition rates.
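
The conversion of an audio signal into a two-dimensional spectrogram, as described in the abstract, might be sketched as follows. This is only an illustrative short-time Fourier transform in NumPy, not the authors' implementation: the function name `log_spectrogram`, the frame length, the hop size, the Hann window, and the 16 kHz sample rate are all assumptions, since the abstract does not give these settings.

```python
import numpy as np

def log_spectrogram(signal, frame_len=512, hop=256):
    """Return a (frames, frame_len // 2 + 1) log-magnitude spectrogram.

    frame_len and hop are assumed values, not taken from the paper.
    """
    window = np.hanning(frame_len)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One real-input FFT per frame gives the magnitude at each frequency bin.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Log compression; the small constant avoids log(0).
    return np.log(magnitude + 1e-10)

# Hypothetical input: one second of a 500 Hz tone at an assumed 16 kHz rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 500 * t)
spec = log_spectrogram(tone)  # 2-D array: time frames x frequency bins
```

An array of this kind can be treated as a single-channel image and passed to a two-dimensional convolutional network for the cry/non-cry decision.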

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Douliu, Taiwan
