Pitch Delay Based Adaptive Steganography for AMR Speech Stream

  • Chen Gong
  • Xiaowei Yi
  • Xianfeng Zhao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11378)


Most existing speech steganographic methods break the continuity of adjacent pitch delays, which degrades their statistical undetectability. This paper presents a novel steganographic scheme for low bit-rate speech streams that resists pitch delay steganalysis. Three measures are adopted to enhance steganographic security. First, the short-term stability of pitch delay and the statistical distribution of adjacent subframes are considered in designing a distortion function. Second, syndrome-trellis codes (STCs) are used to minimize the overall embedding impact under the defined distortion function. Third, a suboptimal pitch delay is searched for to maintain speech quality. Experimental results demonstrate that our scheme achieves a higher level of security, especially at low embedding rates. When the relative embedding rate is 0.2 for a 10.2 kbit/s AMR stream, the test error rate of our method rises by 12.44% compared with the existing algorithm.
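The second measure, minimizing total embedding impact under an additive distortion function, can be illustrated with a toy example. The sketch below is not the scheme from the paper and is not an STC: a real STC finds the minimum with the Viterbi algorithm over a trellis, whereas this sketch simply tries every candidate vector. The parity-check matrix `H`, the cover bits, and the per-position costs are all made-up values for illustration; the abstract does not specify the actual distortion function.

```python
from itertools import product


def syndrome(H, y):
    """Return H*y over GF(2) as a tuple of bits."""
    return tuple(sum(h * b for h, b in zip(row, y)) % 2 for row in H)


def embed_min_distortion(cover_bits, costs, H, message):
    """Exhaustively search for the stego vector y with H*y = message (mod 2)
    that minimizes the summed cost of the positions where y differs from the cover.

    Brute force is only feasible for tiny inputs; STCs solve the same
    minimization efficiently for long covers.
    """
    best, best_cost = None, float("inf")
    for y in product((0, 1), repeat=len(cover_bits)):
        if syndrome(H, y) != tuple(message):
            continue  # y does not carry the message
        cost = sum(c for c, x, b in zip(costs, cover_bits, y) if x != b)
        if cost < best_cost:
            best, best_cost = y, cost
    return best, best_cost


# Hypothetical toy data: one embeddable bit per subframe and an assumed
# cost per position (a low cost means a change there is less detectable).
cover = (1, 0, 1, 1, 0, 0)
costs = (5.0, 1.0, 4.0, 0.5, 3.0, 2.0)
H = (
    (1, 1, 0, 1, 0, 0),
    (0, 1, 1, 0, 1, 0),
    (0, 0, 1, 1, 0, 1),
)
message = (1, 0, 1)

stego, total = embed_min_distortion(cover, costs, H, message)
print(stego, total)  # the receiver recovers the message as syndrome(H, stego)
```

The receiver needs only `H` to extract the message, which is why the sender is free to choose which positions to change according to the cost assignment.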


Keywords: Adaptive steganography · Speech steganography · ACELP · Pitch delay · Adaptive multi-rate



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
  2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
