Abstract
Automatic segmentation of continuous speech plays an important role in building good acoustic models for a standard continuous speech recognition system. Training such models requires a large amount of segmented data, which is rarely available for many languages. Because there are no industry-standard speech segmentation tools for Indian languages such as Tamil, work on Tamil speech segmentation is needed. Here, a segmentation algorithm based on graph cut is proposed for automatic phonetic-level segmentation of continuous speech. Using graph cut for speech segmentation allows speech to be viewed globally rather than locally, which helps in segmenting vocabulary-independent and speaker-independent speech. The input speech is represented as a graph, and the proposed algorithm is applied to it. Experiments on a speech database comprising utterances from various speakers show that the proposed method outperforms two existing methods: blind segmentation using non-linear filtering and non-uniform segmentation using the discrete wavelet transform.
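The abstract does not give the details of the graph construction or of the cut criterion, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes that each speech frame is a graph node, that edge weights come from a Gaussian kernel on hypothetical per-frame feature vectors, and that a boundary is chosen by minimising the normalized cut of Shi and Malik over contiguous splits. The function names, the `sigma` parameter, and the synthetic frames are all illustrative assumptions.

```python
import numpy as np


def affinity_matrix(frames, sigma=1.0):
    """Gaussian similarity between every pair of frames (assumed edge weights)."""
    diffs = frames[:, None, :] - frames[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def best_contiguous_ncut(W):
    """Return the split index t that minimises the normalized cut between
    frames [0, t) and frames [t, n), together with the cut value."""
    n = W.shape[0]
    degree = W.sum(axis=1)          # total connection weight of each node
    total = degree.sum()
    best_t, best_val = None, np.inf
    for t in range(1, n):
        cut = W[:t, t:].sum()       # weight of the edges crossing the boundary
        assoc_a = degree[:t].sum()  # association of the left segment with all nodes
        assoc_b = total - assoc_a   # association of the right segment with all nodes
        val = cut / assoc_a + cut / assoc_b
        if val < best_val:
            best_t, best_val = t, val
    return best_t, best_val


# Synthetic stand-in for per-frame acoustic features: two well-separated
# "phones" of 6 and 4 frames. Real features (e.g. spectral ones) are assumed.
rng = np.random.default_rng(0)
seg_a = rng.normal(loc=0.0, scale=0.1, size=(6, 3))
seg_b = rng.normal(loc=2.0, scale=0.1, size=(4, 3))
frames = np.vstack([seg_a, seg_b])

W = affinity_matrix(frames)
boundary, ncut_value = best_contiguous_ncut(W)
print(boundary)  # index of the first frame of the second segment
```

Because the cut value depends on the weights between all pairs of frames, the boundary decision reflects the whole utterance and not just neighbouring frames, which is one reading of the "global rather than local" view described above. Applying the same search recursively to each resulting segment would be one way to obtain more than two segments.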
Copyright information
© 2016 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Laxmi Sree, B.R., Vijaya, M.S. (2016). Graph Cut Based Segmentation Method for Tamil Continuous Speech. In: Subramanian, S., Nadarajan, R., Rao, S., Sheen, S. (eds) Digital Connectivity – Social Impact. CSI 2016. Communications in Computer and Information Science, vol 679. Springer, Singapore. https://doi.org/10.1007/978-981-10-3274-5_21
DOI: https://doi.org/10.1007/978-981-10-3274-5_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3273-8
Online ISBN: 978-981-10-3274-5
eBook Packages: Computer Science, Computer Science (R0)