Graph Cut Based Segmentation Method for Tamil Continuous Speech

Laxmi Sree, B. R.; Vijaya, M. S.

doi:10.1007/978-981-10-3274-5_21


Part of the book series: Communications in Computer and Information Science (CCIS, volume 679)

Included in the following conference series:

Annual Convention of the Computer Society of India


Abstract

Automatic segmentation of continuous speech plays an important role in building promising acoustic models for a standard continuous speech recognition system. Building such models requires a large amount of segmented data, which is rarely available for many languages. Because there are no industry-standard speech segmentation tools for Indian languages such as Tamil, there is a need for work on Tamil speech segmentation. Here, a segmentation algorithm based on graph cut is proposed for automatic phonetic-level segmentation of continuous speech. Using graph cut for speech segmentation allows the speech to be viewed globally rather than locally, which helps in segmenting vocabulary-independent and speaker-independent speech. The input speech is represented as a graph, and the proposed algorithm is applied to it. Experiments on a speech database comprising utterances from various speakers show that the proposed method outperforms two existing methods: Blind Segmentation using Non-Linear Filtering and Non-Uniform Segmentation using Discrete Wavelet Transform.
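
The abstract does not spell out the graph construction or the cut criterion, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes each speech frame is a node, edge weights are Gaussian similarities between frame feature vectors, and a single boundary is chosen by minimising the normalized cut over contiguous splits. The function names (`affinity_matrix`, `best_contiguous_ncut`), the `sigma` parameter, and the synthetic 3-dimensional "features" are all hypothetical and chosen for illustration.

```python
import numpy as np


def affinity_matrix(features, sigma=1.0):
    """Gaussian similarity between every pair of frames (rows of `features`)."""
    diffs = features[:, None, :] - features[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def best_contiguous_ncut(weights):
    """Find the split of frames into [0, t) and [t, n) with the lowest
    normalized cut: cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V)."""
    n = weights.shape[0]
    degree = weights.sum(axis=1)   # total edge weight attached to each frame
    total = degree.sum()
    best_t, best_cost = None, np.inf
    for t in range(1, n):
        cut = weights[:t, t:].sum()       # weight of edges crossing the boundary
        assoc_a = degree[:t].sum()        # weight from segment A to all frames
        assoc_b = total - assoc_a         # weight from segment B to all frames
        cost = cut / assoc_a + cut / assoc_b
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost


# Synthetic stand-in for frame features (an assumption, not real speech):
# six frames near one value, then four frames near another, with small wobble.
frames = np.vstack([np.zeros((6, 3)), np.full((4, 3), 2.0)])
frames = frames + 0.05 * np.sin(np.arange(10))[:, None]

boundary, cost = best_contiguous_ncut(affinity_matrix(frames))
print(boundary)  # index of the first frame of the second segment
```

Because every pair of frames contributes to the cost, the boundary is chosen from a view of the whole utterance, not from a local change detector alone. Applying the same split recursively to each resulting segment would be one way to obtain more than two segments.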



Author information

Authors and Affiliations

PSGR Krishnammal College for Women, Avinashi Road, Peelamedu, Coimbatore, 641004, Tamilnadu, India
B. R. Laxmi Sree & M. S. Vijaya


Corresponding author

Correspondence to B. R. Laxmi Sree.

Editor information

Editors and Affiliations

Karpagam Academy of Higher Education, Coimbatore, India
S. Subramanian
PSG College of Technology, Coimbatore, India
R. Nadarajan
International Institute of Information Technology, Bengaluru, Karnataka, India
Shrisha Rao
Applied Mathematics and Computational Sciences, PSG College of Technology, Coimbatore, Tamil Nadu, India
Shina Sheen


About this paper

Cite this paper

Laxmi Sree, B.R., Vijaya, M.S. (2016). Graph Cut Based Segmentation Method for Tamil Continuous Speech. In: Subramanian, S., Nadarajan, R., Rao, S., Sheen, S. (eds) Digital Connectivity – Social Impact. CSI 2016. Communications in Computer and Information Science, vol 679. Springer, Singapore. https://doi.org/10.1007/978-981-10-3274-5_21


DOI: https://doi.org/10.1007/978-981-10-3274-5_21
Published: 23 November 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3273-8
Online ISBN: 978-981-10-3274-5
eBook Packages: Computer Science, Computer Science (R0)
