Abstract
We propose a unified non-linear approach that offers an efficient closed-form solution to the problem of sparse linear prediction analysis. The approach builds on our previous work on minimizing the weighted l2-norm of the prediction error. The weights place less emphasis on the prediction error around the Glottal Closure Instants (GCIs), where the error is expected to be largest, so that the resulting cost function approaches the ideal l0-norm cost function for sparse residual recovery. The method therefore requires knowledge of the GCIs. In this paper we use our recently developed GCI detection algorithm, which is particularly suitable for this problem because it does not rely on the residuals themselves to detect GCIs. We show that our GCI detection algorithm provides slightly better sparsity properties than a recent powerful GCI detection algorithm. Moreover, because the computational cost of our GCI detection algorithm is quite low, the computational cost of the overall solution is considerably lower.
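The weighted l2-norm idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the weighting function, so the Gaussian-dip weights in `gci_weights` (with its `sigma` and `floor` parameters), the helper `synth_voiced`, and the toy second-order signal are all assumptions made for illustration. Only the general principle (down-weight the error near known GCIs, then solve the weighted least-squares problem in closed form) comes from the text.

```python
import numpy as np

def weighted_lp(x, order, weights):
    """Closed-form weighted least-squares linear prediction.

    Minimizes sum_n w[n] * (x[n] - sum_k a[k] * x[n-k])**2
    over n = order .. len(x)-1 and returns (a, residual).
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(order, len(x))
    # Column k-1 of X holds the delayed samples x[n-k].
    X = np.column_stack([x[n - k] for k in range(1, order + 1)])
    y = x[n]
    w = np.asarray(weights, dtype=float)[n]
    # Weighted normal equations: (X^T W X) a = X^T W y.
    XtW = X.T * w
    a = np.linalg.solve(XtW @ X, XtW @ y)
    return a, y - X @ a

def gci_weights(length, gcis, sigma=2.0, floor=1e-3):
    """Hypothetical weighting: close to 1 away from GCIs, dipping to
    `floor` at each GCI (Gaussian-shaped dip; an assumption, not the
    weighting function used in the paper)."""
    n = np.arange(length)
    w = np.ones(length)
    for g in gcis:
        w -= np.exp(-0.5 * ((n - g) / sigma) ** 2)
    return np.clip(w, floor, 1.0)

def synth_voiced(length, gcis, a_true):
    """Toy voiced signal: unit impulses at the GCIs driving an
    all-pole filter with coefficients a_true."""
    e = np.zeros(length)
    e[list(gcis)] = 1.0
    x = np.zeros(length)
    for n in range(length):
        for k, ak in enumerate(a_true, start=1):
            if n - k >= 0:
                x[n] += ak * x[n - k]
        x[n] += e[n]
    return x

# Toy usage: GCI positions are assumed known (in the paper they come
# from a separate GCI detector).
gcis = [20, 60, 100, 140, 180]
x = synth_voiced(200, gcis, a_true=[1.3, -0.6])
a_hat, residual = weighted_lp(x, order=2, weights=gci_weights(len(x), gcis))
```

In this toy setting the residual is concentrated at the GCI positions, which is the sparse pattern the weighting is meant to encourage. Because the solution only requires solving one small linear system, no iterative convex solver is needed.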
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Khanagha, V., Daoudi, K. (2013). Efficient GCI Detection for Efficient Sparse Linear Prediction. In: Drugman, T., Dutoit, T. (eds) Advances in Nonlinear Speech Processing. NOLISP 2013. Lecture Notes in Computer Science, vol 7911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38847-7_3
DOI: https://doi.org/10.1007/978-3-642-38847-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38846-0
Online ISBN: 978-3-642-38847-7
eBook Packages: Computer Science (R0)