Face detection and facial expression recognition using simultaneous clustering and feature selection via an expectation propagation statistical learning framework

Multimedia Tools and Applications

Abstract

In this paper, we develop a novel framework that can be used for both face detection (i.e., discriminating faces from non-face patterns) and facial expression recognition. The proposed statistical framework is based on a Dirichlet process mixture of generalized Dirichlet (GD) distributions used to model local binary pattern (LBP) features. Our method is built on nonparametric Bayesian analysis, in which the determination of the number of clusters is sidestepped by assuming an infinite number of mixture components. An unsupervised feature selection scheme is also integrated into the proposed nonparametric framework to improve modeling performance and generalization capabilities. By learning the proposed model with an expectation propagation (EP) inference approach, all the model parameters and feature saliencies can be estimated simultaneously within a single optimization framework. Furthermore, the framework is extended with a localized feature selection scheme which, according to our results, determines the most important facial features more effectively than the global one. The effectiveness and utility of the proposed method are illustrated through extensive empirical results on synthetic data and on two challenging applications: face detection and facial expression recognition.
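
To make the LBP features mentioned above concrete, the following is a minimal sketch of the basic 3×3 LBP operator followed by a normalized 256-bin histogram. It is an illustration only: this section does not state which LBP variant the authors use, and the function name `lbp_histogram`, the neighbour ordering, the `>=` thresholding rule and the normalization are all assumptions.

```python
def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 3x3 LBP codes.

    `image` is a 2-D list of gray levels. Hypothetical helper, not the
    authors' implementation.
    """
    # Neighbour offsets in a fixed clockwise order, starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    # Border pixels are skipped because they lack a full 3x3 neighbourhood.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the center sets its bit.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    # Normalize so the bins sum to one (assumed preprocessing step).
    return [h / total for h in hist] if total else hist


patch = [[9, 0, 0],
         [0, 5, 0],
         [0, 0, 0]]
hist = lbp_histogram(patch)
print(hist.index(max(hist)))  # LBP code of the single interior pixel
```

In practice a face image would typically be divided into regions and the per-region histograms concatenated; that step is omitted here because the section does not describe it.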

Notes

  1. It is worth mentioning that our approach is based on infinite mixture models and works as an unsupervised learning technique. Thus, it may not be applicable to some facial analysis problems, such as face identification.

  2. http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html

  3. http://www.anefian.com/research/face_reco.htm

  4. http://www.vision.caltech.edu/html-files/archive.html

  5. http://www.vision.caltech.edu/html-files/archive.html

  6. http://www.pitt.edu/~jeffcohn/CKandCK+.htm

  7. http://vision.ucsd.edu/~pdollar/

Acknowledgements

The completion of this research was made possible thanks to the Natural Sciences and Engineering Research Council of Canada (NSERC). We would like to thank Profs. Jeffrey Cohn and Kuang-chih Lee for making the Cohn–Kanade database and the extended Yale B face database, respectively, available. We would also like to thank the associate editor and the three reviewers for their helpful comments.

Author information

Corresponding author

Correspondence to Nizar Bouguila.

Appendices

Appendix A: Proof of equation (30)

The partial derivative of \(\ln Z_i\) with respect to the hyperparameter \(a_j^{\setminus i}\) is calculated by

$$ \begin{array}{rll} \nabla_{a_j}^{\setminus i}\ln Z_i &=& \frac{1}{Z_i}\int f_i(\Lambda)\frac{q^{\setminus i}(\Lambda)}{q^{\setminus i}\left(\phi_j^{\setminus i}\right)}\frac{\partial}{\partial a_j^{\setminus i}}q^{\setminus i}\left(\phi_j^{\setminus i}\right) d\Lambda \\ &=&\int \frac{f_i(\Lambda)q^{\setminus i}(\Lambda)}{Z_i}\left[\ln \phi^{\setminus i}_j + \Psi\left(a_j^{\setminus i}+b_j^{\setminus i}\right)-\Psi\left(a_j^{\setminus i}\right)\right]d\Lambda \\ &=&\int \widehat{p}(\Lambda)\left[\ln \phi^{\setminus i}_j + \Psi\left(a_j^{\setminus i}+b_j^{\setminus i}\right)-\Psi\left(a_j^{\setminus i}\right)\right]d\Lambda \\ &=&E_{\widehat{p}}\left[\ln\phi_j\right] + \Psi\left(a_j^{\setminus i}+b_j^{\setminus i}\right) - \Psi\left(a_j^{\setminus i}\right) \\ &=&\Psi\left(a_j^\ast\right)-\Psi\left(a_j^\ast+b_j^\ast\right) + \Psi\left(a_j^{\setminus i}+b_j^{\setminus i}\right) - \Psi\left(a_j^{\setminus i}\right) \end{array} $$
(39)
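
As a sanity check on the derivation above, the sketch below evaluates the bracketed integrand of (39) and its last line numerically. It assumes, as the form of the integrand suggests, that \(q^{\setminus i}(\phi_j)\) is a Beta density with parameters \(a_j^{\setminus i}\) and \(b_j^{\setminus i}\). The function names and the series-based `digamma` are hypothetical helpers introduced for illustration, not part of the paper.

```python
import math


def digamma(x):
    """Digamma function Psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:
        # Shift the argument upward using Psi(x) = Psi(x + 1) - 1/x.
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def log_beta_pdf(phi, a, b):
    """Log-density of Beta(a, b) at phi (assumed form of the cavity factor)."""
    return ((a - 1) * math.log(phi) + (b - 1) * math.log(1 - phi)
            + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))


def grad_a_log_beta(phi, a, b):
    """Bracketed term in (39): derivative of the log-density w.r.t. a."""
    return math.log(phi) + digamma(a + b) - digamma(a)


def grad_ln_Z_a(a_star, b_star, a_cav, b_cav):
    """Last line of (39), using E[ln phi] = Psi(a*) - Psi(a* + b*)."""
    return (digamma(a_star) - digamma(a_star + b_star)
            + digamma(a_cav + b_cav) - digamma(a_cav))


# Compare the closed-form integrand with a central finite difference.
phi, a, b, h = 0.3, 2.5, 4.0, 1e-5
numeric = (log_beta_pdf(phi, a + h, b) - log_beta_pdf(phi, a - h, b)) / (2 * h)
print(abs(numeric - grad_a_log_beta(phi, a, b)) < 1e-6)
```

When the updated parameters equal the cavity parameters, `grad_ln_Z_a` returns zero up to rounding, which matches the last line of (39).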

Appendix B: Calculating \(\nabla_{b_j}^{\setminus i}\ln Z_i\), \(\nabla_{c_1}^{\setminus i}\ln Z_i\), \(\nabla_{\boldsymbol{\mu}_{jl}}^{\setminus i}\ln Z_i\), \(\nabla_{A_{jl}}^{\setminus i}\ln Z_i\), \(\nabla_{\boldsymbol{\rho}_l}^{\setminus i}\ln Z_i\) and \(\nabla_{B_l}^{\setminus i}\ln Z_i\)

We can calculate the partial derivative of \(\ln Z_i\) with respect to the hyperparameter \(b_j^{\setminus i}\) as

$$ \begin{array}{rll} \nabla_{b_j}^{\setminus i}\ln Z_i &=& \frac{1}{Z_i}\int f_i(\Lambda)\frac{q^{\setminus i}(\Lambda)}{q^{\setminus i}\left(\phi_j^{\setminus i}\right)}\frac{\partial}{\partial b_j^{\setminus i}}q^{\setminus i}\left(\phi_j^{\setminus i}\right) d\Lambda \\ &=&\int \frac{f_i(\Lambda)q^{\setminus i}(\Lambda)}{Z_i}\left[\ln \left(1-\phi^{\setminus i}_j\right) + \Psi\left(a_j^{\setminus i}+b_j^{\setminus i}\right)-\Psi\left(b_j^{\setminus i}\right)\right]d\Lambda \\ &=&\int \widehat{p}(\Lambda)\left[\ln \left(1-\phi^{\setminus i}_j\right) + \Psi\left(a_j^{\setminus i}+b_j^{\setminus i}\right)-\Psi\left(b_j^{\setminus i}\right)\right]d\Lambda \\ &=&E_{\widehat{p}}\left[\ln\left(1-\phi_j\right)\right] + \Psi\left(a_j^{\setminus i}+b_j^{\setminus i}\right) - \Psi\left(b_j^{\setminus i}\right) \\ &=&\Psi(b_j^\ast)-\Psi\left(a_j^\ast+b_j^\ast\right) + \Psi\left(a_j^{\setminus i}+b_j^{\setminus i}\right) - \Psi\left(b_j^{\setminus i}\right) \end{array} $$
(40)

Next, the partial derivative of \(\ln Z_i\) with respect to the hyperparameter \(c_1^{\setminus i}\) is calculated by

$$ \begin{array}{rll} \nabla_{c_1}^{\setminus i}\ln Z_i &=& \frac{1}{Z_i}\int f_i(\Lambda)\frac{q^{\setminus i}(\Lambda)}{q^{\setminus i}\left(\epsilon_{l_1}^{\setminus i}\right)}\frac{\partial}{\partial c_1^{\setminus i}}q^{\setminus i}\left(\epsilon_{l_1}^{\setminus i}\right) d\Lambda \\ &=&\int \widehat{p}(\Lambda)\left[\ln \epsilon_{l_1} + \Psi\left(c_1^{\setminus i}+c_2^{\setminus i}\right)-\Psi\left(c_1^{\setminus i}\right)\right]d\Lambda \\ &=&E_{\widehat{p}}\left[\ln \epsilon_{l_1}\right] + \Psi\left(c_1^{\setminus i}+c_2^{\setminus i}\right)-\Psi\left(c_1^{\setminus i}\right) \\ &=&\Psi\left(c_1^{\ast}\right)- \Psi\left(c_1^\ast+c_2^\ast\right) + \Psi\left(c_1^{\setminus i}+c_2^{\setminus i}\right)-\Psi\left(c_1^{\setminus i}\right) \end{array} $$
(41)

Similarly, we can compute \(\nabla_{c_2}^{\setminus i}\ln Z_i\) as

$$ \nabla_{c_2}^{\setminus i}\ln Z_i = \Psi\left(c_2^{\ast}\right)- \Psi\left(c_1^\ast+c_2^\ast\right) + \Psi\left(c_1^{\setminus i}+c_2^{\setminus i}\right)-\Psi\left(c_2^{\setminus i}\right) $$
(42)

Then, the partial derivative of \(\ln Z_i\) with respect to the hyperparameter \(\boldsymbol{\mu}_{jl}^{\setminus i}\) is given by

$$ \begin{array}{rll} \nabla_{\boldsymbol{\mu}_{jl}}^{\setminus i}\ln Z_i &=& \frac{1}{Z_i}\int f_i(\Lambda)\frac{q^{\setminus i}(\Lambda)}{q^{\setminus i}\left(\theta_{jl}^{\setminus i}\right)}\frac{\partial}{\partial \boldsymbol{\mu}_{jl}^{\setminus i}}q^{\setminus i}\left(\theta_{jl}^{\setminus i}\right) d\Lambda \\ &=&\int \frac{f_i(\Lambda)q^{\setminus i}(\Lambda)}{Z_i}\left[A_{jl}^{\setminus i}\theta_{jl}^{\setminus i}-A_{jl}^{\setminus i}\boldsymbol{\mu}_{jl}^{\setminus i}\right]d\Lambda \\ &=&A_{jl}^{\setminus i}E_{\widehat{p}}\left[\theta_{jl}\right]-A_{jl}^{\setminus i}\boldsymbol{\mu}_{jl}^{\setminus i} \\ &=&A_{jl}^{\setminus i}\boldsymbol{\mu}_{jl}^\ast-A_{jl}^{\setminus i}\boldsymbol{\mu}_{jl}^{\setminus i} \end{array} $$
(43)
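
The integrand in (43) is the gradient of a Gaussian log-density with respect to its mean. The sketch below checks this by finite differences under two assumptions that are inferred from the appendix rather than stated in it: that \(q^{\setminus i}(\theta_{jl})\) is a two-dimensional Gaussian (suggested by the sum over \(d=1,2\) in (44)) and that \(A_{jl}^{\setminus i}\) is its precision matrix. All function names are hypothetical.

```python
import math


def log_gauss2(theta, mu, A):
    """Log-density of a 2-D Gaussian with mean mu and symmetric precision A."""
    d0, d1 = theta[0] - mu[0], theta[1] - mu[1]
    det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    quad = (d0 * (A[0][0] * d0 + A[0][1] * d1)
            + d1 * (A[1][0] * d0 + A[1][1] * d1))
    return 0.5 * math.log(det_A) - math.log(2 * math.pi) - 0.5 * quad


def grad_mu_log_gauss2(theta, mu, A):
    """Integrand of (43): A (theta - mu)."""
    d0, d1 = theta[0] - mu[0], theta[1] - mu[1]
    return [A[0][0] * d0 + A[0][1] * d1, A[1][0] * d0 + A[1][1] * d1]


def grad_ln_Z_mu(mu_star, mu_cav, A_cav):
    """Last line of (43): cavity precision times (mu* minus cavity mean)."""
    return grad_mu_log_gauss2(mu_star, mu_cav, A_cav)


# Central finite difference in the first mean coordinate.
theta, mu = [0.4, -1.2], [0.1, 0.3]
A = [[2.0, 0.5], [0.5, 1.5]]
h = 1e-5
numeric = (log_gauss2(theta, [mu[0] + h, mu[1]], A)
           - log_gauss2(theta, [mu[0] - h, mu[1]], A)) / (2 * h)
print(abs(numeric - grad_mu_log_gauss2(theta, mu, A)[0]) < 1e-6)
```

Because the log-density is quadratic in the mean, the central difference agrees with the closed form up to rounding error.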

We can also compute the partial derivative of \(\ln Z_i\) with respect to the hyperparameter \(A_{jl}^{\setminus i}\) as

$$ \begin{array}{rll} \nabla_{A_{jl}}^{\setminus i}\ln Z_i &=&\frac{1}{Z_i}\int f_i(\Lambda)\frac{q^{\setminus i}(\Lambda)}{q^{\setminus i}\left(\theta_{jl}^{\setminus i}\right)}\frac{\partial}{\partial A_{jl}^{\setminus i}}q^{\setminus i}\left(\theta_{jl}^{\setminus i}\right) d\Lambda \\ &=& \int\frac{f_i(\Lambda)q^{\setminus i}(\Lambda)}{Z_i} \frac{1}{2}\left\{\left|\left(A_{jl}^{\setminus i}\right)^{-1}\right|-\sum_{d=1}^2\left[\left(\theta_{jld}^{\setminus i}\right)^2-2\theta_{jld}^{\setminus i}\mu_{jld}^{\setminus i}+\left(\mu_{jld}^{\setminus i}\right)^2\right] \right\}d\Lambda \\ &=& \int \widehat{p}(\Lambda) \frac{1}{2}\left\{\left|\left(A_{jl}^{\setminus i}\right)^{-1}\right|-\sum_{d=1}^2\left[\left(\theta_{jld}^{\setminus i}\right)^2-2\theta_{jld}^{\setminus i}\mu_{jld}^{\setminus i}+\left(\mu_{jld}^{\setminus i}\right)^2\right] \right\}d\Lambda \\ &=& \frac{1}{2}\left\{\left|\left(A_{jl}^{\setminus i}\right)^{-1}\right|-\sum_{d=1}^2\left[\left(\mu_{jld}^\ast\right)^2-2\mu_{jld}^\ast\mu_{jld}^{\setminus i}+\left(\mu_{jld}^{\setminus i}\right)^2\right] \right\} \end{array} $$
(44)

We then compute \(\nabla_{\boldsymbol{\rho}_{l}}^{\setminus i}\ln Z_i\) and \(\nabla_{B_{l}}^{\setminus i}\ln Z_i\) in the same way as \(\nabla_{\boldsymbol{\mu}_{jl}}^{\setminus i}\ln Z_i\) and \(\nabla_{A_{jl}}^{\setminus i}\ln Z_i\), which gives

$$ \nabla_{\boldsymbol{\rho}_{l}}^{\setminus i}\ln Z_i = B_{l}^{\setminus i}\boldsymbol{\rho}_{l}^\ast-B_{l}^{\setminus i}\boldsymbol{\rho}_{l}^{\setminus i} $$
(45)
$$ \nabla_{B_{l}}^{\setminus i}\ln Z_i = \frac{1}{2}\left\{\left|\left(B_{l}^{\setminus i}\right)^{-1}\right|-\sum_{d=1}^2\left[\left(\rho_{ld}^\ast\right)^2-2\rho_{ld}^\ast\rho_{ld}^{\setminus i}+\left(\rho_{ld}^{\setminus i}\right)^2\right] \right\} $$
(46)

About this article

Cite this article

Fan, W., Bouguila, N. Face detection and facial expression recognition using simultaneous clustering and feature selection via an expectation propagation statistical learning framework. Multimed Tools Appl 74, 4303–4327 (2015). https://doi.org/10.1007/s11042-013-1548-z

  • DOI: https://doi.org/10.1007/s11042-013-1548-z
