Multi-View AAM Fitting and Construction

Abstract

Active Appearance Models (AAMs) are generative, parametric models that have been used successfully to model deformable objects such as human faces. The original AAM formulation was 2D, but AAMs have recently been extended to include a 3D shape model. A variety of single-view algorithms exist for fitting and constructing 3D AAMs, but multi-view algorithms have not yet been studied. In this paper we present multi-view algorithms for both fitting and constructing 3D AAMs.

Fitting an AAM to an image consists of minimizing the error between the input image and the closest model instance, i.e., solving a nonlinear optimization problem. In the first part of the paper we describe an algorithm for fitting a single AAM to multiple images captured simultaneously by cameras with arbitrary locations, rotations, and response functions. This algorithm uses the scaled orthographic imaging model adopted by previous authors and, in the process of fitting, computes (calibrates) the scaled orthographic camera matrices. In the second part of the paper we describe an extension of this algorithm that calibrates weak perspective (or full perspective) camera models for each of the cameras. In essence, we use the human face as a (non-rigid) calibration grid. We demonstrate that the performance of this algorithm is roughly comparable to that of a standard algorithm using a calibration grid. In the third part of the paper, we show how camera calibration improves the performance of AAM fitting.
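
As a rough illustration of the imaging model named above, the following Python sketch projects one shared 3D shape into several views with a scaled orthographic camera and sums the squared 2D landmark error over all views. It is a simplified geometric stand-in, not the fitting algorithm of the paper: the appearance (pixel) error is omitted, and the names `project_scaled_orthographic`, `multi_view_error`, `row_i`, `row_j`, and `offset` are assumptions introduced only for this example.

```python
def project_scaled_orthographic(points3d, scale, row_i, row_j, offset):
    """Project 3D points with a scaled orthographic camera.

    row_i and row_j are assumed to be the first two (orthonormal) rows of
    the camera rotation matrix; offset is the 2D image translation.
    """
    projected = []
    for x, y, z in points3d:
        u = scale * (row_i[0] * x + row_i[1] * y + row_i[2] * z) + offset[0]
        v = scale * (row_j[0] * x + row_j[1] * y + row_j[2] * z) + offset[1]
        projected.append((u, v))
    return projected


def multi_view_error(points3d, cameras, observations):
    """Sum of squared 2D landmark errors of one 3D shape over all views."""
    total = 0.0
    for camera, observed in zip(cameras, observations):
        predicted = project_scaled_orthographic(points3d, **camera)
        for (u, v), (ou, ov) in zip(predicted, observed):
            total += (u - ou) ** 2 + (v - ov) ** 2
    return total


# Hypothetical toy data: four 3D points seen by a frontal and a side camera.
shape = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
cameras = [
    {"scale": 2.0, "row_i": (1.0, 0.0, 0.0), "row_j": (0.0, 1.0, 0.0),
     "offset": (10.0, 20.0)},
    {"scale": 1.5, "row_i": (0.0, 0.0, 1.0), "row_j": (0.0, 1.0, 0.0),
     "offset": (5.0, 5.0)},
]
observations = [project_scaled_orthographic(shape, **c) for c in cameras]

# The true shape explains both views exactly; a perturbed shape does not.
perturbed = [(x, y, z + 0.5) for x, y, z in shape]
error_true = multi_view_error(shape, cameras, observations)
error_perturbed = multi_view_error(perturbed, cameras, observations)
```

Because a single set of 3D shape parameters must explain every view at once, a depth error that is invisible to the frontal camera still raises the summed error through the side camera.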

A variety of non-rigid structure-from-motion algorithms, both single-view and multi-view, have been proposed that can be used to construct the 3D non-rigid shape model corresponding to a 2D AAM. In the final part of the paper, we show that constructing a 3D face model using non-rigid structure-from-motion suffers from the Bas-Relief ambiguity and may result in a “scaled” (stretched or compressed) model. We outline a robust non-rigid motion-stereo algorithm for calibrated multi-view 3D AAM construction and show how calibrated multi-view motion-stereo can eliminate the Bas-Relief ambiguity and yield face models with higher 3D fidelity.
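
The Bas-Relief ambiguity mentioned above can be illustrated with a small numerical sketch. Under orthographic projection and a small rotation, a shape whose depth is stretched by a factor k, viewed under a correspondingly smaller rotation, produces almost the same image as the true shape. The Python code below is a toy demonstration under these stated assumptions (a rotation about the vertical axis only, hypothetical point coordinates, and helper names invented for this example); it does not reproduce the experiments of the paper.

```python
import math


def rotate_y_and_project(points3d, angle):
    """Rotate points about the vertical (y) axis, then drop z (orthographic)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y) for x, y, z in points3d]


def scale_depth(points3d, k):
    """Stretch (k > 1) or compress (k < 1) the shape along the depth axis."""
    return [(x, y, k * z) for x, y, z in points3d]


def max_image_difference(view_a, view_b):
    """Largest 2D distance between corresponding projected points."""
    return max(math.hypot(ua - ub, va - vb)
               for (ua, va), (ub, vb) in zip(view_a, view_b))


# Hypothetical face-like points (x, y, depth z) and a small 3 degree rotation.
shape = [(-1.0, 0.0, 0.2), (1.0, 0.0, 0.2), (0.0, 0.5, 1.0), (0.0, -1.0, 0.4)]
angle = math.radians(3.0)
k = 2.0

# A smaller rotation that compensates for the stretched depth.
compensating_angle = math.asin(math.sin(angle) / k)

true_view = rotate_y_and_project(shape, angle)
stretched_view = rotate_y_and_project(scale_depth(shape, k), compensating_angle)
naive_view = rotate_y_and_project(scale_depth(shape, k), angle)

# Stretched shape plus compensating rotation is nearly indistinguishable
# from the true view; the same stretched shape at the true angle is not.
compensated_difference = max_image_difference(true_view, stretched_view)
naive_difference = max_image_difference(true_view, naive_view)
```

The compensation works only because the rotation is free to change. When the relative rotation between views is fixed by calibration, as in the calibrated multi-view setting, this freedom is removed, which is the intuition behind using calibrated cameras to resolve the ambiguity.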

Author information

Correspondence to Krishnan Ramnath.

About this article

Cite this article

Ramnath, K., Koterba, S., Xiao, J. et al. Multi-View AAM Fitting and Construction. Int J Comput Vis 76, 183–204 (2008). https://doi.org/10.1007/s11263-007-0050-3

Keywords

  • Active appearance models
  • Multi-view 3D face model construction
  • Multi-view AAM fitting
  • Non-rigid structure-from-motion
  • Motion-stereo
  • Camera calibration