Evolutionary Biology

Volume 46, Issue 4, pp 303–316

Seeing Distinct Groups Where There are None: Spurious Patterns from Between-Group PCA

  • Andrea Cardini
  • Paul O’Higgins
  • F. James Rohlf
Research Article


Using sampling experiments, we found that, when there are fewer groups than variables, between-groups PCA (bgPCA) may suggest surprisingly distinct differences among groups for data in which none exist. Although this problem has apparently not been noticed before, its causes are easy to understand. A bgPCA captures the g − 1 dimensions of variation among the g group means but, when the number of variables, p, is greater than g − 1, only a fraction of the \(\sum_{i} n_{i} - g\) dimensions of within-group variation (where the \(n_{i}\) are the sample sizes). This distorts the appearance of bgPCA plots because the within-group variation is underrepresented, unless the variables are sufficiently correlated that the total variation can be accounted for with just g − 1 dimensions. The effect is most obvious when sample sizes are small relative to the number of variables, because smaller samples spread out less, but the distortion is present even for large samples. Strong covariance among variables substantially reduces the magnitude of the problem, because it effectively reduces the dimensionality of the data and thus enables a larger proportion of the within-group variation to be accounted for within the (g − 1)-dimensional space of a bgPCA. The distortion will still be relevant, though its strength will vary from case to case depending on the structure of the data (p, g, covariances, etc.). These are important problems for a method mainly designed for the analysis of variation among groups when there are very large numbers of variables and relatively small samples. In such cases, users are likely to conclude that the groups they are comparing are much more distinct than they really are. Having many variables but only small samples is a common problem in fields ranging from morphometrics (as in our examples) to molecular analyses.
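The effect described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it assumes an isotropic normal model with no true group differences, an unweighted PCA of the group means, and a simple between-to-within sum-of-squares ratio as the measure of apparent separation. The function names (bgpca_scores, separation_ratio, simulate_null, null_separation) and the chosen values of p, g and n are hypothetical and serve only this illustration.

```python
import numpy as np


def bgpca_scores(X, labels):
    """Project all specimens onto the principal axes of the group means.

    Assumption: unweighted bgPCA, i.e. each group mean counts equally.
    """
    labels = np.asarray(labels)
    groups = np.unique(labels)
    Xc = X - X.mean(axis=0)  # centre the data on the grand mean
    means = np.vstack([Xc[labels == k].mean(axis=0) for k in groups])
    means_c = means - means.mean(axis=0)  # centre the group means
    # Rows of Vt are the principal axes of the centred group means.
    _, _, Vt = np.linalg.svd(means_c, full_matrices=False)
    n_axes = min(len(groups) - 1, X.shape[1])  # at most g - 1 axes
    return Xc @ Vt[:n_axes].T


def separation_ratio(scores, labels):
    """Between-group over within-group sum of squares of the scores."""
    labels = np.asarray(labels)
    grand = scores.mean(axis=0)
    ss_between = 0.0
    ss_within = 0.0
    for k in np.unique(labels):
        S = scores[labels == k]
        m = S.mean(axis=0)
        ss_between += len(S) * np.sum((m - grand) ** 2)
        ss_within += np.sum((S - m) ** 2)
    return ss_between / ss_within


def simulate_null(p, g=3, n=20, seed=0):
    """Isotropic normal data: g groups of n specimens, no group differences."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((g * n, p))
    labels = np.repeat(np.arange(g), n)
    return X, labels


def null_separation(p, **kwargs):
    """Apparent group separation in bgPCA space for purely random data."""
    X, labels = simulate_null(p, **kwargs)
    return separation_ratio(bgpca_scores(X, labels), labels)


# Same groups and sample sizes, few versus many variables.
ratio_low = null_separation(p=2)     # p equals g - 1: nothing is discarded
ratio_high = null_separation(p=300)  # p far exceeds g - 1
print(f"between/within ratio, p = 2:   {ratio_low:.3f}")
print(f"between/within ratio, p = 300: {ratio_high:.3f}")
```

Under these assumptions, all of the chance variation among the group means falls inside the bgPCA space, whereas only g − 1 of the many within-group dimensions do. A back-of-envelope calculation suggests that the ratio of the expected sums of squares is roughly p/(N − g), with N the total sample size, so the apparent separation should grow with p even though the groups are samples from the same distribution.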


Keywords: Covariance · Geometric morphometrics · Group separation · Isotropic model · Spurious clustering



We are very grateful to Jessica Grisenti, who carefully collected the marmot data for her undergraduate thesis and gave AC permission to use them. We also thank Julien Claude, who reviewed this paper, for his most helpful comments.

Compliance with Ethical Standards

Conflict of interest

The authors declare that they have no conflicts of interest.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Dipartimento di Scienze Chimiche e Geologiche, Università di Modena e Reggio Emilia, Modena, Italy
  2. Centre for Forensic Anthropology, The University of Western Australia, Crawley, Australia
  3. Department of Anthropology and Department of Ecology and Evolution, Stony Brook University, Stony Brook, USA
  4. Department of Archaeology and Hull York Medical School, University of York, Heslington, UK
