Group Evaluation: Data Mining for Biometrics

Chapter 7 presented methods for the holistic analysis of biometric systems, while Chap. 8 illustrated how the performance of individual users within a single system can vary. In a similar manner, certain subsets of the user population may consistently have difficulty with the system, while others perform very well. For example, assume that a particular system has a significant goat population (recall that a goat is a user who has trouble matching against their own enrollments). It is possible that each of these people has a unique reason for their poor performance. More likely, however, a few common underlying causes affect whole groups of people. Discovering these factors, and the groups they affect most strongly, is an important part of the analysis of biometric systems that is often neglected.

The following are hypothetical systems that have groups of problem users:
  • All fingerprints can be classified by their overall pattern of ridges and valleys. The main classes are left loop, right loop, whorl, arch, and tented arch. An automated fingerprint identification system (AFIS) may apply a classification algorithm to an input print and match it only against enrolled prints of the same class. This reduces the number of one-to-one comparisons that must be conducted, cutting system load and potentially avoiding false matches. However, fingerprint classification is a difficult problem in its own right, with challenges distinct from those of fingerprint recognition. Consider a system that uses a fingerprint classification algorithm for pre-selection, and further assume that the algorithm often misclassifies whorl inputs as arches. The “whorl” sub-population may then consistently receive low scores, for both genuine and impostor matches, producing a group of phantoms. In this case, features inherent in the physiology of the subgroup are related to its poor system performance.

  • Covert surveillance systems capture images of people without their knowledge. Therefore, unlike many biometric systems, there is very limited control over the behavior of the subjects who pass through the system. Consider a group of users who wear large sunglasses that obscure a significant portion of their face. This will hamper the ability of the face recognition algorithm to correctly identify the individual. In this case, it is a behavioral aspect of the subgroup that leads to poor recognition performance.

  • Consider a face recognition identification system that is installed at several detention centers throughout a country. At each site, detainees are enrolled in the system with a new photo, which is matched against the existing database to ensure they have not been previously enrolled under a different name. Because capture locations vary throughout the country, so do the conditions at each site; some are favorable to face recognition, others unfavorable. For example, imagine that one site has a problem with backlighting, resulting in a disproportionate number of false accepts. In this case, the lamb population is due to environmental factors.
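The pre-selection scheme in the first example can be sketched briefly. This is a minimal illustration, not an actual AFIS implementation: the class labels, toy gallery, and helper function are all hypothetical, chosen only to show how filtering by class shrinks the comparison set and how a misclassified probe never reaches its true mate.

```python
# Hypothetical sketch of classification-based pre-selection in an AFIS.
# The gallery, class labels, and preselect() helper are illustrative only.
CLASSES = ["left_loop", "right_loop", "whorl", "arch", "tented_arch"]

def preselect(probe_class, gallery):
    """Return only gallery entries whose stored class matches the probe's."""
    return [g for g in gallery if g["class"] == probe_class]

# Toy gallery: 10 enrolled prints per class, 50 in total.
gallery = [{"id": f"{c}_{i}", "class": c} for c in CLASSES for i in range(10)]

# A correctly classified whorl probe is compared against 10 prints, not 50.
candidates = preselect("whorl", gallery)
assert len(candidates) == 10

# If the classifier mislabels a whorl probe as "arch", its true mate
# (enrolled under "whorl") is never compared at all -- the whole whorl
# sub-population can suffer this way, producing a group of phantoms.
mislabelled_candidates = preselect("arch", gallery)
assert all(g["class"] == "arch" for g in mislabelled_candidates)
```

The trade-off is the usual one for binning: fewer one-to-one comparisons per probe, at the cost of missed matches whenever the probe and its enrolled mate are assigned different classes.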

As these examples illustrate, there are many potential reasons why a particular group may perform poorly. Large, integrated, full-scale production systems are complex and draw on many sources of data. Each of these sources introduces new factors that may relate to system performance.
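One simple way to begin mining for such problem groups is to partition users by a metadata attribute and flag partitions whose average genuine score is unusually low. The sketch below assumes hypothetical attributes (`site`, `eyewear`), made-up scores, and an arbitrary threshold; real group discovery would use richer methods such as decision trees or other classifiers over many attributes at once.

```python
# Minimal sketch of attribute-based problem-group discovery.
# All attribute names, scores, and the 0.5 threshold are hypothetical.
from collections import defaultdict
from statistics import mean

users = [
    {"site": "A", "eyewear": "none",       "genuine_score": 0.91},
    {"site": "A", "eyewear": "sunglasses", "genuine_score": 0.42},
    {"site": "B", "eyewear": "none",       "genuine_score": 0.88},
    {"site": "B", "eyewear": "sunglasses", "genuine_score": 0.39},
    {"site": "A", "eyewear": "none",       "genuine_score": 0.93},
]

def problem_groups(records, attribute, threshold=0.5):
    """Return attribute values whose mean genuine score falls below threshold."""
    groups = defaultdict(list)
    for r in records:
        groups[r[attribute]].append(r["genuine_score"])
    return {value for value, scores in groups.items() if mean(scores) < threshold}

# Splitting on eyewear isolates the poorly performing subgroup;
# splitting on site does not, because both sites average above threshold.
assert problem_groups(users, "eyewear") == {"sunglasses"}
assert problem_groups(users, "site") == set()
```

Iterating this test over every available attribute, and over combinations of attributes, is essentially what data-mining algorithms such as decision-tree induction automate.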







Copyright information

© Springer-Verlag US 2009
