Improved pairwise coupling classification with correcting classifiers

  • Miguel Moreira
  • Eddy Mayoraz
Multiple Models for Classification
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1398)


The benefits of decomposing a classification task involving several classes into a set of smaller two-class problems, usually called dichotomies, have been demonstrated on various occasions. Among the many ways of performing this decomposition, Pairwise Coupling is one of the best known. Its principle is to separate a single pair of classes in each binary subproblem, ignoring the remaining ones, which yields a decomposition scheme with as many subproblems as there are pairs of classes in the original task. Pairwise Coupling has so far been used in a variety of applications. In this paper, several ways of recombining the outputs of the classifiers solving the individual subproblems are explored, and an important handicap intrinsic to the approach is exposed: the use of irrelevant information during classification, since each pairwise classifier is also queried on samples belonging to neither of its two classes. A solution to this problem, based on correcting classifiers, is suggested, and it is shown to improve classification accuracy significantly. In addition, a powerful decomposition scheme derived from the proposed correcting procedure is presented.
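To make the decomposition and the correcting procedure concrete, here is a minimal sketch in Python, assuming NumPy arrays and scikit-learn-style estimators; the class name PairwiseCoupling and the choice of logistic regression as base learner are illustrative assumptions, not taken from the paper. Each pair of classes (i, j) gets a pairwise classifier trained only on samples of those two classes, plus a correcting classifier trained to separate {i, j} from all remaining classes, whose output discounts votes cast by pairwise classifiers queried on irrelevant samples:

    # Minimal sketch: pairwise coupling with correcting classifiers.
    # Assumes numpy arrays and scikit-learn estimators; logistic
    # regression stands in for any base binary learner.
    from itertools import combinations

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    class PairwiseCoupling:
        def fit(self, X, y):
            self.classes_ = np.unique(y)
            self.pairs_, self.clfs_, self.correctors_ = [], [], []
            for i, j in combinations(self.classes_, 2):
                mask = (y == i) | (y == j)
                # Pairwise classifier: trained on samples of classes i and j only.
                clf = LogisticRegression().fit(X[mask], (y[mask] == i).astype(int))
                # Correcting classifier: separates {i, j} from the remaining
                # classes, so votes from an irrelevant pair can be discounted.
                cor = LogisticRegression().fit(X, mask.astype(int))
                self.pairs_.append((i, j))
                self.clfs_.append(clf)
                self.correctors_.append(cor)
            return self

        def predict(self, X):
            votes = np.zeros((X.shape[0], len(self.classes_)))
            col = {c: k for k, c in enumerate(self.classes_)}
            for (i, j), clf, cor in zip(self.pairs_, self.clfs_, self.correctors_):
                p_i = clf.predict_proba(X)[:, 1]     # P(class i | sample is i or j)
                weight = cor.predict_proba(X)[:, 1]  # P(sample belongs to {i, j})
                votes[:, col[i]] += weight * p_i
                votes[:, col[j]] += weight * (1.0 - p_i)
            return self.classes_[votes.argmax(axis=1)]

Dropping the weight factor (setting it to 1 for every pair) recovers plain Pairwise Coupling, the baseline the paper improves upon.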


Keywords: classification · decomposition into binary subproblems · pairwise coupling



Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Miguel Moreira (1)
  • Eddy Mayoraz (1)
  1. IDIAP-Dalle Molle Institute of Perceptual Artificial Intelligence, Switzerland
