Geometric problems in machine learning

  • David Dobkin
  • Dimitrios Gunopulos
Submitted Contributions
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1148)

Abstract

We present some problems with geometric characterizations that arise naturally in practical applications of machine learning. Our motivation comes from a well-known machine learning problem, that of computing decision trees. Typically one is given a dataset of positive and negative points and must compute a decision tree that fits it. The points lie in a low-dimensional space, and the data are collected experimentally. Most practical solutions use heuristic algorithms.
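For concreteness, the sketch below shows the standard greedy, top-down heuristic this setting suggests: repeatedly pick an axis-parallel threshold split that minimizes misclassifications, and recurse. The names and the misclassification criterion are illustrative assumptions, not the paper's specific algorithm.

    # Minimal sketch of greedy top-down induction with axis-parallel
    # threshold splits (illustrative names; not the paper's algorithm).
    from dataclasses import dataclass
    from typing import List, Optional, Tuple

    Point = Tuple[Tuple[float, ...], bool]  # (coordinates, is_positive)

    @dataclass
    class Node:
        dim: int = -1              # split dimension (-1 marks a leaf)
        threshold: float = 0.0     # go left if x[dim] <= threshold
        label: bool = False        # majority label, used at leaves
        left: Optional["Node"] = None
        right: Optional["Node"] = None

    def misclassified(points: List[Point]) -> int:
        """Errors made by predicting the majority label for this set."""
        pos = sum(1 for _, y in points if y)
        return min(pos, len(points) - pos)

    def build(points: List[Point], depth: int, max_depth: int) -> Node:
        pos = sum(1 for _, y in points if y)
        leaf = Node(label=2 * pos >= len(points))
        if depth == max_depth or misclassified(points) == 0:
            return leaf
        best = None  # (error, dim, threshold)
        for dim in range(len(points[0][0])):
            xs = sorted({x[dim] for x, _ in points})
            for lo, hi in zip(xs, xs[1:]):      # cuts between distinct values
                t = (lo + hi) / 2
                left = [p for p in points if p[0][dim] <= t]
                right = [p for p in points if p[0][dim] > t]
                err = misclassified(left) + misclassified(right)
                if best is None or err < best[0]:
                    best = (err, dim, t)
        if best is None or best[0] >= misclassified(points):
            return leaf                         # no split improves the fit
        _, dim, t = best
        return Node(dim=dim, threshold=t,
                    left=build([p for p in points if p[0][dim] <= t],
                               depth + 1, max_depth),
                    right=build([p for p in points if p[0][dim] > t],
                                depth + 1, max_depth))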

To compute decision trees quickly, one must solve optimization problems in one or more dimensions efficiently. In this paper we give geometric characterizations of these problems and present a selection of algorithms for some of them. These algorithms are motivated by practice and in many cases have been implemented and used. They are also theoretically interesting, typically employing sophisticated geometric techniques. Finally, we present directions for future research.
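One representative one-dimensional problem of this kind is finding a best interval for labeled points on a line: an interval maximizing the number of positive minus negative points it contains, a one-dimensional form of the maximum bichromatic discrepancy. A minimal sketch follows, assuming the points are given as (coordinate, label) pairs; the function name and interface are ours, for illustration.

    # A best interval for labeled points on a line: maximize
    # (#positives - #negatives) inside via a Kadane-style scan.
    # Interface and names are illustrative assumptions.
    from typing import List, Tuple

    def best_interval(points: List[Tuple[float, bool]]) -> Tuple[float, float, int]:
        """points: (coordinate, is_positive) pairs; returns (lo, hi, score).
        A score of 0 means the empty interval is already optimal."""
        pts = sorted(points)                  # O(n log n); the scan is O(n)
        best_score, best_lo, best_hi = 0, 0.0, 0.0
        run, run_start = 0, 0
        for i, (x, is_pos) in enumerate(pts):
            if run <= 0:                      # a negative prefix never helps:
                run, run_start = 0, i         # restart the candidate run here
            run += 1 if is_pos else -1
            if run > best_score:              # record the best run seen so far
                best_score = run
                best_lo, best_hi = pts[run_start][0], x
        return best_lo, best_hi, best_score

Sorting dominates the running time, so an optimal interval is found in O(n log n); running the same scan with the labels flipped handles the complementary color.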

Keywords

Decision Tree, Greedy Algorithm, Dynamic Algorithm, Geometric Characterization, Hypothesis Class

Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • David Dobkin (1)
  • Dimitrios Gunopulos (2)

  1. Computer Science Dept., Princeton University, Princeton, USA
  2. Max-Planck-Institut für Informatik, Im Stadtwald, Saarbrücken, Germany
