Abstract
Modern classification techniques perform well when the number of training examples exceeds the number of features. If, however, the number of features greatly exceeds the number of training examples, these same techniques can fail. To address this problem, we present a hierarchical Bayesian framework that shares information between features by modeling the similarities between their parameters. We believe this approach is applicable to many sparse, high-dimensional problems, and it is especially relevant to those with both spatial and temporal components. One such problem is classifying fMRI time series, and we present a case study showing that we can classify successfully in this domain with 80,000 original features and only 2 training examples per class.
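The core idea named in the abstract, sharing statistical strength between features so that each per-feature parameter estimate borrows from the others, can be sketched with a simple empirical-Bayes classifier. This is an illustrative stand-in, not the paper's actual model: the function names, the unit-variance assumption, and the single shared hyperprior per class are all assumptions made for the sketch.

```python
import numpy as np

def fit_shrunk_means(X, y, tau2=1.0):
    """Estimate per-class, per-feature means, shrinking each feature's
    estimate toward the pooled across-feature class mean. This mimics a
    hierarchical prior that ties feature parameters together (illustrative
    sketch only; unit noise variance is assumed for simplicity)."""
    sigma2 = 1.0  # assumed noise variance
    means = {}
    for c in np.unique(y):
        Xc = X[y == c]
        n_c = len(Xc)
        raw = Xc.mean(axis=0)   # per-feature MLE of the class mean
        pooled = raw.mean()     # shared hyperprior mean across features
        # Posterior mean under mu_j ~ N(pooled, tau2), x_j ~ N(mu_j, sigma2):
        w = tau2 / (tau2 + sigma2 / n_c)
        means[c] = w * raw + (1 - w) * pooled
    return means

def predict(X, means):
    """Nearest shrunk-mean classification (equal spherical covariances)."""
    classes = sorted(means)
    dists = np.stack([((X - means[c]) ** 2).sum(axis=1) for c in classes])
    return np.array([classes[i] for i in dists.argmin(axis=0)])
```

With only a handful of examples per class, the raw per-feature estimates are dominated by noise; the shrinkage weight `w` pulls them toward the pooled estimate, which is what lets this kind of model remain usable when features vastly outnumber examples.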
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
Cite this paper
Palatucci, M., Mitchell, T.M. (2007). Classification in Very High Dimensional Problems with Handfuls of Examples. In: Kok, J.N., Koronacki, J., Lopez de Mantaras, R., Matwin, S., Mladenić, D., Skowron, A. (eds) Knowledge Discovery in Databases: PKDD 2007. Lecture Notes in Computer Science, vol 4702. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74976-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74975-2
Online ISBN: 978-3-540-74976-9