Abstract
Modern classification techniques perform well when the number of training examples exceeds the number of features. If, however, the number of features greatly exceeds the number of training examples, these same techniques can fail. To address this problem, we present a hierarchical Bayesian framework that shares information between features by modeling the similarities between their parameters. We believe this approach is applicable to many sparse, high-dimensional problems, and it is especially relevant to those with both spatial and temporal components. One such problem is classifying fMRI time series, and we present a case study showing that we can classify successfully in this domain with 80,000 original features and only 2 training examples per class.
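The core idea named in the abstract, sharing statistical strength between features so that each per-feature parameter estimate borrows from the others, can be sketched with a simple empirical-Bayes classifier. This is an illustrative stand-in, not the paper's actual model: the function names, the unit-variance assumption, and the single shared hyperprior per class are all assumptions made for the sketch.

```python
import numpy as np

def fit_shrunk_means(X, y, tau2=1.0):
    """Estimate per-class, per-feature means, shrinking each feature's
    estimate toward the pooled across-feature class mean. This mimics a
    hierarchical prior that ties feature parameters together (illustrative
    sketch only; unit noise variance is assumed for simplicity)."""
    sigma2 = 1.0  # assumed noise variance
    means = {}
    for c in np.unique(y):
        Xc = X[y == c]
        n_c = len(Xc)
        raw = Xc.mean(axis=0)   # per-feature MLE of the class mean
        pooled = raw.mean()     # shared hyperprior mean across features
        # Posterior mean under mu_j ~ N(pooled, tau2), x_j ~ N(mu_j, sigma2):
        w = tau2 / (tau2 + sigma2 / n_c)
        means[c] = w * raw + (1 - w) * pooled
    return means

def predict(X, means):
    """Nearest shrunk-mean classification (equal spherical covariances)."""
    classes = sorted(means)
    dists = np.stack([((X - means[c]) ** 2).sum(axis=1) for c in classes])
    return np.array([classes[i] for i in dists.argmin(axis=0)])
```

With only a handful of examples per class, the raw per-feature estimates are dominated by noise; the shrinkage weight `w` pulls them toward the pooled estimate, which is what lets this kind of model remain usable when features vastly outnumber examples.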
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
Cite this paper
Palatucci, M., Mitchell, T.M. (2007). Classification in Very High Dimensional Problems with Handfuls of Examples. In: Kok, J.N., Koronacki, J., Lopez de Mantaras, R., Matwin, S., Mladenić, D., Skowron, A. (eds) Knowledge Discovery in Databases: PKDD 2007. Lecture Notes in Computer Science, vol 4702. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74976-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74975-2
Online ISBN: 978-3-540-74976-9