Using Permutation Tests to Study How the Dimensionality, the Number of Classes, and the Number of Samples Affect Classification Analysis

Al-Rawi, Mohammed Sadeq; Cunha, João Paulo Silva

doi:10.1007/978-3-642-31295-3_5

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7324)

Included in the following conference series:

International Conference Image Analysis and Recognition

Abstract

Permutation tests have been used extensively to estimate the significance of classification. They usually use the test error as a dataset statistic to measure the difference between two or more populations. To estimate the p-value(s), the test error is compared with a set of permuted test errors, usually obtained by permuting the labels of the populations. In this study, we investigate how several dataset factors, e.g., the number of samples, the number of classes, and the dimensionality, may affect the p-value obtained via permutation tests. We performed the analysis using the standard permutation test procedure, which uses the overall test error as the dataset statistic, and compared it with the procedure we recently proposed (doi:10.1016/j.neucom.2011.11.007), which uses the per-class test error as the dataset statistic. We found that permutation tests that use a per-class test error as the dataset statistic are not only more reliable in addressing the null hypothesis but are also highly sensitive to changes in the dataset factors investigated in this work. An important finding of this study is that when the dimensionality is low and the number of classes reaches several, say ten, accuracy well above chance is required to establish significance. For the same low dimensionality, however, accuracy only slightly above chance is adequate to establish significance in a two-class problem.

References

Al-Rawi, M.S., Cunha, J.P.S.: On using permutation tests to estimate the classification significance of functional magnetic resonance imaging. Neurocomputing 82, 224–233 (2012)
Al-Rawi, M.S., Cunha, J.P.S.: Using multivariate pattern analysis to study how specific brain regions respond to different visual stimulation. J. Neurol. 258, 167 (2011)
Detre, G.J., Polyn, S.M., Takerkart, S., Natu, V.S., Benharrosh, M.S., Singer, B.D., Cohen, J.D., Haxby, J.V., Norman, K.A.: The Multi-Voxel Pattern Analysis (MVPA). In: 12th Meeting of the Organization of Human Brain Mapping, Florence, Italy (2006)
Fisher, R.A.: Statistical methods for research workers. Oliver and Boyd (1954)
Golland, P., Fischl, B.: Permutation Tests for Classification: Towards Statistical Significance in Image-Based Studies. In: Taylor, C.J., Noble, J.A. (eds.) IPMI 2003. LNCS, vol. 2732, pp. 330–341. Springer, Heidelberg (2003)
Haxby, J.V., Gobbini, M.I., Furey, M.L., Ishai, A., Schouten, J.L., Pietrini, P.: Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293, 2425–2430 (2001)
Maldjian, J.A., Laurienti, P.J., Kraft, R.A., Burdette, J.H.: An automated method for neuroanatomic and cytoarchitectonic atlas-based interrogation of fMRI datasets. NeuroImage 19, 1233–1239 (2003); WFU Pickatlas, version 2.4
Ojala, M., Garriga, G.C.: Permutation Tests for Studying Classifier Performance. J. Mach. Learn. Res. 11, 1833–1863 (2010)
Struyf, J., Dobrin, S., Page, D.: Combining gene expression, demographic and clinical data in modeling disease: a case study of bipolar disorder and schizophrenia. BMC Genomics 9, 22 (2008)
Trautmann, E., Ray, L.: Mobility characterization for autonomous mobile robots using machine learning. Auton. Robot. 30, 369–383 (2011)

Author information

Authors and Affiliations

IEETA-Instituto de Engenharia Electrónica e Telemática de Aveiro, University of Aveiro, 3810-193, Aveiro, Portugal
Mohammed Sadeq Al-Rawi & João Paulo Silva Cunha
IEETA-Instituto de Engenharia Electrónica e Telemática de Aveiro, Dept. of Electrical and Computer Engineering, Faculty of Engineering, University of Porto, 4200-465, Porto, Portugal
Mohammed Sadeq Al-Rawi & João Paulo Silva Cunha
Brain Imaging Network (ANIFC), 3000-548, Coimbra, Portugal
Mohammed Sadeq Al-Rawi & João Paulo Silva Cunha

Editor information

Editors and Affiliations

Faculty of Engineering, Institute of Biomedical Engineering, University of Porto, Rua Dr. Roberto Frias, 4200-465, Porto, Portugal
Aurélio Campilho
Department of Electrical and Computer Engineering, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
Mohamed Kamel

About this paper

Cite this paper

Al-Rawi, M.S., Cunha, J.P.S. (2012). Using Permutation Tests to Study How the Dimensionality, the Number of Classes, and the Number of Samples Affect Classification Analysis. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2012. Lecture Notes in Computer Science, vol 7324. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31295-3_5

DOI: https://doi.org/10.1007/978-3-642-31295-3_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31294-6
Online ISBN: 978-3-642-31295-3
eBook Packages: Computer Science, Computer Science (R0)
