Abstract
Continuous improvement in medical imaging techniques allows the acquisition of higher-resolution images. When these are used in a predictive setting, a greater number of explanatory variables are potentially related to the dependent variable (the response). Meanwhile, the number of acquisitions per experiment remains limited. In such high dimension/small sample size setting, it is desirable to find the explanatory variables that are truly related to the response while controlling the rate of false discoveries. To achieve this goal, novel multivariate inference procedures, such as knockoff inference, have been proposed recently. However, they require the feature covariance to be well-defined, which is impossible in high-dimensional settings. In this paper, we propose a new algorithm, called Ensemble of Clustered Knockoffs, that allows to select explanatory variables while controlling the false discovery rate (FDR), up to a prescribed spatial tolerance. The core idea is that knockoff-based inference can be applied on groups (clusters) of voxels, which drastically reduces the problem’s dimension; an ensembling step then removes the dependence on a fixed clustering and stabilizes the results. We benchmark this algorithm and other FDR-controlling methods on brain imaging datasets and observe empirical gains in sensitivity, while the false discovery rate is controlled at the nominal level.
T.-B. Nguyen and J.-A. Chevalier—Equal contribution.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Abraham, A., et al.: Machine learning for neuroimaging with scikit-learn. Front. Neuroinform. 8, 14 (2014)
Barber, R.F., Candès, E.J.: Controlling the false discovery rate via knockoffs. Ann. Stat. 43(5), 2055–2085 (2015). http://arxiv.org/abs/1404.5609
Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B (Methodol.) 57(1), 289–300 (1995). https://www.jstor.org/stable/2346101
Bühlmann, P., Rütimann, P., van de Geer, S., Zhang, C.H.: Correlated variables in regression: clustering and sparse estimation. J. Stat. Plan. Inference 143(11), 1835–1858 (2013)
Candès, E., Fan, Y., Janson, L., Lv, J.: Panning for gold: model-x knockoffs for high dimensional controlled variable selection. J. R. Stat. Soc.: Ser. B (Stat. Methodol.) 80(3), 551–577
Chevalier, J.-A., Salmon, J., Thirion, B.: Statistical inference with ensemble of clustered desparsified lasso. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) MICCAI 2018. LNCS, vol. 11070, pp. 638–646. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00928-1_72
Gaonkar, B., Shinohara, R.T., Davatzikos, C.: Interpreting support vector machine models for multivariate group wise analysis in neuroimaging. Med. Image Anal. 24(1), 190–204 (2015)
Haxby, J.V., Gobbini, M.I., Furey, M.L., Ishai, A., Schouten, J.L., Pietrini, P.: Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293(5539), 2425–2430 (2001)
Hoyos-Idrobo, A., Varoquaux, G., Schwartz, Y., Thirion, B.: Frem scalable and stable decoding with fast regularized ensemble of models. NeuroImage 180, 160–172 (2018). http://www.sciencedirect.com/science/article/pii/S1053811917308182, New advances in encoding and decoding of brain signals
Marcus, D.S., Wang, T.H., Parker, J., Csernansky, J.G., Morris, J.C., Buckner, R.L.: Open access series of imaging studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J. Cogn. Neurosci. 19(9), 1498–1507 (2007)
Meinshausen, N., Meier, L., Bhlmann, P.: P-values for high-dimensional regression. J. Am. Stat. Assoc. 104(488), 1671–1681 (2009)
Weichwald, S., Meyer, T., zdenizci, O., Schlkopf, B., Ball, T., Grosse-Wentrup, M.: Causal interpretation rules for encoding and decoding models in neuroimaging. NeuroImage 110, 48–59 (2015). http://www.sciencedirect.com/science/article/pii/S105381191500052X
Zhang, C.H., Zhang, S.S.: Confidence intervals for low dimensional parameters in high dimensional linear models. J. R. Stat. Soc.: Ser. B (Stat. Methodol.) 76(1), 217–242 (2014)
Acknowledgement
This research is supported by French ANR (project FASTBIG ANR-17-CE23-0011) and Labex DigiCosme (project ANR-11-LABEX-0045-DIGICOSME). The authors would like to thank Sylvain Arlot and Matthieu Lerasle for fruitful discussions and helpful comments.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, TB., Chevalier, JA., Thirion, B. (2019). ECKO: Ensemble of Clustered Knockoffs for Robust Multivariate Inference on fMRI Data. In: Chung, A., Gee, J., Yushkevich, P., Bao, S. (eds) Information Processing in Medical Imaging. IPMI 2019. Lecture Notes in Computer Science(), vol 11492. Springer, Cham. https://doi.org/10.1007/978-3-030-20351-1_35
Download citation
DOI: https://doi.org/10.1007/978-3-030-20351-1_35
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20350-4
Online ISBN: 978-3-030-20351-1
eBook Packages: Computer ScienceComputer Science (R0)