Identifying errors in Freesurfer automated skull stripping and the incremental utility of manual intervention
Quality assurance (QA) is vital for ensuring the integrity of processed neuroimaging data used in clinical neuroscience research. Manual QA (visual inspection) of processed brains for cortical surface reconstruction errors is resource-intensive, particularly with large datasets. Several semi-automated QA tools flag subjects for editing quantitatively, based on outlier brain regions. This project had two goals: (1) evaluate the assumption that statistical outliers are related to errors of cortical extension, and (2) examine whether error identification and correction significantly affect the estimation of cortical parameters and established brain-behavior relationships. T1 MPRAGE images (N = 530) of healthy adults were obtained from the NKI-Rockland Sample and reconstructed using Freesurfer 5.3. Visual inspection of T1 images was conducted for: (1) participants (n = 110) with outlier values (z scores beyond ±3 SD) for subcortical and cortical segmentation volumes (outlier group), and (2) a random sample of the remaining participants (n = 110) whose segmentation values did not meet the outlier criterion (non-outlier group). The outlier group had 21% more participants with visual inspection-identified errors than the non-outlier group, with a medium effect size (Φ = 0.22). Nevertheless, a considerable portion of images with errors of cortical extension were found in the non-outlier group (41%). Although nine brain regions significantly changed size from pre- to post-editing (with effect sizes ranging from 0.26 to 0.59), editing did not substantially change the correlations between neurocognitive tasks and brain volumes (ps > 0.05). Statistically based QA, although less resource-intensive, is not accurate enough to supplant visual inspection. We discuss practical implications of our findings to guide resource allocation decisions for image processing.
Keywords: Quality assurance; Automated segmentation statistics; Reconstruction error; Freesurfer
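The outlier criterion and the Φ effect size mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the array layout (participants in rows, regional volumes in columns), and the choice of the sample standard deviation are assumptions made for illustration only.

```python
import math

import numpy as np


def flag_outlier_participants(volumes, threshold=3.0):
    """Flag participants with at least one regional volume beyond +/- threshold SD.

    volumes: 2-D array-like, one row per participant and one column per
    segmentation region (hypothetical layout, not taken from the paper).
    Assumes every region has non-zero variance across participants.
    Returns a boolean array with one entry per participant.
    """
    volumes = np.asarray(volumes, dtype=float)
    mean = volumes.mean(axis=0)
    sd = volumes.std(axis=0, ddof=1)  # sample SD; an assumption
    z_scores = (volumes - mean) / sd
    return (np.abs(z_scores) > threshold).any(axis=1)


def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2x2 table [[a, b], [c, d]].

    For example, rows could be outlier / non-outlier group and columns
    error found / no error found on visual inspection.
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator
```

Under these assumptions, `flag_outlier_participants` would be applied to a table of segmentation volumes to form the outlier group, and `phi_coefficient` would be applied to the counts of participants with and without visually identified errors in each group.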
The authors would like to acknowledge the following people and organizations for their contributions:
The editor, Dr. Andrew Saykin, and the three anonymous reviewers for their thorough review, which has strengthened the quality of this manuscript.
Douglas Greve at the MGH/HST Athinoula A. Martinos Center for Biomedical Imaging for his comments and consultation.
The NKI-Rockland Sample Initiative for providing the data used in these analyses (data collection funded through NIMH BRAINS R01MH094639-01).
The Suffolk University Psychology Department for their support of doctoral students and David Gansler’s Lab, and the contributions of undergraduate students Ms. Paige Kawai and Ms. Leah Pedersen.
Compliance with ethical standards
Conflicts of interest
Abigail B. Waters, Ryan A. Mace, Kayle S. Sawyer, and David A. Gansler declare that they have no conflicting interests. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, and the applicable revisions at the time of the investigation. Informed consent was obtained from all patients for being included in the study.