Abstract
Obstructive sleep apnea (OSA) is a widespread condition with debilitating consequences, including death. Diagnosis is a lengthy and expensive process because OSA is a multifactorial disorder, making it necessary to study many different types of data, including DNA sequences, multiple time series, metabolites, airflow in the airway, and shape analysis of the airway and patients’ faces. OSA data are an example of complex, multi-dimensional data whose analysis and interpretation can be challenging and require sophisticated analytic techniques. It may no longer be effective to apply methods from a single discipline, such as statistics, mathematics, or computing science, in isolation. In this article, combining the analyses of three datasets from independent OSA studies, we illustrate the complementary nature of the techniques. Specifically, we apply techniques from statistics, machine learning, geometry, and computational topology to derive automated analytic tools for each data type. Taken together, these techniques provide a sophisticated diagnostic tool. A novel geometric OSA severity index (GSI) is developed using methods from computational geometry. This index measures the volume of the airway obstruction in OSA patients: the lower the GSI value, the more severe the airway obstruction. Persistent homology is employed to extract important information from 28-dimensional polysomnography (PSG) data. Random forests and principal component analysis are used and compared to identify important variables in the PSG, while logistic regression and random forests are used and compared to verify the predictive power of the identified variables. The results indicate that persistent homology can accurately extract important information from PSG, and that the identified variables are meaningful for predicting the obstructive apnea–hypopnea index (AHI).
Cluster analysis is used to identify patterns in the survey information, and random forests are also used to identify the importance of responses to individual questions in the survey questionnaires. The results from all three independent studies are clinically meaningful and can be used as guidance for clinical practitioners.
References
W. Almuhammadi, K. Aboalayon, M. Faezipour, Efficient obstructive sleep apnea classification based on EEG signals, in 11th IEEE Long Island Systems, Applications and Technology Conference (LISAT) (2015). https://doi.org/10.1109/LISAT.2015.7160186
N. Alsufyani, A. Hess, N. Ray, P. Major, Segmentation of the nasal and pharyngeal airway using cone beam computed tomography part I: a new approach. Preprint (2017)
C. Avci, A. Akbaş, Sleep apnea classification based on respiration signals by using ensemble methods. Bio-Med. Mater. Eng. 26, S1703–S1710 (2015)
S.M. Banabilh, A.H. Suzina, S. Dinsuhaimi, A.R. Samsudin, G.D. Singh, Craniofacial obesity in patients with obstructive sleep apnea. Sleep Breath. 13(1), 19–24 (2008)
S. Bozkurt, A. Bostanci, M. Turhan, Can statistical machine learning algorithm help for classification of obstructive sleep apnea severity to optimal utilization of polysomnography resources? Methods Inf. Med. 56(4), 308–318 (2017)
L. Breiman, Random forests. Mach. Learn. 45, 5–32 (2001)
L. Breiman, A. Cutler, A. Liaw, M. Wiener, R package “randomForest” (2015)
S.E. Brietzke, E.S. Katz, D.W. Roberson, Can history and physical examination reliably diagnose pediatric obstructive sleep apnea/hypopnea syndrome? A systematic review of the literature. Otolaryngol. Head Neck Surg. 131(6), 827–832 (2004)
P.J. Brockwell, R.A. Davis, Time Series: Theory and Methods (Springer, Berlin, 2009)
P. Bubenik, Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res. 16, 77–102 (2015)
B. Caffo, M. Diener-West, N.M. Punjabi, J. Samet, A novel approach to prediction of mild obstructive sleep disordered breathing in a population-based sample: the sleep heart health study. Sleep 33(12), 1641–1648 (2013)
G.D.L. Canto, C. Pacheco-Pereira, S. Aydinoz, P.W. Major, C. Flores-Mir, D. Gozal, Diagnostic capability of biological markers in assessment of obstructive sleep apnea: a systematic review and meta-analysis. J. Clin. Sleep Med. 11(1), 27–36 (2015)
F. Chazal, B.T. Fasy, F. Lecci, B. Michel, A. Rinaldo, L. Wasserman, Subsampling methods for persistent homology, in International Conference on Machine Learning, pp. 2143–2151 (2015)
S. Chowdhury, F. Mémoli, Persistent homology of directed networks, in 50th Asilomar Conference on Signals, Systems and Computers (IEEE, Piscataway, 2016), pp. 77–81. https://doi.org/10.1109/ACSSC.2016.7868997
A. Collins, A. Zomorodian, G. Carlsson, L.J. Guibas, A barcode shape descriptor for curve point cloud data. Comput. Graph. 28, 881–894 (2004)
A. Crespo, D. Álvarez, L. Kheirandish-Gozal, G.C. Gutiérrez-Tobal, A. Cerezo-Hernández, D. Gozal, R. Hornero, F. del Campo, Assessment of oximetry-based statistical classifiers as simplified screening tools in the management of childhood obstructive sleep apnea. Sleep Breath (2018). https://doi.org/10.1007/s11325-018-1637-3
A. Cutler, D.R. Cutler, Tree-based methods, in High-Dimensional Data Analysis in Cancer Research. Part of the Series Applied Bioinformatics and Biostatistics in Cancer Research (Springer, New York, 2008), pp. 1–19
D.J. Eckert, D.P. White, A.S. Jordan, A. Malhotra, A. Wellman, Defining phenotypic causes of obstructive sleep apnea: identification of novel therapeutic targets. Am. J. Respir. Crit. Care Med. 188(8), 996–1004 (2013)
H. Edelsbrunner, D. Letscher, A. Zomorodian, Topological persistence and simplification. Discret. Comput. Geom. 28, 511–533 (2002)
H. Edelsbrunner, E. Mücke, Three-dimensional alpha shapes. ACM Trans. Graphics 13(1), 43–72 (1994)
B.T. Fasy, F. Lecci, Confidence sets for persistence diagrams. Ann. Stat. 42, 2301–2339 (2014)
T.K. Ho, Random decision forests, in Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC (IEEE, Piscataway, 1995), pp. 14–16, 278–282
G. James, D. Witten, T. Hastie, R. Tibshirani, An Introduction to Statistical Learning with Applications in R (Springer, New York, 2013)
S. Jeong, W. Kim, S. Sung, Numerical investigation on the flow characteristics and aerodynamic force of the upper airway of patient with obstructive sleep apnea using computational fluid dynamics. Med. Eng. Phys. 29, 637–651 (2007)
A. Jezzini, M. Ayache, A. Ibrahim, L. Elkhansa, ECG classification for sleep apnea detection, in Third International Conference on Advances in Biomedical Engineering (ICABME15) (2015). https://doi.org/10.1109/ICABME.2015.7323312
L. Kaufman, P.J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis (Wiley, New York, 1990)
V. Kovacev-Nikolic, P. Bubenik, D. Nikolić, G. Heo, Using persistent homology and dynamical distances to analyze protein binding. Stat. Appl. Genet. Mol. Biol. 15(1), 19–38 (2016)
Z.C. Lipton, D.C. Kale, C. Elkan, R. Wetzel, Learning to diagnose with LSTM recurrent neural networks. arXiv:1511.03677v7 (2015)
C.L. Marcus, L.J. Brooks, K.A. Draper, D. Gozal, A.C. Halbower, J. Jones, M.S. Schechter, S.H. Sheldon, K. Spruyt, S.D. Ward, C. Lehmann, R. Shiffman, Diagnosis and management of childhood obstructive sleep apnea syndrome. Am. Acad. Pediatr. 130, 576–584 (2012)
B.H. Menze, B.M.L. Kelm, R. Masuch, U. Himmelreich, P. Bachert, W. Petrich, F.A. Hamprecht, A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data. BMC Bioinf. 10(1), 213 (2009). https://doi.org/10.1186/1471-2105-10-213
R.B. Mitchell, S. Garetz, R.H. Moore, C.L. Rosen, C.L. Marcus, E.S. Katz, R. Arens, R.D. Chervin, S. Paruthi, R. Amin, L. Elden, S.S. Ellenberg, S. Redline, The use of clinical parameters to predict obstructive sleep apnea syndrome severity in children: the childhood adenotonsillectomy (CHAT) study randomized clinical trial. JAMA Otolaryngol. Head Neck Surg. 141(2), 130–136 (2015)
MrOS-Visit2-PSG-Manual-of-Procedures.pdf. https://sleepdata.org/datasets/mros
S. Paruthi, C.L. Rosen, R. Wang, J. Weng, C.L. Marcus, R.D. Chervin, J.J. Stanley, E.S. Katz, R. Amin, S. Redline, End-tidal carbon dioxide measurement during pediatric polysomnography: signal quality, association with apnea severity, and prediction of neurobehavioral outcomes. Sleep 38(11), 1719–1726 (2015)
P. Petrov, S.T. Rush, Z. Zhai, C.H. Lee, P.T. Kim, G. Heo, Topological data analysis of Clostridioides difficile infection and fecal microbiota transplantation. arXiv:1707.08774v2 (2017)
S. Redline, Obstructive sleep apnea-hypopnea and incident stroke: the sleep heart health study. Am. J. Respir. Crit. Care Med. 2, 269–277 (2010)
J. Reininghaus, S. Huber, U. Bauer, R. Kwitt, A stable multi-scale kernel for topological machine learning, in Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’15), Boston, MA (2015), pp. 4741–4748
A. Roebuck, G.D. Clifford, Comparison of standard and novel signal analysis approaches to obstructive sleep apnea classification. Front. Bioeng. Biotechnol. 3, 114 (2015)
L. Rokach, O. Maimon, Clustering methods, in Data Mining and Knowledge Discovery Handbook (Springer, Boston, 2005), pp. 321–352
S. Ryali, T. Chen, K. Supekar, V. Menon, Estimation of functional connectivity in fMRI data using stability selection-based sparse partial correlation with elastic net penalty. NeuroImage 59, 3852–3861 (2012)
P. Su, X-R. Ding, Y-T. Zhang, J. Liu, F. Miao, N. Zhao, Long-term blood pressure prediction with deep recurrent neural networks. arXiv:1705.04524v3 (2017)
C. Van Holsbeke, W. Vos, K. Van Hoorenbeeck, A. Boudewyns, R. Salgado, P.R. Verdonck, J. Ramet, J. De Backer, W. De Backer, S.L. Verhulst, Functional respiratory imaging as a tool to assess upper airway patency in children with obstructive sleep apnea. Sleep Med. 14, 433–439 (2013)
V. Varvarigou, I.J. Dahabreh, A. Malhotra, S.N. Kales, A review of genetic association studies of obstructive sleep apnea: field synopsis and meta-analysis. Sleep 34(11), 1461–1468 (2011)
A. Zomorodian, G. Carlsson, Computing persistent homology. Discret. Comput. Geom. 33, 249–274 (2005)
Acknowledgements
The authors would like to thank the Institute for Computational and Experimental Research in Mathematics, the National Science Foundation (NSF-HRD 1500481), and the Association for Women in Mathematics for support, financial and otherwise, of this collaboration. We thank the National Sleep Research Resource for their permission to use the dataset. We would also like to thank the Natural Sciences and Engineering Research Council of Canada, the Women and Children’s Health Research Institute at the University of Alberta for a Seed Grant, and the American Association of Orthodontists Foundation for a Biomedical Research Award. We would like to thank Facundo Mémoli for discussion on persistent homology.
Appendix
This appendix briefly describes polysomnography and its signals. Polysomnography is a multi-parametric test used in the study of sleep and as a diagnostic tool in sleep medicine. The test result is called a polysomnogram, also abbreviated PSG. A PSG is a comprehensive recording of the biophysiological changes that occur during sleep. It is usually performed at night, although in some special cases it can also be done during the daytime. The PSG monitors many body functions during sleep, including brain activity (electroencephalography, or EEG), eye movements (electrooculography, or EOG), muscle activity or skeletal muscle activation (electromyography, or EMG), and heart rhythm (electrocardiography, or ECG). In the 1970s, sleep efficiency and duration, sleep stages, the apnea–hypopnea index, oxygen saturation, carbon dioxide level, sleep stage changes, the spontaneous arousal index, breathing functions, respiratory airflow, and respiratory effort indicators were added to PSG records, together with peripheral pulse oximetry [32]. In short, polysomnography records many time series associated with human sleep and provides rich information about sleep quality; each channel is a time series. Figure 14, taken from the NSRR website, shows what typical PSG data look like. There are several channels in the PSG, and each channel is a time series recorded in units of 10 s. Over the whole sleeping period (often 9.5–10 h), millions of time points are recorded, so each participant’s PSG data form a multivariate time series with millions of time points.
In our study, each of the 100 participants has a PSG recording with 28 signals: electroencephalography (EEG, with four channels: C3, C4, A1, and A2), left outer canthus (LOC), right outer canthus (ROC), electrocardiogram (two signals: ECG1 and ECG2), LEFT LEG1, LEFT LEG2, RIGHT LEG1, RIGHT LEG2, electromyogram (three signals: EMG1, EMG2, and EMG3), airflow via thin catheters placed in front of the nostrils and mouth (AIRFLOW), absence of effort in the thorax (THOR EFFORT), absence of effort in the abdomen (ABDO EFFORT), snoring (SNORE), sum channels (SUM), body position (POSITION), oxygen saturation (OX STATUS), pulse oximetry (PULSE), oxygen level (SpO2), light, heart rate (HRate), plethysmography (Pleth WV), and nasal pressure (NASAL PRES).
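As a bookkeeping aid, the 28 signals listed above can be written down as a simple channel table. The following Python sketch is illustrative only: the group labels (for example "EOG" for LOC and ROC, or "RESPIRATION") and the helper names `all_channels` and `empty_record` are assumptions made here for readability and are not taken from the study or from any PSG file format.

```python
# Channel names as listed in the appendix. The grouping keys are
# hypothetical labels chosen for this sketch, not part of the source data.
PSG_CHANNEL_GROUPS = {
    "EEG": ["C3", "C4", "A1", "A2"],
    "EOG": ["LOC", "ROC"],
    "ECG": ["ECG1", "ECG2"],
    "LEG": ["LEFT LEG1", "LEFT LEG2", "RIGHT LEG1", "RIGHT LEG2"],
    "EMG": ["EMG1", "EMG2", "EMG3"],
    "RESPIRATION": ["AIRFLOW", "THOR EFFORT", "ABDO EFFORT",
                    "SNORE", "SUM", "NASAL PRES"],
    "OXIMETRY": ["OX STATUS", "PULSE", "SpO2", "Pleth WV"],
    "OTHER": ["POSITION", "light", "HRate"],
}


def all_channels():
    """Return the flat list of channel names, in group order."""
    return [name for group in PSG_CHANNEL_GROUPS.values() for name in group]


def empty_record(n_samples):
    """Return one participant's PSG as a dict: channel name -> list of samples.

    Each channel is one time series; together they form the 28-dimensional
    multivariate time series described in the text.
    """
    return {name: [0.0] * n_samples for name in all_channels()}


if __name__ == "__main__":
    channels = all_channels()
    print(len(channels), "channels")
    record = empty_record(n_samples=5)
    print(sorted(record)[:3])
```

Keeping the channel list in one place makes it easy to check that a loaded recording really has all 28 series before any topological or statistical analysis is applied.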
Respiratory events are defined as follows. Respiratory events were scored if they were at least 8 s long, which represents at least 2 missed respiratory cycles at this stage. Obstructive apneas were scored when chest and abdominal efforts were asynchronous and estimated tidal volume was < 25% of baseline, irrespective of associated desaturation. Hypopneas were scored when respiratory efforts were accompanied by a 25–50% reduction in estimated tidal volume and by at least 3% oxyhemoglobin desaturation, or when clearly discernible decreases in estimated tidal volume were associated with similar desaturation. Central apneas (absent effort in both channels) were excluded from sleep-disordered breathing indexes.
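The scoring rules above can be read as a small decision procedure. The Python sketch below is one literal reading of those rules, not the scoring software used in the study. The `CandidateEvent` fields, the label strings, and the function names are hypothetical, and the qualitative clause about "clearly discernible decreases" in tidal volume is deliberately left out because the text gives no numeric threshold for it. The index is computed as events per hour of sleep, which is the usual definition of an apnea–hypopnea index.

```python
from dataclasses import dataclass

MIN_EVENT_SECONDS = 8.0        # events shorter than this are not scored
APNEA_VOLUME_FRACTION = 0.25   # tidal volume below 25% of baseline
HYPOPNEA_REDUCTION = (0.25, 0.50)  # 25-50% reduction in tidal volume
MIN_DESATURATION_PCT = 3.0     # at least 3% oxyhemoglobin desaturation


@dataclass
class CandidateEvent:
    """Hypothetical summary of one candidate respiratory event."""
    duration_s: float
    tidal_volume_fraction: float  # estimated tidal volume / baseline
    effort_present: bool          # effort in chest or abdominal channel
    effort_asynchronous: bool     # chest and abdominal efforts out of phase
    desaturation_pct: float


def score_event(ev: CandidateEvent) -> str:
    """Label an event using the numeric rules stated in the text."""
    if ev.duration_s < MIN_EVENT_SECONDS:
        return "none"
    if not ev.effort_present:
        return "central_apnea"  # absent effort in both channels
    if ev.effort_asynchronous and ev.tidal_volume_fraction < APNEA_VOLUME_FRACTION:
        return "obstructive_apnea"  # irrespective of desaturation
    reduction = 1.0 - ev.tidal_volume_fraction
    low, high = HYPOPNEA_REDUCTION
    if low <= reduction <= high and ev.desaturation_pct >= MIN_DESATURATION_PCT:
        return "hypopnea"
    return "none"


def obstructive_ahi(labels, sleep_hours: float) -> float:
    """Obstructive apneas plus hypopneas per hour of sleep.

    Central apneas are excluded, as in the text.
    """
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    counted = sum(1 for lab in labels if lab in ("obstructive_apnea", "hypopnea"))
    return counted / sleep_hours


if __name__ == "__main__":
    events = [
        CandidateEvent(12.0, 0.10, True, True, 0.0),
        CandidateEvent(10.0, 0.60, True, False, 4.0),
        CandidateEvent(15.0, 0.05, False, False, 5.0),
        CandidateEvent(5.0, 0.10, True, True, 6.0),
    ]
    labels = [score_event(e) for e in events]
    print(labels, obstructive_ahi(labels, sleep_hours=8.0))
```

Under this literal reading, an event with a tidal volume reduction between 50% and 75% and synchronous efforts is not covered by either numeric rule and is returned as "none"; a real scoring protocol would need to say how such cases are handled.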
Copyright information
© 2019 The Author(s) and the Association for Women in Mathematics
About this chapter
Cite this chapter
Heo, G., Leonard, K., Wang, X., Zhou, Y. (2019). Interdisciplinary Approaches to Automated Obstructive Sleep Apnea Diagnosis Through High-Dimensional Multiple Scaled Data Analysis. In: Gasparovic, E., Domeniconi, C. (eds) Research in Data Science. Association for Women in Mathematics Series, vol 17. Springer, Cham. https://doi.org/10.1007/978-3-030-11566-1_4
DOI: https://doi.org/10.1007/978-3-030-11566-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11565-4
Online ISBN: 978-3-030-11566-1
eBook Packages: Mathematics and Statistics (R0)