Abstract
Introduction
This study used unsupervised machine learning (sidClustering with random forests) to identify clusters of behavioral risk factors for bacterial vaginosis (BV). BV is the most common cause of abnormal vaginal discharge and is linked to the acquisition of sexually transmitted infections (STIs) and HIV.
Methods
Participants were 391 cisgender women in Miami, Florida, with a mean age of 30.8 years (SD = 7.81); 41.7% identified as Hispanic, 41.7% as Black, and 44.8% as White. Participants completed measures of demographics and risk behaviors [sexual, medical, and reproductive history, substance use, and intravaginal practices (IVPs)] and underwent collection of vaginal samples; 135 behavioral variables were analyzed. BV was diagnosed using Nugent criteria.
Results
We identified four clusters, and variables were ranked by their importance in distinguishing clusters: Cluster 1: nulliparous women who engaged in IVPs to clean themselves and please sexual partners, and used substances frequently [n = 118 (30.2%)]; Cluster 2: primiparous women who engaged in IVPs using vaginal douches to clean themselves [n = 112 (28.6%)]; Cluster 3: primiparous women who did not use IVPs or substances [n = 87 (22.3%)]; and Cluster 4: nulliparous women who did not use IVPs but used substances [n = 74 (18.9%)]. Cluster membership was associated with BV (p < 0.001). Cluster 2, the cluster of women who used vaginal douches as IVPs, had the highest prevalence of BV (52.7%).
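As a quick arithmetic check on the figures above, the following minimal Python sketch recomputes each cluster's share of the 391 participants from the reported cluster sizes. It also derives the approximate number of women with BV in Cluster 2 from the reported 52.7% prevalence; that count is an inference from the rounded percentage, not a figure stated in the abstract, and the variable names are illustrative only.

```python
# Cluster sizes as reported in the Results paragraph.
cluster_sizes = {1: 118, 2: 112, 3: 87, 4: 74}

# The four clusters should account for all 391 participants.
total = sum(cluster_sizes.values())

# Percentage of the sample in each cluster, rounded to one decimal place.
shares = {k: round(100 * n / total, 1) for k, n in cluster_sizes.items()}

# Reported BV prevalence in Cluster 2 (percent). The implied count below is
# an assumption derived from this rounded percentage, not a reported value.
bv_prevalence_c2 = 52.7
implied_bv_c2 = round(cluster_sizes[2] * bv_prevalence_c2 / 100)

print(total)
print(shares)
print(implied_bv_c2)
```

The recomputed shares agree with the percentages reported for each cluster, and the cluster sizes sum to the full sample.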
Conclusions
Machine learning methods may be particularly useful in identifying specific clusters of high-risk behaviors, in developing interventions intended to reduce BV and IVPs, and ultimately in reducing the risk of HIV infection among women.
Data availability
Data are available from the corresponding author, Maria Luisa Alcaide.
Funding
This work was supported by National Institutes of Health grants to the University of Miami [R01AI138718 to M.L.A.], a Center for AIDS Research grant [P30AI073961 to M.L.A.], and the Center for HIV and Research in Mental Health [P30MH116867 to D.L.J.]. This work was also partially funded by a Ford Foundation Fellowship to V.J.R., administered by the National Academies of Sciences, Engineering, and Medicine; a PEO Scholar Award from the PEO Sisterhood; and NIMH R36MH127838.
Author information
Contributions
VJR: protocol/project development, manuscript writing, data analysis. YP: data analysis. AS: data collection or management, manuscript editing. NFN: data collection or management, manuscript editing. PR: data collection or management, manuscript editing. NRK: protocol/project development. DLJ: protocol/project development. MLA: protocol/project development, manuscript editing.
Ethics declarations
Conflict of interest
All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest beyond research funding related to the work presented in this manuscript.
About this article
Cite this article
Rodriguez, V.J., Pan, Y., Salazar, A.S. et al. Using unsupervised machine learning to classify behavioral risk markers of bacterial vaginosis. Arch Gynecol Obstet 309, 1053–1063 (2024). https://doi.org/10.1007/s00404-023-07360-7
DOI: https://doi.org/10.1007/s00404-023-07360-7