Venous thromboembolism in COVID-19 patients and prediction model: a multicenter cohort study

Abstract

Background

Patients with COVID-19 infection are commonly reported to have an increased risk of venous thrombosis. The choice of antithrombotic agents and doses is currently being studied in randomized controlled trials and retrospective studies. There is a need for individualized risk stratification of venous thromboembolism (VTE) to assist clinicians in decision-making on anticoagulation. We sought to identify the risk factors for VTE in COVID-19 patients, which could help physicians prevent, identify early, and manage VTE in hospitalized COVID-19 patients and improve their clinical outcomes.

Method

This multicenter, retrospective study used a registry database of four main health systems in Southeast Michigan, United States. We compiled comprehensive data for adult COVID-19 patients who were admitted between 1st March 2020 and 31st December 2020. Four models (random forest, multiple logistic regression, multiple linear regression, and decision tree) were built on the primary outcome of in-hospital acute deep vein thrombosis (DVT) and pulmonary embolism (PE) and tested for performance. The study also reported hospital length of stay (LOS) and intensive care unit (ICU) LOS in the VTE and non-VTE patients. The four models were assessed using the area under the receiver operating characteristic curve and the confusion matrix.

Results

The cohort included 3531 admissions, of which 3526 had discharge diagnoses; 6.68% of patients developed acute VTE (N = 236). The VTE group had a longer hospital and ICU LOS than the non-VTE group (hospital LOS 12.2 days vs. 8.8 days, p < 0.001; ICU LOS 3.8 days vs. 1.9 days, p < 0.001). In the VTE group, 9.8% of patients required more advanced oxygen support, compared with 2.7% of patients in the non-VTE group (p < 0.001). Among the four models, the random forest model had the best performance. The model suggested that blood pressure, electrolytes, renal function, hepatic enzymes, and inflammatory markers were predictors of in-hospital VTE in COVID-19 patients.

Conclusions

Patients with COVID-19 have a high risk for VTE, and patients who developed VTE had a prolonged hospital and ICU stay. This random forest prediction model for VTE in COVID-19 patients identifies predictors which could aid physicians in making a clinical judgment on empirical dosages of anticoagulation.

Introduction

Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has been causing COVID-19 illness globally since December 2019, with more than 310 million people infected and more than five million deaths reported as of 1st Jan 2022 [1]. The common manifestations of COVID-19 include fever, cough, dyspnea, myalgia, fatigue, and diarrhea. Primarily, COVID-19 infection results in respiratory complications. However, it is evident that COVID-19 infection may be associated with a hyper-coagulable state, which leads to microvascular and macrovascular arterial and venous thromboembolism (VTE) [2, 3].

The incidence of VTE complications in COVID-19 patients ranged from 1.7 to 16.5% in 35 observational studies reported from around the world (total N = 9249) [4]. Researchers postulated that a severely activated inflammatory response to COVID-19 infection causes thrombo-inflammation through mechanisms such as cytokine storm, complement activation, and endotheliosis [5]. In addition, certain studies reported findings of microthrombi in autopsies of COVID-19 patients [6]. Recent retrospective studies proposed several risk factors associated with higher mortality and higher severity of COVID-19, including inflammatory markers such as interleukin-6 (IL-6), D-dimer, ferritin, and lactate dehydrogenase (LDH) [7, 8]. Moreover, many studies also showed that VTE in COVID-19 is associated with severity of infection and mortality [8]. Hence, it is critical for physicians to identify the risk factors for the prevention and early management of VTE.

Most of the prediction models built for COVID-19 patients predict prognosis [9,10,11], with only a few models predicting VTE [12,13,14]. These models were built using a limited selection of variables, mostly had a smaller sample size, and primarily involved modification and validation of pre-COVID-19 VTE prediction models. With the growing awareness of VTE risk in COVID-19, patients are now routinely placed on prophylactic-dose anticoagulants per National Institutes of Health recommendations, except in cases of high bleeding risk, severe thrombocytopenia, or suspected hemorrhage, which necessitate caution in these selected patients [6, 15,16,17]. This highlights the need for a prediction model tailored for COVID-19 patients, with comprehensive variable selection and performance evaluation, which can support the use of anticoagulation in this crucial patient population. Therefore, we analyzed the independent predictors of VTE using different machine learning methods in a cohort of 3531 hospitalized COVID-19 patients from Southeastern Michigan.

Methods

In this cross-sectional retrospective observational study, we report and analyze data from the Southeastern Michigan COVID-19 Consortium Registry Database (SMCRD). As previously described, SMCRD is a multi-institutional registry database of four main health systems in Southeast Michigan, United States, including Henry Ford Health System, Beaumont Health System, Trinity Health System, and Wayne State University [18]. It is built using REDCap and is housed at Vanderbilt University Medical Center. The SMCRD registry contains de-identified data of adult patients who were hospitalized with SARS-CoV-2 infection confirmed by laboratory PCR testing. Each institution independently collected data from March 1, 2020, to September 5, 2021. Our study was approved by the institutional review board (IRB) of Trinity Health System.

Procedures

We compiled data for adult patients (age 18 years or older) that included baseline demographics, laboratory results, and in-hospital events, including all-cause mortality of COVID-19 patients from March 1, 2020, to the end of December 2020. All patients (with and without VTE events) were included (Fig. 1). For each patient, a total of 85 variables (Additional file 1: Table S1) from six categories were extracted, including baseline demographics, presenting vital signs, past medical history (abstracted using standard-text variables, International Classification of Diseases–Tenth Revision (ICD-10) and Current Procedural Terminology codes), social history, admission reasons, pre-admission and in-hospital medications, hospital course, laboratory values, electrocardiogram, and imaging studies (magnetic resonance imaging (MRI), computerized tomography scan, ultrasounds). Variables in our study included: personal information (age, sex, race/ethnicity, body mass index (BMI), social history), hospital summary (hospital length of stay (LOS), intensive care unit (ICU) admission and LOS, use of oxygen devices, intubation status), laboratory values (white blood cell (WBC) counts, D-dimer, ferritin, LDH, lactate, C-reactive protein (CRP), and so on), past medical history, vital signs, and in-hospital prophylactic and therapeutic anticoagulation therapy. Since COVID-19 can cause VTE in patients following discharge, we followed patients after their initial hospital discharge for readmission and development of VTE. Accordingly, patients with one-time admission and readmissions, with or without thromboembolism events, were considered when building prediction models.

Fig. 1 Consort diagram of Southeastern Michigan COVID-19 Registry Consortium Database

Outcomes

The primary outcome was in-hospital VTE events, including acute deep vein thrombosis (DVT) and pulmonary embolism (PE) identified by ICD-10 codes (Additional file 1: Table S2), venous Doppler ultrasounds, ventilation-perfusion scan, and computed tomography angiography (CTA) of the chest. In-hospital outcomes (Table 1) included mortality, and hospital and ICU LOS.

Table 1 Baseline characteristics of COVID-19 patients with and without acute venous thromboembolism

Statistical analysis

Initial data cleaning and analysis

Laboratory values at the time of admission, together with peak and minimum values, were collected. For VTE, approximately 5% of patients had CTA chest images available, and 1% of patients had CTA-confirmed PE and vessel image-confirmed DVT; limited diagnostic testing was likely due to the COVID-19 hospitals’ policy of limiting exposure to the virus in the first wave of the pandemic. Of the 3531 patients, 161 had PE and 121 had DVT. A total of 3127 patients were anticoagulated with either enoxaparin or heparin. An enoxaparin dosage higher than 40 mg subcutaneously twice daily was considered a therapeutic dose (N = 340), whereas a dosage of less than 40 mg subcutaneously twice daily was defined as a prophylactic dose (N = 1920). Intravenous heparin was counted as a therapeutic dose (N = 182) and subcutaneous heparin as a prophylactic dose (N = 1315). In total, 1018 patients received therapeutic-dose and 2976 patients received prophylactic-dose anticoagulation.

We categorized race/ethnicity, BMI, oxygen devices, smoking, alcohol and marijuana history, and past medical history into dichotomous variables, while laboratory test values were retained as continuous variables. In the initial descriptive analysis, continuous variables were described as mean with standard deviation or median with interquartile range, and categorical variables as frequency distributions. To compare the groups, the Chi-square test was used for categorical variables, and the t-test was used for continuous variables. Univariate analysis and principal component analysis (PCA) were used to identify potential risk factors for VTE (Additional file 1: Table S3 and Fig. S1). All data were analyzed using SAS v9.4 or R 3.6.2, and a p-value less than 0.05 was considered to indicate statistical significance. Prediction models were built using JMP Pro 14.2.0 (Additional file 1: Table S4).
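
As an illustration of the group comparison for a dichotomous variable, the following is a minimal Python sketch of a Pearson Chi-square test on a 2 × 2 table. It is not the authors' SAS or R code; the function name, the absence of a continuity correction, and the example counts are assumptions made for illustration, and the counts are not study data. The t-test for continuous variables is not shown.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts (NOT study data): rows are two patient groups,
# columns are "characteristic present" and "characteristic absent".
stat, p = chi_square_2x2(30, 70, 15, 85)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

A p-value below the 0.05 threshold stated above would be read as a significant difference between the two groups for that variable.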

Data cleaning

As part of exploratory data analysis, the distributions of all variables were plotted. Most laboratory values were either left- or right-skewed. Multiple variables could be highly correlated with each other and potentially result in interactions during model building. For example, the neutrophil–lymphocyte ratio is derived from the neutrophil and lymphocyte counts. Likewise, the BUN-creatinine ratio is derived from blood urea nitrogen (BUN) and creatinine and is a parameter that could indicate different types of acute kidney injury; for example, a BUN-creatinine ratio > 20 suggests pre-renal acute kidney injury. We therefore computed Spearman’s rho between variables. Twenty-three groups of variables were highly positively or negatively correlated, defined as a Spearman’s coefficient beyond ± 0.7 (Additional file 1: Table S5A). These included aspartate aminotransferase (AST) and alanine transaminase (ALT); creatinine and BUN; maximum (max) B-type natriuretic peptide (BNP) and initial BNP; max CRP and initial CRP; max ferritin and initial ferritin; max D-dimer and initial D-dimer; neutrophil–lymphocyte ratio and neutrophils; max neutrophils and minimum lymphocytes; history of VTE, DVT, and PE; systolic and diastolic blood pressure; and inpatient therapeutic and inpatient prophylactic anticoagulation. We therefore reduced the number of variables; for example, neutrophil and lymphocyte counts alone were analyzed in the model building rather than the neutrophil–lymphocyte ratio. Likewise, BUN and creatinine alone were included rather than the BUN-creatinine ratio, and the history of VTE was used rather than its components (DVT and PE) (Additional file 1: Table S5B). When building models, we used laboratory values on admission rather than the peak or lowest values, as we aimed to build a model that can assist physicians in predicting VTE in COVID-19 patients on admission based on the available data. The PCA was performed to reduce the dimensions used to predict VTE events. 
Patients without missing data (N = 1443) from the cohort were included in the PCA, which covered a total of 32 continuous variables. In the scree plot, the first component explained only about 16% of the variation in the data, and the first two components together explained only 24.6% (Additional file 1: Table S3 and Fig. S1). Therefore, the PCA was deemed not helpful in reducing the dimensions in our analysis. For both continuous and categorical variables, we further performed univariate analysis using R packages (Additional file 1: Table S4).
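
To make the correlation screen in this subsection concrete, the following is a minimal Python sketch, not the authors' code. It computes Spearman's rho as the Pearson correlation of average ranks and lists every pair of variables whose coefficient exceeds 0.7 in absolute value. All function names and the toy values are hypothetical and are not study data.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for a constant column."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))


def correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, rho) for every pair with |rho| > threshold."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = spearman_rho(columns[a], columns[b])
            if abs(rho) > threshold:
                flagged.append((a, b, rho))
    return flagged


# Toy values for illustration only (NOT study data).
toy = {
    "ast": [20, 35, 50, 80, 120, 200],
    "alt": [18, 30, 55, 70, 130, 190],
    "potassium": [4.1, 3.8, 4.5, 3.9, 4.4, 4.0],
}
pairs = correlated_pairs(toy)
print(pairs)
```

In this toy example only the AST and ALT pair is flagged; in a screen of this kind, one member of each flagged group would then be dropped or replaced by a single summary variable.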

Model building

The cohort was randomly split into a training set and a test set (70:30) multiple times. We compared the predictive accuracy of four models for detecting VTE events and mortality:

  • Multiple linear regression (MLR)

  • Multiple logistic regression (LR)

  • Decision tree

  • Random forest
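
The repeated 70:30 random split can be sketched in Python as follows. This is an illustrative sketch only: the models in the study were built in JMP Pro, and the number of repeats, the seed, and the function name here are assumptions that the text does not specify.

```python
import random


def repeated_splits(n_records, train_fraction=0.7, repeats=5, seed=0):
    """Yield (train_idx, test_idx) index lists for repeated random splits."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    n_train = round(n_records * train_fraction)
    for _ in range(repeats):
        idx = list(range(n_records))
        rng.shuffle(idx)
        yield idx[:n_train], idx[n_train:]


# 3531 is the number of admissions in the cohort; repeats=5 is hypothetical.
splits = list(repeated_splits(3531))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Each model would then be fitted on the training indices and scored on the held-out indices, and the scores compared across repeats.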

Results

A total of 3531 admissions were identified, of which 3416 were first admissions and 115 were readmissions; of the 115 readmitted patients, 109 were readmitted once, and 6 were readmitted twice. Overall, there were 236 patients (6.68%) with VTE events and 2907 patients with no VTE events in the dataset. In general, the VTE group had a longer LOS in hospital and ICU than the non-VTE group (hospital LOS 12.2 days vs. 8.8 days, p < 0.001; ICU LOS 3.8 days vs. 1.9 days, p < 0.001). In addition, 9.8% of patients in the VTE group required advanced oxygen support, compared with 2.7% of patients in the non-VTE group (p < 0.001). Laboratory values such as WBC, CRP, D-dimer, and platelet count were significantly different between the VTE and non-VTE groups (p < 0.001). Baseline demographic characteristics of patients are summarized in Table 1. The mean age of VTE and non-VTE patients was 68 ± 16.7 years and 66.2 ± 16.4 years (p = 0.125), respectively. Morbid obesity was common in both groups (VTE vs. non-VTE: 47.6% vs. 50.2%, p = 0.329). The in-hospital all-cause mortality was 22.2% for VTE patients and 14.8% for non-VTE patients [Odds ratio (OR): 1.65, 95% confidence interval (CI): 1.22, 2.22, p = 0.001]. The VTE group also had more days on the ventilator than the non-VTE group. The univariate analysis of predictors of VTE upon admission is shown in Additional file 1: Table S3. Variables such as IL-6 (pg/mL), CRP (mg/dL), D-dimer (ng/mL), WBC (K/uL), and BUN (mg/dL) had ORs of 1.00 to 1.2 and were significant; these effects are not negligible because each OR applies per one-unit increase in a variable measured in small units. Moreover, these laboratory variables are of great interest in COVID-19 patients because COVID-19 infection causes cytokine storm leading to elevated inflammatory markers, such as ferritin, LDH, CRP, and IL-6. These inflammatory responses result in endotheliitis and hypercoagulability that predispose patients to develop VTE.
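
As a worked check on the mortality comparison above, the following Python sketch computes an odds ratio from the two reported mortality proportions and, separately, a Wald 95% confidence interval from a 2 × 2 table. The function names are hypothetical. The underlying death counts are not given in the text, so the counts passed to the interval function are invented for illustration and do not reproduce the reported interval.

```python
import math


def odds_ratio_from_proportions(p_group1, p_group2):
    """Odds ratio comparing two event proportions."""
    return (p_group1 / (1 - p_group1)) / (p_group2 / (1 - p_group2))


def odds_ratio_with_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI for the table [[a, b], [c, d]].

    a, b: events and non-events in group 1; c, d: the same in group 2.
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper


# Reported in-hospital mortality: 22.2% (VTE) vs. 14.8% (non-VTE).
or_mortality = odds_ratio_from_proportions(0.222, 0.148)
print(round(or_mortality, 2))  # about 1.64

# Hypothetical counts (NOT study data), only to show the interval calculation.
or_toy, lo_toy, hi_toy = odds_ratio_with_wald_ci(20, 80, 10, 90)
print(round(or_toy, 2), round(lo_toy, 2), round(hi_toy, 2))
```

The value of about 1.64 obtained from the rounded percentages is in line with the reported OR of 1.65.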

Prediction model for VTE

The most significant variables of each model are shown in Table 2. For MLR and LR, the significant variables were selected based on a p-value of < 0.05; for the decision tree and random forest, they were based on the Gini index. MLR was eliminated as it is not ideal for categorical variables. The decision tree has worse accuracy than a random forest but provides interpretability. The root node of our decision tree split on inpatient therapeutic anticoagulation, followed by splits on BUN (< 20, ≥ 20), hospital LOS (< 20, ≥ 20), age (< 91, ≥ 91), race (White, non-White), D-dimer (≥ 4740 ng/mL, < 4740 ng/mL), history of VTE, and D-dimer (≥ 2170 ng/mL, < 2170 ng/mL) (Additional file 1: Fig. S2). Random forests, in contrast, are an ensemble of decision trees that mitigates the overfitting of a single decision tree because the predictions are based on an average of all trees. On the other hand, loss of interpretability is one of the limitations of random forests. Both decision trees and random forests handle continuous and categorical variables, which suits our cohort. D-dimer was the most significant variable in the MLR, LR, and decision tree models. Other common variables across the models included VTE history, inpatient therapeutic anticoagulation, requirement for oxygen devices (such as high-flow nasal cannula, non-rebreather mask, and mechanical ventilation), heart rate, and BUN, among others. The four models were compared, as shown in Table 3, to analyze their ability to predict COVID-19-associated VTE. Random forest performed best in terms of R-square (R²), misclassification rate, and receiver operating characteristic (ROC) curve.
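
For readers unfamiliar with the Gini index used to rank variables in the tree-based models, the following Python sketch shows the standard textbook definition for a binary outcome and the impurity decrease produced by one split. It is a generic illustration under that assumption; the exact criterion implemented by the modeling software may differ, and the function names and counts are hypothetical.

```python
def gini_impurity(n_pos, n_neg):
    """Gini impurity of a node with n_pos events and n_neg non-events."""
    n = n_pos + n_neg
    if n == 0:
        return 0.0
    p = n_pos / n
    return 2 * p * (1 - p)  # equals 1 - p**2 - (1 - p)**2


def gini_decrease(left, right):
    """Impurity decrease when a parent node is split into two children.

    left and right are (n_pos, n_neg) tuples for the child nodes.
    """
    n_left, n_right = sum(left), sum(right)
    n = n_left + n_right
    parent = gini_impurity(left[0] + right[0], left[1] + right[1])
    children = (n_left / n) * gini_impurity(*left) \
        + (n_right / n) * gini_impurity(*right)
    return parent - children


# Hypothetical counts (NOT study data): a split that separates events well
# versus a split that leaves the event rate unchanged in both children.
good_split = gini_decrease((40, 60), (10, 390))
useless_split = gini_decrease((25, 225), (25, 225))
print(round(good_split, 4), round(useless_split, 4))
```

A variable whose splits produce larger impurity decreases, summed over a tree or averaged over a forest, is ranked as more important.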

Table 2 Significant variables in prediction models, listed in descending order: (1) Multiple linear regression (2) Multiple logistic regression (3) Decision tree (4) Random forest
Table 3 Model performance for venous thromboembolism prediction in COVID-19 patients

Performance of the model

The random forest model consisted of 22 variables (in order of significance): D-dimer, inpatient therapeutic anticoagulation therapy, platelet count, BUN, age, WBC, systolic blood pressure, lymphocytes, ALT, potassium, BNP, CRP, creatinine, LDH, neutrophils, heart rate, total bilirubin, AST, diastolic blood pressure, prior history of VTE, ferritin, and oxygen saturation on admission. Electrolytes, renal function, blood pressures, hepatic enzymes, and inflammatory markers were indicators of VTE risk. The performance and confusion matrices of the four models in the training and validation processes are shown in Table 3. The R² of the random forest model was 58.87% for the training set and 18.76% for the validation set (p < 0.0001); the area under the ROC curve was 0.83 (Fig. 2). Because the classes in our cohort were imbalanced, the default threshold (0.5) does not give an optimal interpretation of the predicted probabilities; we therefore set a cutoff of 0.1 to generate sensitivity and specificity. At this cutoff, the random forest model had a sensitivity of 0.68 and a specificity of 0.82. Our goal was to provide a robust model for clinicians to identify COVID-19 patients at risk for VTE early in the hospital course and to assist in deciding between therapeutic and prophylactic anticoagulation. In the validation set, the model was better at predicting the absence of a venous event than its presence: the negative predictive value (NPV) and positive predictive value (PPV) were 0.97 and 0.26, respectively. Owing to the low prevalence of VTE in the population, the F1 score of the model was 0.35.
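
The relationship between the probability cutoff, the confusion matrix, and the reported metrics can be sketched in Python as follows. This is an illustrative sketch with hypothetical function names and toy inputs, not the authors' analysis. Whether the study treated a probability exactly equal to the cutoff as positive is not stated; the sketch assumes it does.

```python
def confusion_counts(y_true, y_prob, cutoff=0.1):
    """Count TP, FP, FN, TN when probabilities >= cutoff are called positive."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        predicted = p >= cutoff  # assumption: ties at the cutoff are positive
        if predicted and y == 1:
            tp += 1
        elif predicted and y == 0:
            fp += 1
        elif not predicted and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, NPV, and F1 from confusion counts."""
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "ppv": ppv,
        "npv": tn / (tn + fn),
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV implied by sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Toy labels and probabilities (NOT study data).
toy_metrics = classification_metrics(
    *confusion_counts([1, 0, 0, 1, 0], [0.30, 0.05, 0.20, 0.08, 0.01])
)

# Reported sensitivity and specificity with the cohort-wide VTE rate of 6.68%.
ppv, npv = predictive_values(0.68, 0.82, 0.0668)
print(round(ppv, 2), round(npv, 2))
```

With the cohort-wide prevalence, the implied NPV is about 0.97 and the implied PPV about 0.21. The prevalence in the validation set is not reported, so exact agreement with the reported PPV of 0.26 is not expected; the sketch mainly shows why a low event rate keeps the PPV low even when sensitivity and specificity are reasonable.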

Fig. 2 Receiver operating characteristic (ROC) curve of the random forest model for venous thromboembolism in COVID-19 patients. The random forest model’s area under the ROC curve was 0.83
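
The area under the ROC curve can be read as the probability that a randomly chosen patient with the event receives a higher predicted probability than a randomly chosen patient without it. The following Python sketch computes the area from that definition; it is an illustrative implementation with a hypothetical function name and toy inputs, not the software routine used in the study.

```python
def roc_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and scores (NOT study data).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75
```

Under this reading, an area of 0.83 means that in roughly 83% of such pairs the model ranks the patient with VTE above the patient without VTE.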

Discussion

VTE is one of the most common complications in COVID-19 patients [19,20,21,22]. This retrospective study presents a prediction model for VTE in COVID-19 patients, together with the demographics, clinical parameters, and incidence rate of VTE in COVID-19 inpatients. The incidence rate of VTE could have been underreported because radiological testing was limited to reduce staff exposure to COVID-19 infection in the first wave [23]. Our study reported an incidence rate of 6.68%, similar to other studies [24,25,26,27,28,29] (Table 4). We found that patients who developed new-onset VTE had a longer hospital LOS (12.2 days vs. 8.8 days, p < 0.001) and ICU LOS (3.8 days vs. 1.9 days, p < 0.001) than patients who did not have VTE. This is a robust prediction model for VTE in hospitalized patients with COVID-19 using a large multicenter database (N = 3531). We included 85 variables from a broad spectrum of parameters, including demographics, vitals, comorbidities, and hospital course (oxygen requirement, ICU admission, hospital and ICU LOS). Electrolytes, renal function, blood pressures, hepatic enzymes, and inflammatory markers were indicators of VTE risk; however, further studies on whether a cutoff value could be applied to inflammatory markers for good sensitivity and specificity for VTE in COVID-19 infection would be beneficial. Physicians can assess patients’ presenting signs and renal and hepatic function, potentially identify patients at high risk of VTE, and work on the reversible risk factors to reduce patients’ risk of developing VTE during hospitalization.

Table 4 Characteristics of retrospective COVID-19 studies on venous thromboembolism incidence rate and predictors
Table 5 Characteristics of retrospective studies on venous thromboembolism prediction models

Notably, we used presenting data, that is, the initial data recorded when patients were admitted to the hospital. Models such as multiple LR that do not handle missing data are fitted on smaller samples, which can affect performance. Our MLR model had an R² of 0.2569 (p < 0.0001). The R² values of MLR and LR were low, which is consistent with the fact that we excluded missing laboratory values rather than imputing them. The decision tree had a lower R² (0.19 in the training set and 0.11 in the testing set); however, R² is most likely not an appropriate metric for a tree-based model. Overall, our R² values were low. One possibility is that our study cohort has an inherently higher amount of unexplainable variability; this could be better addressed in future prospective studies. Nevertheless, the random forest model had a low misclassification rate (6.87% in the training set, 8.4% in the testing set). The decision tree may have worse accuracy than a random forest, but its structure is easy to understand and interpret: by looking at the splitting nodes, key factors can be identified and predictions can be made. Random forests, on the other hand, are an ensemble of decision trees whose predictions are based on an average of all trees, which makes them a “black box” that cannot be directly described.

Of 3531 records, only 1282 patients were included in the MLR model due to missing values in the other patients. Similarly, only 1282 records were used in the LR, which was less than 50% of the records. Although IL-6, LDH, procalcitonin, ferritin, and fibrinogen were excluded from model building due to significant numbers of missing values, we found no significant difference in these values between the non-VTE and VTE groups.

Our model can provide clinical risk stratification of VTE in COVID-19 patients and help individualize thromboprophylaxis, which supports the current consensus of customized and risk-adapted management for thromboprophylaxis in international guidelines [30]. Five papers studied VTE in COVID-19 patients using existing prediction models [26, 31,32,33,34] (Table 5). Kampouri et al. combined the Wells score and D-dimer value to predict VTE with a PPV of 18.2%, an NPV of 98.5%, and an accuracy of 0.905 [31]. A Dutch study reported a 41.7% incidence rate of VTE in COVID-19 patients and built a linear regression model consisting of D-dimer > 9 μg/mL and CRP > 280 mg/mL, for which the authors reported a predicted probability of 92% [32]. Another study by Taplin et al. modified the Caprini score using a cutoff value of 12, which is also based on the D-dimer score, and showed a sensitivity of 73% and a specificity of 84% in predicting VTE [33]. Unlike our study, most of these studies had a smaller sample size and number of events and included risk factors not analyzed in the original prediction model studies. Notably, the performance of a model depends on the event prevalence. Among all studies, the Dutch study had the highest predictive probability in the critically ill population due to a higher incidence of VTE [32]. A meta-analysis of 47 studies showed a high prevalence of PE in studies with high mean D-dimer values (prevalence ratio 1.3 per 1000 ng/mL increase; 95% CI: 1.11, 1.50, p = 0.002) and a high percentage of ICU patients (1.02 per 1% increase; 95% CI: 1.01, 1.03, p < 0.001). In addition, the prevalence of DVT was also high across studies with high mean D-dimer values (1.04 per 1000 ng/mL increase; 95% CI: 1.01, 1.07, p = 0.022) [35].

After a systematic review, we included six studies that reported the VTE incidence rate in COVID-19 patients without prediction models (Table 4). Our study showed a VTE incidence rate of 6.68% in COVID-19 patients, which is consistent with three of the studies [25, 28, 29], whereas Freund et al. reported a rate of 15% and two studies showed a lower incidence rate of 2–3% [24, 26, 27]. Critically ill COVID-19 patients who were admitted to the ICU had a higher incidence rate of VTE. Only two studies identified risk factors for COVID-19 patients using the MLR model, including advanced age, increased Charlson Comorbidity Index, history of cardiovascular disease, ICU admission, elevated D-dimer, male gender, heart rate, clinical signs of DVT, and recent immobilization [24, 26]. Unlike other studies, we did not impute missing values, in order to better build a model that predicts VTE individually.

Our study analyzed D-dimer, lactate, and inflammatory markers, including CRP, ferritin, and LDH, that are of great interest in clinical settings and have been routinely ordered for COVID-19 patients. The utilization of laboratory values varies; many physicians trend these markers to predict the trajectory of COVID-19 patients. However, few studies have included them in VTE analyses. Our results showed no significant difference in presenting CRP, IL-6, and LDH levels between the VTE and non-VTE groups (Table 1), yet the maximum values of D-dimer, CRP, and LDH were significantly higher in the VTE group. This may suggest that D-dimer, CRP, and LDH could be utilized clinically for monitoring. However, further studies on the threshold, sensitivity, and specificity of certain markers are needed.

Current guidelines by the American Society of Hematology (ASH) suggest using prophylactic-intensity over intermediate-intensity anticoagulation for patients with COVID-19 related critical illness who do not have suspected or confirmed VTE [36]. Furthermore, ASH suggests that an individualized assessment of the patient’s risk of thrombosis and bleeding is important when deciding on anticoagulation intensity. Our study provides physicians with a model that could aid in risk stratification, as VTE has been well-known to be a common COVID-19 complication.

We observed that 11.5% of patients (N = 302) who did not have VTE were given a therapeutic dosage of anticoagulation, whereas 46.3% (N = 101) of those with VTE received therapeutic anticoagulation. It is unclear why, after a diagnosis of VTE, over half of the patients received only prophylactic anticoagulants. This points to an unmet need for risk stratification for COVID-19 patients. Vaughn et al. reported that 16.2% of patients who had suspected VTE were given therapeutic anticoagulation and increased treatment-dose anticoagulation for VTE prophylaxis [37]. The INSPIRATION trial did not show a difference between routine empirical use of intermediate-dose prophylactic anticoagulation and the standard dose in ICU patients for the primary composite outcome, which included acute VTE, arterial thrombosis, the use of extracorporeal membrane oxygenation, and all-cause mortality [absolute risk difference, 1.5% (95% CI: − 6.6, 9.8); OR: 1.06 (95% CI: 0.76, 1.48); p = 0.70] [16]. The Anti-Thrombotic Therapy to Ameliorate Complications of COVID-19 (ATTACC) randomized multicenter adaptive design trials showed therapeutic anticoagulation to be beneficial in moderately ill patients, whereas it was futile in ICU patients requiring organ failure support [38, 39].

Our study has both strengths and limitations. The strengths include the large sample size, multi-institutional data, and availability of data on a broad range of outcome events. Moreover, our VTE prediction model in COVID-19 patients can most benefit clinical management in settings where a definitive diagnosis of VTE is hard to obtain, for example, for critically ill patients on mechanical ventilation who are unable to undergo a CTA chest study. Since this is a retrospective study utilizing a large database, we were unable to obtain the timing of diagnosis of acute VTE in our cohort, which would have allowed exploration of the temporal relationship between VTE and potential risk factors; this is an important limitation of our study. Furthermore, although our models showed good predictive capacity, the low incidence of VTE in the study population created significant hurdles. The random forest model’s PPV is 26%, NPV is 97%, and the F1 score is 0.36. Future studies on a composite outcome including both venous and arterial events could provide a larger number of events. Also, the random forest model is not a penalized method and carries a risk of overfitting. Lastly, our model needs to be validated externally.

Conclusions

There is a high incidence of VTE in hospitalized COVID-19 patients. Prolonged hospital and ICU stay was noted in patients who developed VTE. This random forest prediction model for VTE in COVID-19 patients is based on a broad spectrum of parameters available on initial presentation and comorbidities. Factors like D-dimer, LDH, platelet count, age, WBC, AST, ALT, BUN and creatinine, heart rate on presentation, and prior history of VTE can predict in-hospital VTE events which could aid physicians in making a clinical judgment on the empirical dosage of anticoagulation.

Availability of data and materials

Deidentified clinical data supporting the analysis in this publication will be made available from the corresponding author upon request.

Abbreviations

AIC: Akaike information criterion
ALT: Alanine transaminase
AST: Aspartate aminotransferase
ATTACC: Anti-Thrombotic Therapy to Ameliorate Complications of COVID-19
AUC: Area under the curve
BIC: Bayesian information criterion
BMI: Body mass index
BNP: B-type natriuretic peptide
BUN: Blood urea nitrogen
CI: Confidence interval
CRP: C-reactive protein
CTA: Computed tomography angiography
DVT: Deep vein thrombosis
ICD-10: International Classification of Diseases–Tenth Revision
ICU: Intensive care unit
IL-6: Interleukin-6
IRB: Institutional review board
LDH: Lactate dehydrogenase
LOS: Length of stay
LR: Logistic regression
Max: Maximum
Min: Minimum
MLR: Multiple linear regression
NA: Not applicable
NPV: Negative predictive value
OR: Odds ratio
PCA: Principal component analysis
PE: Pulmonary embolism
PPV: Positive predictive value
R²: R-square
ROC: Receiver operating characteristic
SARS-CoV-2: Severe acute respiratory syndrome coronavirus 2
SD: Standard deviation
SMC: Southeastern Michigan COVID-19 Consortium
SMCRD: Southeastern Michigan COVID-19 Consortium Registry Database
TS: Training set
VS: Validation set
VTE: Venous thromboembolism
WBC: White blood cell

References

  1. Coronavirus Resource Center. https://coronavirus.jhu.edu/. Accessed 01 Jan 2022.

  2. Lodigiani C, Iapichino G, Carenzo L, Cecconi M, Ferrazzi P, Sebastian T, Kucher N, Studt JD, Sacco C, Bertuzzi A, et al. Venous and arterial thromboembolic complications in COVID-19 patients admitted to an academic hospital in Milan, Italy. Thromb Res. 2020;191:9–14.

  3. Levi M, Thachil J, Iba T, Levy JH. Coagulation abnormalities and thrombosis in patients with COVID-19. Lancet Haematol. 2020;7(6):e438–40.

  4. Kunutsor SK, Laukkanen JA. Incidence of venous and arterial thromboembolic complications in COVID-19: a systematic review and meta-analysis. Thromb Res. 2020;196:27–30.

  5. Paranjpe I, Fuster V, Lala A, Russak AJ, Glicksberg BS, Levin MA, Charney AW, Narula J, Fayad ZA, Bagiella E, et al. Association of treatment dose anticoagulation with in-hospital survival among hospitalized patients with COVID-19. J Am Coll Cardiol. 2020;76(1):122–4.

  6. Wichmann D, Sperhake JP, Lutgehetmann M, Steurer S, Edler C, Heinemann A, Heinrich F, Mushumba H, Kniep I, Schroder AS, et al. Autopsy findings and venous thromboembolism in patients with COVID-19. Ann Intern Med. 2020;173(4):268–77.

  7. Wiersinga WJ, Rhodes A, Cheng AC, Peacock SJ, Prescott HC. Pathophysiology, transmission, diagnosis, and treatment of Coronavirus disease 2019 (COVID-19): a review. JAMA. 2020;324(8):782–93.

  8. Ackermann M, Verleden SE, Kuehnel M, Haverich A, Welte T, Laenger F, Vanstapel A, Werlein C, Stark H, Tzankov A, et al. Pulmonary vascular endothelialitis, thrombosis, and angiogenesis in COVID-19. N Engl J Med. 2020;383(2):120–8.

  9. Li A, Kuderer NM, Hsu C-Y, Shyr Y, Warner JL, Shah DP, Kumar V, Shah S, Kulkarni AA, Fu J, et al. The COVID-TE risk assessment model for venous thromboembolism in hospitalized patients with cancer and COVID-19. J Thromb Haemost. 2021;19(10):2522–32.

  10. Zhou F, Yu T, Du R, Fan G, Liu Y, Liu Z, Xiang J, Wang Y, Song B, Gu X, et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet. 2020;395(10229):1054–62.

  11. Yan L, Zhang H-T, Goncalves J, Xiao Y, Wang M, Guo Y, Sun C, Tang X, Jin L, Zhang M, et al. A machine learning-based model for survival prediction in patients with severe COVID-19 infection. medRxiv. 2020:2020.02.27.20028027.

  12. Wynants L, Van Calster B, Collins GS, Riley RD, Heinze G, Schuit E, Bonten MMJ, Dahly DL, Damen JA, Debray TPA, et al. Prediction models for diagnosis and prognosis of COVID-19: systematic review and critical appraisal. BMJ. 2020;369:m1328.

  13. Hu H, Yao N, Qiu Y. Comparing rapid scoring systems in mortality prediction of critically ill patients with novel Coronavirus disease. Acad Emerg Med. 2020;27(6):461–8.

  14. Vaid A, Somani S, Russak AJ, De Freitas JK, Chaudhry FF, Paranjpe I, Johnson KW, Lee SJ, Miotto R, Zhao S, et al. Machine learning to predict mortality and critical events in COVID-19 positive New York City patients. medRxiv. 2020:2020.04.26.20073411.

  15. Coronavirus Disease 2019 (COVID-19) Treatment Guidelines. https://www.COVID19treatmentguidelines.nih.gov/. Accessed 20 Feb 2022.

  16. Investigators I. Effect of intermediate-dose vs standard-dose prophylactic anticoagulation on thrombotic events, extracorporeal membrane oxygenation treatment, or mortality among patients with COVID-19 admitted to the intensive care unit: the INSPIRATION randomized clinical trial. JAMA. 2021;325(16):1620–30.

  17. Moores LK, Tritschler T, Brosnahan S, Carrier M, Collen JF, Doerschug K, Holley AB, Jimenez D, Le Gal G, Rali P, et al. Prevention, diagnosis, and treatment of VTE in patients with coronavirus disease 2019: CHEST guideline and expert panel report. Chest. 2020;158(3):1143–63.

  18. Jehangir Q, Lee Y, Latack K, Poisson L, Wang DD, Song S, Apala DR, Patel K, Halabi AR, Krishnamoorthy G, et al. Incidence, mortality, and imaging outcomes of atrial arrhythmias in COVID-19. Am J Cardiol. 2022;S0002-9149(22)00243-0.

  19. Klok FA, Kruip M, van der Meer NJM, Arbous MS, Gommers D, Kant KM, Kaptein FHJ, van Paassen J, Stals MAM, Huisman MV, et al. Incidence of thrombotic complications in critically ill ICU patients with COVID-19. Thromb Res. 2020;191:145–7.

  20. Klok FA, Kruip M, van der Meer NJM, Arbous MS, Gommers D, Kant KM, Kaptein FHJ, van Paassen J, Stals MAM, Huisman MV, et al. Confirmation of the high cumulative incidence of thrombotic complications in critically ill ICU patients with COVID-19: an updated analysis. Thromb Res. 2020;191:148–50.

  21. Llitjos JF, Leclerc M, Chochois C, Monsallier JM, Ramakers M, Auvray M, Merouani K. High incidence of venous thromboembolic events in anticoagulated severe COVID-19 patients. J Thromb Haemost. 2020;18(7):1743–6.

  22. Investigators A, Investigators AC-a, Investigators R-C, Lawler PR, Goligher EC, Berger JS, Neal MD, McVerry BJ, Nicolau JC, Gong MN, et al. Therapeutic anticoagulation with heparin in noncritically ill patients with COVID-19. N Engl J Med. 2021;385(9):790–802.

  23. Iftimie S, López-Azcona AF, Vallverdú I, Hernández-Flix S, de Febrer G, Parra S, Hernández-Aguilera A, Riu F, Joven J, Andreychuk N, et al. First and second waves of coronavirus disease-19: a comparative study in hospitalized patients in Reus, Spain. PLoS ONE. 2021;16(3):e0248029.

  24. Cohen SL, Gianos E, Barish MA, Chatterjee S, Kohn N, Lesser M, Giannis D, Coppa K, Hirsch JS, McGinn TG, et al. Prevalence and predictors of venous thromboembolism or mortality in hospitalized COVID-19 patients. Thromb Haemost. 2021;121(8):1043–53.

  25. Dalager-Pedersen M, Lund LC, Mariager T, Winther R, Hellfritzsch M, Larsen TB, Thomsen RW, Johansen NB, Sogaard OS, Nielsen SL, et al. Venous thromboembolism and major bleeding in patients with Coronavirus Disease 2019 (COVID-19): a nationwide population-based cohort study. Clin Infect Dis. 2021;73(12):2283–93.

  26. Freund Y, Drogrey M, Miro O, Marra A, Feral-Pierssens AL, Penaloza A, Hernandez BAL, Beaune S, Gorlicki J, Vaittinada Ayar P, et al. Association between pulmonary embolism and COVID-19 in emergency department patients undergoing computed tomography pulmonary angiogram: the PEPCOV international retrospective study. Acad Emerg Med. 2020;27(9):811–20.

  27. Mei F, Fan J, Yuan J, Liang Z, Wang K, Sun J, Guan W, Huang M, Li Y, Zhang WW. Comparison of venous thromboembolism risks between COVID-19 pneumonia and community-acquired pneumonia patients. Arterioscler Thromb Vasc Biol. 2020;40(9):2332–7.

  28. Poissy J, Goutay J, Caplan M, Parmentier E, Duburcq T, Lassalle F, Jeanpierre E, Rauch A, Labreuche J, Susen S, et al. Pulmonary embolism in patients with COVID-19: awareness of an increased prevalence. Circulation. 2020;142(2):184–6.

  29. Rieder M, Goller I, Jeserich M, Baldus N, Pollmeier L, Wirth L, Supady A, Bode C, Busch HJ, Schmid B, et al. Rate of venous thromboembolism in a prospective all-comers cohort with COVID-19. J Thromb Thrombolysis. 2020;50(3):558–66.

  30. Sabaka P, Koščálová A, Straka I, Hodosy J, Lipták R, Kmotorková B, Kachlíková M, Kušnírová A. Role of interleukin 6 as a predictive factor for a severe course of COVID-19: retrospective data analysis of patients from a long-term care facility during COVID-19 outbreak. BMC Infect Dis. 2021;21(1):308.

  31. Kampouri E, Filippidis P, Viala B, Mean M, Pantet O, Desgranges F, Tschopp J, Regina J, Karachalias E, Bianchi C, et al. Predicting venous thromboembolic events in patients with Coronavirus Disease 2019 requiring hospitalization: an observational retrospective study by the COVIDIC initiative in a Swiss university hospital. Biomed Res Int. 2020;2020:9126148.

  32. Dujardin RWG, Hilderink BN, Haksteen WE, Middeldorp S, Vlaar APJ, Thachil J, Müller MCA, Juffermans NP. Biomarkers for the prediction of venous thromboembolism in critically ill COVID-19 patients. Thromb Res. 2020;196:308–12.

  33. Tsaplin S, Schastlivtsev I, Zhuravlev S, Barinov V, Lobastov K, Caprini JA. The original and modified Caprini score equally predicts venous thromboembolism in COVID-19 patients. J Vasc Surg Venous Lymphat Disord. 2021;9(6):1371–81.

  34. Spyropoulos AC, Cohen SL, Gianos E, Kohn N, Giannis D, Chatterjee S, Goldin M, Lesser M, Coppa K, Hirsch JS, et al. Validation of the IMPROVE-DD risk assessment model for venous thromboembolism among hospitalized patients with COVID-19. Res Pract Thromb Haemost. 2021;5(2):296–300.

  35. Kollias A, Kyriakoulis KG, Lagou S, Kontopantelis E, Stergiou GS, Syrigos K. Venous thromboembolism in COVID-19: a systematic review and meta-analysis. Vasc Med. 2021;26(4):415–25.

  36. Cuker A, Tseng EK, Nieuwlaat R, Angchaisuksiri P, Blair C, Dane K, Davila J, DeSancho MT, Diuguid D, Griffin DO, et al. American Society of Hematology 2021 guidelines on the use of anticoagulation for thromboprophylaxis in patients with COVID-19. Blood Adv. 2021;5(3):872–88.

  37. Vaughn VM, Yost M, Abshire C, Flanders SA, Paje D, Grant P, Kaatz S, Kim T, Barnes GD. Trends in venous thromboembolism anticoagulation in patients hospitalized with COVID-19. JAMA Netw Open. 2021;4(6):e2111788.

  38. Investigators A, Investigators AC-a, Investigators R-C, Lawler PR, Goligher EC, Berger JS, Neal MD, McVerry BJ, Nicolau JC, Gong MN, et al. Therapeutic anticoagulation with heparin in noncritically ill patients with COVID-19. N Engl J Med. 2021;385(9):790–802.

  39. Investigators R-C, Investigators AC-a, Investigators A, Goligher EC, Bradbury CA, McVerry BJ, Lawler PR, Berger JS, Gong MN, Carrier M, et al. Therapeutic anticoagulation with heparin in critically ill patients with COVID-19. N Engl J Med. 2021;385(9):777–89.

Acknowledgements

Not applicable.

Funding

None.

Author information

Contributions

YL and QJ: study conception, study design, data analysis, and manuscript writing; DG, PM, CHL, and VB: data analysis; PL, DRA, and LP: data analysis and manuscript writing; GK, ARH, and KP: study conception and manuscript writing; AAS and BGN: study design and manuscript writing. All authors have read and approved the manuscript.

Corresponding author

Correspondence to Yi Lee.

Ethics declarations

Ethics approval and consent to participate

This study protocol was approved by the IRB of Trinity Health System (Number 2021-007). After IRB approval, the study was reviewed for scientific merit by the Southeastern Michigan COVID-19 Consortium (SMC) Publication Committee. After formal approval by the SMC Publication Committee, the SMC Steering Committee granted access to the SMCRD data. The creation of the SMCRD was approved by the SMC Research Committee (Number 13785). The need for informed consent was waived because the study used deidentified medical records. The data used in this study were anonymized before use.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Supplemental materials of COVID-19 venous thromboembolism prediction model.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Lee, Y., Jehangir, Q., Li, P. et al. Venous thromboembolism in COVID-19 patients and prediction model: a multicenter cohort study. BMC Infect Dis 22, 462 (2022). https://doi.org/10.1186/s12879-022-07421-3

  • DOI: https://doi.org/10.1186/s12879-022-07421-3

Keywords

  • COVID-19
  • Venous thromboembolism
  • Deep vein thrombosis
  • Pulmonary embolism
  • Risk stratification
  • Risk prediction
  • Anticoagulation