Predicting cardiovascular events using a three-stage Discriminant Function is much more accurate than Framingham or QRISK
The two best-known approaches to predicting cardiovascular risk are Framingham and QRISK. Both correctly predict fewer than 70% of cases, with a high ratio of false-positive predictions to true predictions. Each applies a combination of predictors to the data only once. The present approach instead applies the Discriminant Function repeatedly. A British sample of data on cardiovascular risk was analysed. Principal Components analysis was used to reveal the underlying structure of the data; it identified four independent determinants. Discriminant Function analysis was then applied in three stages to accommodate the difficulties of dealing with multiple determinants. Ninety-four percent of the cases with cardiovascular incidents (CVIs) were predicted correctly, up to more than 20 years ahead, with an overall misclassification rate of 2.8 errors for every correct prediction. When checked for likely shrinkage from sample to sample using the jackknife method, 92% of CVIs were still correctly predicted. Compared with a single application of a linear combination of predictors to find those people most likely to have cardiovascular events, repeated application of the predictors to the residuals from the previous prediction stage is likely to find a much higher proportion of true predictions, with much less error. The results also allow for a simple way of conveying the risk of CVI to individual patients.
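The idea of reapplying a discriminant function to the residuals of the previous stage can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: "residuals" is read here as the cases not yet flagged as likely CVIs, the data are synthetic rather than the Caerphilly sample, only two predictors are used, and the names `fit_fisher`, `staged_flags` and `make_toy_data` are hypothetical. The Principal Components step and the jackknife check are omitted, so the counts are in-sample only.

```python
import random


def fit_fisher(X, y):
    """Fit a two-class Fisher linear discriminant on 2-feature data.

    Returns (w, c); a case x is flagged when w[0]*x[0] + w[1]*x[1] > c.
    """
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    m1 = [sum(col) / len(pos) for col in zip(*pos)]
    m0 = [sum(col) / len(neg) for col in zip(*neg)]
    # Pooled within-class scatter (2x2) plus a tiny ridge so it is invertible.
    s = [[1e-9, 0.0], [0.0, 1e-9]]
    for group, m in ((pos, m1), (neg, m0)):
        for x in group:
            d = (x[0] - m[0], x[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = (m1[0] - m0[0], m1[1] - m0[1])
    w = (inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1])
    # Cut-off halfway between the projected class means.
    c = 0.5 * (w[0] * (m1[0] + m0[0]) + w[1] * (m1[1] + m0[1]))
    return w, c


def staged_flags(X, y, stages=3):
    """Refit the discriminant at each stage on the cases not yet flagged.

    Returns (flagged, history), where history[k] is the cumulative number
    of true events flagged after stage k + 1.
    """
    flagged = [False] * len(X)
    history = []
    for _ in range(stages):
        rest = [i for i in range(len(X)) if not flagged[i]]
        if {y[i] for i in rest} != {0, 1}:
            break  # one class is exhausted, nothing left to discriminate
        w, c = fit_fisher([X[i] for i in rest], [y[i] for i in rest])
        for i in rest:
            if w[0] * X[i][0] + w[1] * X[i][1] > c:
                flagged[i] = True
        history.append(sum(1 for f, t in zip(flagged, y) if f and t == 1))
    return flagged, history


def make_toy_data(n_neg=200, n_pos=40, seed=0):
    """Synthetic two-predictor sample: label 1 = event, label 0 = no event."""
    rng = random.Random(seed)
    X = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_neg)]
    X += [(rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)) for _ in range(n_pos)]
    y = [0] * n_neg + [1] * n_pos
    return X, y


X, y = make_toy_data()
flagged, history = staged_flags(X, y)
false_pos = sum(1 for f, t in zip(flagged, y) if f and t == 0)
print("events flagged after each stage:", history, "of", sum(y))
print("false positives after the final stage:", false_pos)
```

Because flags only accumulate, the number of events caught cannot fall from one stage to the next, but the number of false positives can rise as well. That is why a ratio of errors to correct predictions is worth reporting alongside the proportion of events found.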
Keywords: Prediction; Cardiovascular incidents; Discriminant function analysis; Principal components
The author is especially grateful to Professor Yoav Ben-Shlomo of Bristol University for supplying the data on behalf of the Caerphilly Committee, and for his comprehensive explanation of the data. The author is also most grateful to the Caerphilly project Committee for granting access to the data, without which this study would not have been possible. Further details of the Caerphilly project are available on the website of the Department of Social Medicine, Bristol University. Professor David Vere-Jones of the Department of Mathematics and Statistics at Victoria University Wellington was most helpful in checking and commenting on the results. However, any errors and omissions remain the responsibility of the author.
Approval to use the data for this purpose was obtained through the Caerphilly Steering Committee.