Abstract
Predicting the outcome of kidney transplantation is clinically important: it supports selection of the best available kidney donor and the most suitable immunosuppressive treatment for each patient. Predicting survival before treatment can also simplify patients’ decision making and improve survival by informing clinical practice. This paper proposes a prediction method based on data mining techniques to predict five-year graft survival after transplantation. The method consists of three stages: a data preparation stage (DPS), a feature selection stage (FSS), and a prediction stage (PS). It combines information gain with naïve Bayes and k-nearest neighbor. First, information gain selects the essential features; naïve Bayes then narrows these to the most essential ones. Together, the two form a hybrid feature selection method that chooses the minimum number of features yielding the highest accuracy. Finally, k-nearest neighbor classifies graft survival. The proposed method was evaluated against recent techniques. Experimental results show that it outperforms them, attaining the highest accuracy and F-measure with the lowest error. The method can also be applied to other transplant datasets.
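The three-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how information gain and naïve Bayes are combined, so everything below is an assumption made for illustration: the function names (`hybrid_select`, `knn_predict`, and the helpers), the `min_gain` threshold, the use of leave-one-out accuracy with Laplace smoothing for the naïve Bayes step, the Hamming distance in k-nearest neighbor, and the toy records themselves. It is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(column, labels):
    """Entropy reduction in `labels` obtained by splitting on a categorical column."""
    groups = defaultdict(list)
    for value, label in zip(column, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def nb_predict(train_X, train_y, row, feats):
    """Categorical naive Bayes (Laplace smoothing) restricted to features `feats`."""
    best_label, best_score = None, -math.inf
    for label, count in Counter(train_y).items():
        score = math.log(count / len(train_y))
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        for f in feats:
            n_values = len({x[f] for x in train_X})
            matches = sum(1 for x in rows if x[f] == row[f])
            score += math.log((matches + 1) / (count + n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of naive Bayes using only `feats` (assumed criterion)."""
    hits = 0
    for i in range(len(X)):
        hits += nb_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], feats) == y[i]
    return hits / len(X)

def hybrid_select(X, y, min_gain=0.0):
    """Filter by information gain, then keep the shortest ranked prefix
    whose naive Bayes accuracy is highest."""
    gains = {f: info_gain([x[f] for x in X], y) for f in range(len(X[0]))}
    ranked = [f for f in sorted(gains, key=gains.get, reverse=True)
              if gains[f] > min_gain]
    best, best_acc = ranked[:1], -1.0
    for k in range(1, len(ranked) + 1):
        acc = loo_accuracy(X, y, ranked[:k])
        if acc > best_acc:  # strict comparison: ties keep the smaller subset
            best, best_acc = ranked[:k], acc
    return best

def knn_predict(train_X, train_y, row, feats, k=3):
    """Majority vote among the k nearest training rows under Hamming distance."""
    def dist(x):
        return sum(x[f] != row[f] for f in feats)
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i]))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy records: (tissue match, blood group, constant field).
X = [("match", "A", "x"), ("match", "B", "x"),
     ("match", "A", "x"), ("match", "B", "x"),
     ("mismatch", "A", "x"), ("mismatch", "B", "x"),
     ("mismatch", "A", "x"), ("mismatch", "B", "x")]
y = ["survived"] * 4 + ["failed"] * 4

selected = hybrid_select(X, y)
prediction = knn_predict(X, y, ("match", "B", "x"), selected, k=3)
```

In this toy data only the first field carries information about the label, so the filter step discards the other two fields before the naïve Bayes step runs. On real clinical data the features would first need the cleaning and discretization that the data preparation stage implies.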

Cite this article
Atallah, D.M., Badawy, M., El-Sayed, A. et al. Predicting kidney transplantation outcome based on hybrid feature selection and KNN classifier. Multimed Tools Appl 78, 20383–20407 (2019). https://doi.org/10.1007/s11042-019-7370-5
Keywords
- Kidney transplantation
- Feature selection
- Information gain
- Naïve Bayes
- K-nearest neighbor