Abstract
The incidence of total hip arthroplasty (THA) will rise with an aging population and improvements in surgery, and a feasible alternative in health care can effectively improve medical quality. A hip joint is replaced to relieve severe arthritis pain that limits a patient's activities. Hip replacement is usually performed in people aged 60 and older; younger patients who have a hip replaced may put extra stress on the artificial hip. After collecting data from the National Health Insurance database, this paper applies rigorous screening by experts to reduce the data dimension. The proposed model also adopts a sampling method to address the class imbalance problem and uses rough set theory to find the core attributes (7 selected features). The rules extracted from these core attributes give a comprehensive account of medical quality. For verification, a THA dataset is taken as a case study, and the performance of the proposed model is compared with other data-mining methods under various criteria. The results show that the proposed model is efficient and outperforms the listed methods under the listed criteria, and that hybrid sampling increases the true-positive rate of the minority class. The generated decision rules and core attributes also carry managerial implications, and the results can provide stakeholders with useful THA information to support decision making.
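The two techniques named above, hybrid sampling and rough-set core attributes, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the attribute names (`age_group`, `hospital_level`, `stay`), the `outcome` labels, the toy rows, and the function names are all hypothetical, and the core is computed by the textbook positive-region test (an attribute is in the core if removing it shrinks the set of consistently classified rows).

```python
import random
from collections import defaultdict


def hybrid_sample(rows, label_key, minority, target, seed=0):
    """Oversample the minority class (with replacement) and undersample
    the majority class (without replacement) toward `target` rows each.
    Assumes at least one minority row exists."""
    rng = random.Random(seed)
    minority_rows = [r for r in rows if r[label_key] == minority]
    majority_rows = [r for r in rows if r[label_key] != minority]
    # Duplicate random minority rows until the class reaches `target`.
    shortfall = max(0, target - len(minority_rows))
    extra = [rng.choice(minority_rows) for _ in range(shortfall)]
    # Keep at most `target` randomly chosen majority rows.
    kept = rng.sample(majority_rows, min(target, len(majority_rows)))
    return minority_rows + extra + kept


def positive_region_size(rows, attrs, decision):
    """Count rows whose equivalence class under `attrs` maps to a
    single decision value (the rough-set positive region)."""
    classes = defaultdict(list)
    for r in rows:
        classes[tuple(r[a] for a in attrs)].append(r[decision])
    return sum(len(d) for d in classes.values() if len(set(d)) == 1)


def core_attributes(rows, attrs, decision):
    """Return attributes whose removal shrinks the positive region."""
    full = positive_region_size(rows, attrs, decision)
    core = []
    for a in attrs:
        rest = [b for b in attrs if b != a]
        if positive_region_size(rows, rest, decision) < full:
            core.append(a)
    return core


# Hypothetical toy decision table; values are illustrative only.
ATTRS = ["age_group", "hospital_level", "stay"]
TABLE = [
    {"age_group": "old", "hospital_level": "center", "stay": "long", "outcome": "poor"},
    {"age_group": "old", "hospital_level": "center", "stay": "short", "outcome": "good"},
    {"age_group": "young", "hospital_level": "center", "stay": "short", "outcome": "good"},
    {"age_group": "young", "hospital_level": "regional", "stay": "long", "outcome": "good"},
    {"age_group": "old", "hospital_level": "regional", "stay": "long", "outcome": "poor"},
    {"age_group": "young", "hospital_level": "center", "stay": "long", "outcome": "good"},
]

balanced = hybrid_sample(TABLE, "outcome", minority="poor", target=3)
core = core_attributes(TABLE, ATTRS, "outcome")
print(core)  # ['age_group', 'stay']
```

In this toy table, dropping `hospital_level` leaves every row consistently classified, so it is not a core attribute, while dropping `age_group` or `stay` creates conflicting rows. A real pipeline would run the sampling step on training data only and would need a discretization step for continuous attributes before applying the rough-set test.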
Cite this article
Wei, MH., Cheng, CH., Huang, CS. et al. Discovering medical quality of total hip arthroplasty by rough set classifier with imbalanced class. Qual Quant 47, 1761–1779 (2013). https://doi.org/10.1007/s11135-011-9624-9