Discovering medical quality of total hip arthroplasty by rough set classifier with imbalanced class

Quality & Quantity

Abstract

The incidence of total hip arthroplasty (THA) will rise with an aging population and improvements in surgery, so a feasible alternative in health care could effectively increase medical quality. A hip joint is replaced to relieve severe arthritis pain that limits the patient's activities. Hip joint replacement is usually performed in people aged 60 and older; younger people who have a hip replaced may put extra stress on the artificial hip. After collecting data from the National Health Insurance database, this paper applies a rigorous expert-driven data screening step to reduce the data dimension. The proposed model also adopts an imbalanced sampling method to address the class imbalance problem, and uses rough set theory to find the core attributes (7 selected features). Based on the core attributes, the extracted rules can serve as a comprehensive set of rules for medical quality. For verification, a THA dataset is taken as a case study, and the performance of the proposed model is compared with other data-mining methods under various criteria. The proposed model outperforms the listed methods under the listed criteria, and hybrid sampling increases the true-positive rate for the minority class. The results show that the proposed model is efficient. The generated decision rules and core attributes could also yield further managerial implications. Moreover, the results can provide stakeholders with useful THA information to support decision making.
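To make the two technical steps named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a simple random hybrid-sampling scheme (under-sample the larger class, over-sample the smaller one) and computes the rough set core in Pawlak's sense: the attributes whose removal shrinks the positive region of the decision table. The attribute names (`age`, `stay`, `infection`), the label `quality`, the toy records, and all function names are hypothetical and are not taken from the paper or from the National Health Insurance data.

```python
import random
from collections import defaultdict


def hybrid_resample(rows, label_key, target_size, seed=0):
    """Bring every class to target_size rows (assumed scheme, not the paper's)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    balanced = []
    for members in by_class.values():
        if len(members) >= target_size:
            # Under-sample: draw without replacement.
            balanced.extend(rng.sample(members, target_size))
        else:
            # Over-sample: keep all rows, then draw extras with replacement.
            extra = rng.choices(members, k=target_size - len(members))
            balanced.extend(members + extra)
    return balanced


def positive_region_size(rows, attrs, label_key):
    """Count rows whose equivalence class under attrs has a single label."""
    labels = defaultdict(set)
    sizes = defaultdict(int)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        labels[key].add(row[label_key])
        sizes[key] += 1
    return sum(n for key, n in sizes.items() if len(labels[key]) == 1)


def core_attributes(rows, attrs, label_key):
    """Attributes whose removal reduces the positive region (the core)."""
    full = positive_region_size(rows, attrs, label_key)
    return [
        a for a in attrs
        if positive_region_size(rows, [b for b in attrs if b != a], label_key) < full
    ]


# Hypothetical toy decision table: "poor" is the minority class.
TOY = [
    {"age": "old", "stay": "long", "infection": "yes", "quality": "poor"},
    {"age": "old", "stay": "long", "infection": "no", "quality": "good"},
    {"age": "young", "stay": "short", "infection": "no", "quality": "good"},
    {"age": "young", "stay": "long", "infection": "no", "quality": "good"},
    {"age": "old", "stay": "short", "infection": "no", "quality": "good"},
    {"age": "young", "stay": "short", "infection": "yes", "quality": "poor"},
]
ATTRS = ["age", "stay", "infection"]

balanced = hybrid_resample(TOY, "quality", target_size=3)
core = core_attributes(balanced, ATTRS, "quality")
print(core)
```

In this toy table the label depends only on `infection`, so removing that attribute makes some equivalence classes mix both labels, while removing `age` or `stay` does not. Duplicated rows from over-sampling do not change which equivalence classes are pure, which is why the core can be computed on the balanced table here.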

Author information

Corresponding author

Correspondence to Ching-Hsue Cheng.

About this article

Cite this article

Wei, MH., Cheng, CH., Huang, CS. et al. Discovering medical quality of total hip arthroplasty by rough set classifier with imbalanced class. Qual Quant 47, 1761–1779 (2013). https://doi.org/10.1007/s11135-011-9624-9

  • DOI: https://doi.org/10.1007/s11135-011-9624-9
