Novel Semantic Discretization Technique for Type-2 Diabetes Classification Model

  • Conference paper
Innovations in Computer Science and Engineering

Abstract

Semantic discretization, a relatively new concept, is a discretization technique that uses the semantics of the data along with its values. The semantics of the data refers to the domain knowledge inherent in the data and is derived from the data values themselves. The objective and context of the study also contribute significantly to identifying the semantics of the data. Because no explicit ontology is associated with the data in semantic discretization, identifying, interpreting, and exploiting the semantics of the data is a challenging task. This paper presents a novel algorithm for semantic discretization in which machine learning techniques such as classification and association rule mining are used to derive semantic knowledge, which is then used for discretization. To show the effectiveness of the proposed algorithm, we applied it to a diabetes dataset. Experimental results show a 2–15% improvement in classification accuracy on the semantically discretized dataset compared with the original and statistically discretized datasets.
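The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general contrast it draws: cut points chosen from value ranges alone (statistical) versus cut points derived from knowledge learned from the data. A one-threshold decision stump stands in here for the classification step; it is not the authors' method, and the function names and the toy values are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence


def equal_width_bins(values: Sequence[float], k: int) -> List[int]:
    """Statistical baseline: split the observed range into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # avoid a zero width when all values are equal
    return [min(int((v - lo) / width), k - 1) for v in values]


def best_cut_point(values: Sequence[float], labels: Sequence[int]) -> float:
    """Return the threshold that misclassifies the fewest samples.

    Assumption: a decision stump stands in for the 'classification' step
    that supplies knowledge about the data.
    """
    pairs = sorted(zip(values, labels))
    best_cut, best_errors = pairs[0][0], len(pairs) + 1
    for (a, _), (b, _) in zip(pairs, pairs[1:]):
        if a == b:
            continue  # no threshold can separate identical values
        cut = (a + b) / 2
        # Predict class 1 above the cut and class 0 at or below it.
        errors = sum((v > cut) != bool(y) for v, y in pairs)
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut


def bins_from_cuts(values: Sequence[float], cuts: Sequence[float]) -> List[int]:
    """Map each value to the index of the interval defined by sorted cut points."""
    ordered = sorted(cuts)
    return [bisect_left(ordered, v) for v in values]


# Invented toy attribute and binary class labels, for illustration only.
toy_values = [82, 90, 95, 99, 104, 118, 126, 140, 155, 180]
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

statistical = equal_width_bins(toy_values, k=2)
learned_cut = best_cut_point(toy_values, toy_labels)
knowledge_based = bins_from_cuts(toy_values, [learned_cut])

print("equal-width bins:    ", statistical)
print("learned cut point:   ", learned_cut)
print("knowledge-based bins:", knowledge_based)
```

On this toy data the equal-width split ignores the class labels, while the learned cut point places the bin boundary where the classes separate. A fuller treatment would also need the association rule mining step mentioned in the abstract, which this sketch omits.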

Author information

Corresponding author

Correspondence to Omprakash Chandrakar.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chandrakar, O., Saini, J.R., Bhatti, D.G. (2019). Novel Semantic Discretization Technique for Type-2 Diabetes Classification Model. In: Saini, H., Sayal, R., Govardhan, A., Buyya, R. (eds) Innovations in Computer Science and Engineering. Lecture Notes in Networks and Systems, vol 74. Springer, Singapore. https://doi.org/10.1007/978-981-13-7082-3_17

  • DOI: https://doi.org/10.1007/978-981-13-7082-3_17

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-7081-6

  • Online ISBN: 978-981-13-7082-3

  • eBook Packages: Engineering, Engineering (R0)
