Hierarchical Clustering Support Vector Machines for Classifying Type-2 Diabetes Patients
Using a large national health database, we propose an enhanced SVM-based model called the Hierarchical Clustering Support Vector Machine (HCSVM), which uses multiple levels of clusters to classify patients diagnosed with type-2 diabetes. Multiple HCSVMs are trained for clusters at different levels of the hierarchy. Some clusters at certain levels of the hierarchy capture more separable sample spaces than others. As a result, HCSVMs at different levels may develop different classification capabilities. Since the locations of the superior SVMs are data dependent, the HCSVM model in this study uses an adaptive strategy to select the most suitable HCSVM for classifying each testing sample. This model addresses the large-data-set problem inherent in the traditional single SVM model because the entire data set is partitioned into smaller and more homogeneous clusters. Other approaches also use clustering and multiple SVMs to handle large data sets. These approaches typically employ only one level of clusters. However, a single level of clusters may not provide an optimal partition of the sample space for SVM training. In contrast, HCSVMs use the multiple partitions available in a multilevel tree to capture a more separable sample space for SVM training. Compared with the traditional single SVM model and the one-level multiple-SVM model, the HCSVM model markedly improves the accuracy of classifying testing samples.
Keywords: Hierarchical Clustering, Support Vector Machines, Classification, Clustering Algorithm, Type-2 Diabetes
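The idea summarized in the abstract (train one SVM per cluster at every level of a cluster tree, then pick the most suitable one for each testing sample) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: bisecting 2-means stands in for the clustering algorithm, a linear SVM trained by batch subgradient descent stands in for the SVM solver, held-out accuracy inside each cluster stands in for the adaptive selection criterion, and the synthetic XOR-like data and all function names (`build_tree`, `classify`, and so on) are hypothetical.

```python
import random

def centroid(X):
    return [sum(col) / len(X) for col in zip(*X)]

def dist2(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def train_linear_svm(X, y, lam=0.01, lr=0.1, steps=200):
    """Batch subgradient descent on the L2-regularised hinge loss (labels +1/-1)."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * d, 0.0
        for x, t in zip(X, y):
            if t * (sum(a * v for a, v in zip(w, x)) + b) < 1:  # margin violated
                gw = [g + t * v for g, v in zip(gw, x)]
                gb += t
        w = [a - lr * (lam * a - g / len(X)) for a, g in zip(w, gw)]
        b += lr * gb / len(X)
    return w, b

def predict(node, x):
    # Each node's SVM works on features centred at the node's centroid.
    w, b = node["svm"]
    z = sum(a * (v - c) for a, v, c in zip(w, x, node["centroid"])) + b
    return 1 if z >= 0 else -1

def accuracy(node, X, y):
    return sum(predict(node, x) == t for x, t in zip(X, y)) / len(X)

def bisect(X, iters=10):
    """2-means with deterministic farthest-point initialisation (an assumption)."""
    m = centroid(X)
    a = max(X, key=lambda x: dist2(x, m))
    b = max(X, key=lambda x: dist2(x, a))
    centers = [a, b]
    for _ in range(iters):
        groups = [[], []]
        for i, x in enumerate(X):
            groups[dist2(x, centers[1]) < dist2(x, centers[0])].append(i)
        if not groups[0] or not groups[1]:
            break
        centers = [centroid([X[i] for i in g]) for g in groups]
    return groups

def build_tree(X, y, depth=0, max_depth=2, min_size=20):
    """Train one SVM per cluster; even-indexed samples train, odd-indexed validate."""
    c = centroid(X)
    centred = [[v - m for v, m in zip(x, c)] for x in X]
    node = {"centroid": c, "children": []}
    node["svm"] = train_linear_svm(centred[0::2], y[0::2])
    node["val_acc"] = accuracy(node, X[1::2], y[1::2])
    if depth < max_depth and len(X) >= 2 * min_size:
        groups = bisect(X)
        if min(len(g) for g in groups) >= min_size:
            node["children"] = [
                build_tree([X[i] for i in g], [y[i] for i in g],
                           depth + 1, max_depth, min_size)
                for g in groups
            ]
    return node

def classify(root, x):
    """Route x down the tree by nearest centroid and use the SVM on that
    path with the best held-out accuracy (assumed selection rule)."""
    node, best = root, root
    while node["children"]:
        node = min(node["children"], key=lambda ch: dist2(x, ch["centroid"]))
        if node["val_acc"] > best["val_acc"]:
            best = node
    return predict(best, x)

def make_data(n, seed):
    """Synthetic XOR-like data: no single linear boundary separates the classes."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        side, up = rng.choice([-1, 1]), rng.choice([-1, 1])
        X.append([5.0 * side + rng.uniform(-1, 1), up * rng.uniform(0.5, 1.0)])
        y.append(side * up)
    return X, y

train_X, train_y = make_data(200, seed=1)
test_X, test_y = make_data(200, seed=2)
tree = build_tree(train_X, train_y)
single_acc = accuracy(tree, test_X, test_y)  # root SVM only, i.e. one global SVM
hcsvm_acc = sum(classify(tree, x) == t for x, t in zip(test_X, test_y)) / len(test_X)
print(f"single SVM: {single_acc:.2f}  hierarchical: {hcsvm_acc:.2f}")
```

On this toy data the root SVM plays the role of the traditional single SVM, while the cluster-level SVMs see sample spaces that are linearly separable. This illustrates why a partition at the right level of the hierarchy can help. It does not reproduce the paper's experiments or its diabetes data.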