Hierarchical Clustering Support Vector Machines for Classifying Type-2 Diabetes Patients

Zhong, Wei; Chow, Rick; Stolz, Richard; He, Jieyue; Dowell, Marsha

doi:10.1007/978-3-540-79450-9_35

Wei Zhong¹,
Rick Chow¹,
Richard Stolz³,
Jieyue He⁴ &
…
Marsha Dowell²

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 4983))

Included in the following conference series:

International Symposium on Bioinformatics Research and Applications

990 Accesses
2 Citations

Abstract

Using a large national health database, we propose an enhanced SVM-based model called Hierarchical Clustering Support Vector Machine (HCSVM) that utilizes multiple levels of clusters to classify patients diagnosed with type-2 diabetes. Multiple HCSVMs are trained for clusters at different levels of the hierarchy. Some clusters at certain levels of the hierarchy capture more separable sample spaces than the others. As a result, HCSVMs at different levels may develop different classification capabilities. Since the locations of the superior SVMs are data dependent, the HCSVM model in this study takes advantage of an adaptive strategy to select the most suitable HCSVM for classifying the testing samples. This model solves the large data set problem inherent with the traditional single SVM model because the entire data set is partitioned into smaller and more homogenous clusters. Other approaches also use clustering and multiple SVM to solve the problem of large datasets. These approaches typical employed only one level of clusters. However, a single level of clusters may not provide an optimal partition of the sample space for SVM trainings. On the contrary, HCSVMs utilize multiple partitions available in a multilevel tree to capture a more separable sample space for SVM trainings. Compared with the traditional single SVM model and one-level multiple SVMs model, the HCSVM Model markedly improves the accuracy for classifying testing samples.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Agarwal, D.K.: Shrinkage estimator generalizations of proximal support vector machines. In: Proc. of the 8th ACM SIGKDD international conference of knowledge Discovery and data mining, Edmonton, Canada (2002)
Google Scholar
Award, M., Khan, L., Bastani, F., Yen, I.: An Effective Support Vector Machines(SVMs) Performance Using Hierarchical Clustering. In: Proc. of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004) (2004)
Google Scholar
Balcazar, J.L., Dai, Y., Watanabe, O.: Provably Fast Training Algorithms for Support Vector Machines. In: Proc. of the 1st IEEE International Conference on Data mining, pp. 43–50. IEEE Computer Society, Los Alamitos (2001)
Chapter Google Scholar
Breault, J.L., Goodall, C.R., Fos, P.J.: Data Mining a Diabetic Data Warehouse. Artificial Intelligence in Medicine 26, 37–54 (2002)
Article Google Scholar
Chang, C.C., Lin, C.J.: Training nu-support vector classifiers: Theory and algorithms. Neural Computations 13, 2119–2147 (2001)
Article MATH Google Scholar
Daniael, B., Cao, D.: Training Support Vector Machines Using Adaptive Clustering. In: Proc. of SIAM International Conference on Data Mining, Lake Buena Vista, FL, USA (2004)
Google Scholar
Dowell, M.A., Rozell, B., Roth, D., Delugach, H., Chaloux, P., Dowell, J.: Economic and Clinical Disparities in Hospitalized Patients with Type-2 Diabetes. Journal of Nursing Scholarship 36, 66–72 (2004)
Article Google Scholar
Osuna, E., Freund, R., Girosi, F.: An improved training algorithm for support vector machines. In: Proc. Of IEEE Workshop on Neural Networks for Signal Processing, pp. 276–285 (1997)
Google Scholar
Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: A review. ACM Computing Surveys 31, 264–323 (1999)
Article Google Scholar
Platt, J.: Fast training of support vector machines using sequential minimal optimization. In: advances in Kerenel Methods-Support Vector Learning, pp. 185–208 (1999)
Google Scholar
Scholkopf, B., Burges, C., Smola, A. (eds.): Advances in Kernel Methods-Support Vec-tor Learning. MIT Press, Cambridge, MA (1999)
Google Scholar
US Department of Health and Human Services, Centers for Disease Control and Prevention: Prevalence of diabetes and impaired fasting glucose in adults-United States 1999–2000, Morbidity and Mortality Weekly Report 52, 833–835 (2003)
Google Scholar
Valentini, G., Dietterich, T.G.: Low Bias Bagged Support vector Machines. In: Proc. of the 20th International Conference on Machine Learning ICML 2003, Washington D.C. USA, pp. 752–759 (2003)
Google Scholar
Vapnik, V.: Statistical Learning Theory. John Wiley & Sons, Inc., New York (1998)
MATH Google Scholar
Vavasis, S.A.: Nonlinear Optimization: Complexity Issues. Oxford Science, New York (1991)
MATH Google Scholar
Yao, Y.Y.: Perspectives of Granular Computing. In: IEEE Conference on Granular Computing (to appear, 2005)
Google Scholar
Yu, H., Yang, J., Han, J.: Classifying Large Data sets Using SVMs with Hierarchical Clusters. In: Proc. Of the 9th ACM SIGKDD 2003 (2003)
Google Scholar
Zagrovic, B., Pande, V.S.: How does averaging affect protein structure comparison on the ensemble level? Biophysical Journal 87, 2240–2246 (2004)
Article Google Scholar
Zhong, W., He, J., Harrison, R., Tai, P.C., Pan, Y.: Clustering Support Vector Machines for Protein Local Structure Prediction. Expert Systems With Applications 32, 518–526 (2007)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Divison of Math and Computer Science,
Wei Zhong & Rick Chow
School of Nursing,
Marsha Dowell
School of Business Administration and Economics, University of South Carolina Upstate, Spartanburg, SC 29303, USA
Richard Stolz
School of Computer Science and Engineering, Southeast University, Nanjing, 210096, China
Jieyue He

Authors

Wei Zhong
View author publications
You can also search for this author in PubMed Google Scholar
Rick Chow
View author publications
You can also search for this author in PubMed Google Scholar
Richard Stolz
View author publications
You can also search for this author in PubMed Google Scholar
Jieyue He
View author publications
You can also search for this author in PubMed Google Scholar
Marsha Dowell
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Ion Măndoiu Raj Sunderraman Alexander Zelikovsky

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zhong, W., Chow, R., Stolz, R., He, J., Dowell, M. (2008). Hierarchical Clustering Support Vector Machines for Classifying Type-2 Diabetes Patients. In: Măndoiu, I., Sunderraman, R., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2008. Lecture Notes in Computer Science(), vol 4983. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79450-9_35

Download citation

DOI: https://doi.org/10.1007/978-3-540-79450-9_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79449-3
Online ISBN: 978-3-540-79450-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics