Abstract
Single nucleotide polymorphisms (SNPs) are central to many genome association studies. Selecting a subset of SNPs that is sufficiently informative, yet small enough to reduce the genotyping overhead, is an important step towards disease-gene association. In this work, a Random Forest (RF) approach to informative SNP selection in Familial Combined Hyperlipidemia (FCH) is proposed. FCH is the most common form of familial hyperlipidemia. Affected patients have elevated levels of plasma triglycerides and/or total cholesterol and show an increased risk of premature coronary heart disease. To identify susceptibility markers for FCH, we analyze 21 SNPs in ten genes associated with high cardiovascular risk. RF appears to be a useful technique for identifying gene polymorphisms involved in FCH: the identified SNPs confirm some variants in a couple of genes as genetic markers of FCH, as reported in various studies in the scientific literature, and lead us to report for the first time a further gene association with FCH. This result is promising and encourages further investigation of the role of the identified gene in the development of the FCH phenotype.
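The abstract does not say how the forest ranks SNPs, so the following is only a minimal sketch of one common choice: Gini-based importance (mean decrease in impurity) accumulated over bootstrapped trees that consider a random subset of SNPs at each node. Everything in it is an assumption made for illustration: the function names, the 0/1/2 genotype coding, the tree depth and node-size limits, and the synthetic cohort, which has 21 SNPs only to mirror the number analyzed in the chapter. It uses no real FCH data and does not reproduce the authors' implementation.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, rows, features):
    """Best (gain, feature, threshold) among candidate SNPs; genotypes coded 0/1/2."""
    parent = gini([y[i] for i in rows])
    best = (0.0, None, None)
    for f in features:
        for t in (0, 1):  # split rule: genotype <= t
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            if not left or not right:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if parent - child > best[0]:
                best = (parent - child, f, t)
    return best


def grow(X, y, rows, mtry, rng, importance, depth=0, max_depth=5, min_rows=10):
    """Grow one tree, adding each split's size-weighted Gini decrease to importance."""
    if depth >= max_depth or len(rows) < min_rows:
        return
    features = rng.sample(range(len(X[0])), mtry)  # random SNP subset at this node
    gain, f, t = best_split(X, y, rows, features)
    if f is None:  # no candidate split reduces impurity
        return
    importance[f] += gain * len(rows)
    left = [i for i in rows if X[i][f] <= t]
    right = [i for i in rows if X[i][f] > t]
    grow(X, y, left, mtry, rng, importance, depth + 1, max_depth, min_rows)
    grow(X, y, right, mtry, rng, importance, depth + 1, max_depth, min_rows)


def forest_importance(X, y, n_trees=100, seed=0):
    """Normalized Gini importance per SNP, summed over bootstrapped trees."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    mtry = max(1, int(p ** 0.5))  # common default: sqrt(number of features)
    importance = [0.0] * p
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of subjects
        grow(X, y, rows, mtry, rng, importance)
    total = sum(importance)
    return [v / total for v in importance] if total else importance


def make_synthetic_cohort(n_subjects=300, n_snps=21, causal_snp=3, seed=1):
    """Hypothetical data: random genotypes, disease risk rising with one SNP."""
    rng = random.Random(seed)
    X = [[rng.randint(0, 2) for _ in range(n_snps)] for _ in range(n_subjects)]
    y = [int(rng.random() < 0.2 + 0.3 * row[causal_snp]) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_cohort()
    imp = forest_importance(X, y)
    ranking = sorted(range(len(imp)), key=lambda j: imp[j], reverse=True)
    print("Top 5 SNP indices by importance:", ranking[:5])
```

The trees here are grown only to accumulate importance and are not stored for prediction. In practice an established implementation would be preferable to this hand-rolled one, and a permutation-based importance measure could be substituted for the Gini-based one.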
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Staiano, A. et al. (2013). Investigation of Single Nucleotide Polymorphisms Associated to Familial Combined Hyperlipidemia with Random Forests. In: Apolloni, B., Bassis, S., Esposito, A., Morabito, F. (eds) Neural Nets and Surroundings. Smart Innovation, Systems and Technologies, vol 19. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35467-0_18
DOI: https://doi.org/10.1007/978-3-642-35467-0_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35466-3
Online ISBN: 978-3-642-35467-0
eBook Packages: Engineering, Engineering (R0)