Abstract
The use of DNA microarrays in genetics is growing rapidly and generates huge amounts of data. An estimated 5-20% of measurements fail, leaving missing values in the data destined for further analysis. Because missing values reduce the reliability of subsequent microarray analysis, effective and efficient methods for estimating them are needed.
This report presents a method for estimating missing values in SNP microarrays using k-Nearest Neighbors (kNN) among similar individuals. The use of preliminary imputation is proposed and discussed. It is shown that introducing multiple passes of kNN improves the quality of missing value estimation.
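The idea summarized above can be illustrated with a minimal sketch. It is not the author's implementation, and it rests on several assumptions that the abstract does not state: genotypes are encoded as 0, 1, 2 with -1 marking a failed call; the preliminary imputation fills each gap with the most common genotype of that SNP; similarity between individuals is the Hamming distance; and the estimate is a majority vote among the k nearest individuals. All function names are hypothetical.

```python
from collections import Counter

MISSING = -1  # assumed marker for a failed genotype call


def column_mode(data, col):
    """Most common observed genotype in one SNP column."""
    observed = [row[col] for row in data if row[col] != MISSING]
    # Assumption: fall back to 0 if every call in the column is missing.
    return Counter(observed).most_common(1)[0][0] if observed else 0


def preliminary_impute(data):
    """Fill each missing call with the per-SNP mode so distances are defined."""
    modes = [column_mode(data, c) for c in range(len(data[0]))]
    return [[modes[c] if v == MISSING else v for c, v in enumerate(row)]
            for row in data]


def hamming(a, b):
    """Number of SNP positions at which two individuals differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_pass(data, filled, k):
    """Re-estimate originally missing calls by majority vote of k neighbours."""
    result = [row[:] for row in filled]
    for i, row in enumerate(data):
        missing_cols = [c for c, v in enumerate(row) if v == MISSING]
        if not missing_cols:
            continue
        # Rank all other individuals by distance on the current estimates.
        others = [j for j in range(len(data)) if j != i]
        others.sort(key=lambda j: hamming(filled[i], filled[j]))
        neighbours = others[:k]
        for c in missing_cols:
            votes = Counter(filled[j][c] for j in neighbours)
            result[i][c] = votes.most_common(1)[0][0]
    return result


def impute(data, k=3, passes=2):
    """Preliminary imputation followed by one or more kNN passes."""
    filled = preliminary_impute(data)
    for _ in range(passes):
        filled = knn_pass(data, filled, k)
    return filled


if __name__ == "__main__":
    # Toy data: rows are individuals, columns are SNPs.
    snps = [
        [0, 0, 1, 2],
        [0, 0, 1, 2],
        [0, 0, 1, MISSING],
        [2, 2, 0, 0],
        [2, 2, 0, 0],
        [2, 2, MISSING, 0],
    ]
    for row in impute(snps, k=2, passes=2):
        print(row)
```

In this toy data the per-SNP mode alone fills the gap in the last row with 1, whereas the vote among the two most similar individuals replaces it with 0, matching the rest of that group. Each later pass recomputes neighbours from the previous pass's estimates rather than from the preliminary fill.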
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Podsiadly, P. (2014). Estimation of Missing Values in SNP Array. In: Ali, M., Pan, JS., Chen, SM., Horng, MF. (eds) Modern Advances in Applied Intelligence. IEA/AIE 2014. Lecture Notes in Computer Science, vol 8482. Springer, Cham. https://doi.org/10.1007/978-3-319-07467-2_45
DOI: https://doi.org/10.1007/978-3-319-07467-2_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07466-5
Online ISBN: 978-3-319-07467-2
eBook Packages: Computer Science, Computer Science (R0)