Noise Correction in Genomic Data

Teng, Choh Man

doi:10.1007/978-3-540-45080-1_8

Choh Man Teng⁷

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2690))

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

972 Accesses
2 Citations

Abstract

Osteogenesis Imperfecta (OI) is a genetic collagenous disease caused by mutations in one or both of the genes COLIA1 and COLIA2. There are at least four known phenotypes of OI, of which type II is the severest and often lethal. We applied a noise correction mechanism called polishing to a data set of amino acid sequences and associated information of point mutations of COLIA1. Polishing makes use of the inter-relationship between attribute and class values in the data set to identify and selectively correct components that are noisy. Preliminary results suggest that polishing is a viable mechanism for improving data quality, resulting in a more accurate classification of the lethal OI phenotype.

This work was supported by NASA NCC2-1239 and ONR N00014-03-1-0516.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Brodley, C.E., Friedl, M.A.: Identifying mislabeled training data. Journal of Artificial Intelligence Research 11, 131–167 (1999)
MATH Google Scholar
Clark, P., Niblett, T.: The CN2 induction algorithm. Machine Learning 3(4), 261–283 (1989)
Google Scholar
Domingos, P., Pazzani, M.: Beyond independence: Conditions for the optimality of the simple Bayesian classifier. In: Proceedings of the Thirteenth International Conference on Machine Learning, pp. 105–112 (1996)
Google Scholar
Drastal, G.: Informed pruning in constructive induction. In: Proceedings of the Eighth International Workshop on Machine Learning, pp. 132–136 (1991)
Google Scholar
Gamberger, D., Lavrač, N., Džeroski, S.: Noise elimination in inductive concept learning: A case study in medical diagnosis. In: Proceedings of the Seventh International Workshop on Algorithmic Learning Theory, pp. 199–212 (1996)
Google Scholar
Hunter, L., Klein, T.E.: Finding relevant biomolecular features. In: Proceedings of the International Conference on Intelligent Systems for Molecular Biology, pp. 190–197 (1993)
Google Scholar
John, G.H.: Robust decision trees: Removing outliers from databases. In: Proceedings of the First International Conference on Knowledge Discovery and Data Mining, pp. 174–179 (1995)
Google Scholar
Klein, T.E., Wong, E.: Neural networks applied to the collagenous disease osteogenesis imperfecta. In: Proceedings of the Hawaii International Conference on System Sciences, vol. I, pp. 697–705 (1992)
Google Scholar
Kononenko, I.: Semi-naive Bayesian classifier. In: Proceedings of the Sixth European Working Session on Learning, pp. 206–219 (1991)
Google Scholar
Langley, P., Iba, W., Thompson, K.: An analysis of Bayesian classifiers. In: Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 223–228 (1992)
Google Scholar
Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
MATH Google Scholar
Mooney, S.D., Huang, C.C., Kollman, P.A., Klein, T.E.: Computed free energy differences between point mutations in a collagenlike peptide. Biopolymers 58, 347–353 (2001)
Article Google Scholar
Ross Quinlan, J.: Simplifying decision trees. International Journal of Man-Machine Studies 27(3), 221–234 (1987)
Article Google Scholar
Ross Quinlan, J.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
Google Scholar
Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. John Wiley & Sons, Chichester (1987)
Book MATH Google Scholar
Teng, C.M.: Correcting noisy data. In: Proceedings of the Sixteenth International Conference on Machine Learning, pp. 239–248 (1999)
Google Scholar
Teng, C.M.: Evaluating noise correction. In: Lecture Notes in Artificial Intelligence: Proceedings of the Sixth Pacific Rim International Conference on Artificial Intelligence, Springer, Heidelberg (2000)
Google Scholar
Teng, C.M.: A comparison of noise handling techniques. In: Proceedings of the Fourteenth International Florida Artificial Intelligence Research Society Conference, pp. 269–273 (2001)
Google Scholar

Download references

Author information

Authors and Affiliations

Institute for Human and Machine Cognition, University of West Florida, 40 South Alcaniz Street, Pensacola, FL, 32501, USA
Choh Man Teng

Authors

Choh Man Teng
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
Jiming Liu
Department of Computer Science, Hong Kong Baptist University, Hong Kong
Yiu-ming Cheung
School of Electrical and Electronic Engineering, University of Manchester, UK
Hujun Yin

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Teng, C.M. (2003). Noise Correction in Genomic Data. In: Liu, J., Cheung, Ym., Yin, H. (eds) Intelligent Data Engineering and Automated Learning. IDEAL 2003. Lecture Notes in Computer Science, vol 2690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45080-1_8

Download citation

DOI: https://doi.org/10.1007/978-3-540-45080-1_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40550-4
Online ISBN: 978-3-540-45080-1
eBook Packages: Springer Book Archive

Publish with us

Policies and ethics