Osteogenesis Imperfecta (OI) is a genetic collagenous disease caused by mutations in one or both of the genes COL1A1 and COL1A2. There are at least four known phenotypes of OI, of which type II is the most severe and often lethal. We applied a noise correction mechanism called polishing to a data set of amino acid sequences and associated information on point mutations of COL1A1. Polishing uses the interrelationship between attribute and class values in the data set to identify noisy components and selectively correct them. Preliminary results suggest that polishing is a viable mechanism for improving data quality, resulting in more accurate classification of the lethal OI phenotype.
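
To make the idea of correcting attribute values from their relationship to the other values concrete, here is a minimal sketch in Python. It is not the polishing procedure used in the paper: as a loudly simplified stand-in, it guesses each attribute value by a leave-one-out majority vote among records with the same class label, and replaces a value only when the vote is strong. The function names, the `min_agreement` threshold, and the toy records (with made-up attribute values) are all assumptions introduced for illustration.

```python
from collections import Counter


def predict_attribute(rows, labels, i, j):
    """Leave-one-out guess for attribute j of record i.

    Assumption: a per-class majority vote stands in for whatever
    classifier a real polishing implementation would train.
    Returns (most common value, fraction of votes it received).
    """
    votes = Counter(
        row[j]
        for k, (row, label) in enumerate(zip(rows, labels))
        if k != i and label == labels[i]
    )
    if not votes:
        # No other record shares this class: keep the value as it is.
        return rows[i][j], 0.0
    value, count = votes.most_common(1)[0]
    return value, count / sum(votes.values())


def polish(rows, labels, min_agreement=0.75):
    """Return corrected copies of the records plus a log of the changes.

    A value is replaced only when it disagrees with the prediction and
    the prediction has at least `min_agreement` support (hypothetical
    threshold). The input records are left untouched.
    """
    polished = [list(row) for row in rows]
    changes = []
    for i, row in enumerate(rows):
        for j, actual in enumerate(row):
            predicted, agreement = predict_attribute(rows, labels, i, j)
            if predicted != actual and agreement >= min_agreement:
                polished[i][j] = predicted
                changes.append((i, j, actual, predicted))
    return polished, changes


# Toy records with invented attribute values, for illustration only.
rows = [
    ("gly", "ser", "near_c"),
    ("gly", "ser", "near_c"),
    ("gly", "ser", "near_c"),
    ("gly", "ser", "near_c"),
    ("gly", "ala", "near_c"),  # second attribute is the suspected noise
    ("gly", "ala", "near_n"),
    ("gly", "ala", "near_n"),
    ("gly", "ala", "near_n"),
    ("gly", "ala", "near_n"),
]
labels = ["lethal"] * 5 + ["nonlethal"] * 4

polished, changes = polish(rows, labels)
print(changes)
```

Each entry in `changes` records the record index, attribute index, original value, and replacement, so every correction can be reviewed before the polished data is used to train a classifier.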







Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Choh Man Teng (1)
  1. Institute for Human and Machine Cognition, University of West Florida, Pensacola, USA
