Design Issues and Comparison of Methods for Microarray-Based Classification

  • Chapter
Computational and Statistical Approaches to Genomics

Conclusion

Except in situations where the amount of data is large in comparison to the number of variables, classifier design and error estimation involve subtle issues. This is especially so in applications such as cancer classification, where there is no prior knowledge concerning the vector-label distributions involved. It is clearly prudent to try to achieve classification using small numbers of genes and rules of low complexity (low VC dimension), and to use cross-validation when it is not possible to obtain large independent samples for testing. Even when one uses a cross-validation method such as leave-one-out estimation, one is still confronted by the high variance of the estimator. In many applications, large samples are impossible to obtain owing to either cost or availability. Therefore, it is unlikely that a statistical approach alone will provide satisfactory results. Rather, one can use the results of classification analysis to discover gene sets that potentially provide good discrimination, and then focus attention on these. In the same vein, one can utilize the common engineering approach of integrating data with human knowledge to arrive at satisfactory systems.
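The point about leave-one-out estimation can be illustrated with a minimal sketch. The code below is not taken from the chapter: the nearest-mean rule, the synthetic Gaussian "gene" data, and the names `nearest_mean_predict`, `loo_error`, `draw_sample`, and `loo_spread` are all assumptions chosen for illustration. It draws many small samples, computes the leave-one-out error estimate on each, and reports how much that estimate varies from sample to sample.

```python
import random
import statistics


def nearest_mean_predict(train, x):
    """Classify x by the closer class mean (a simple, low-complexity rule)."""
    dist = {}
    for label in (0, 1):
        pts = [p for p, y in train if y == label]
        mean = [sum(col) / len(pts) for col in zip(*pts)]
        dist[label] = sum((a - b) ** 2 for a, b in zip(x, mean))
    return min(dist, key=dist.get)


def loo_error(sample):
    """Leave-one-out estimate: hold out each point once, train on the rest."""
    errors = 0
    for i, (x, y) in enumerate(sample):
        train = sample[:i] + sample[i + 1:]
        errors += nearest_mean_predict(train, x) != y
    return errors / len(sample)


def draw_sample(rng, n_per_class, n_genes=2, shift=1.0):
    """Synthetic data (assumed): class 1 is class 0 shifted in every gene.

    Requires n_per_class >= 2 so each class survives a held-out point.
    """
    return [([rng.gauss(label * shift, 1.0) for _ in range(n_genes)], label)
            for label in (0, 1) for _ in range(n_per_class)]


def loo_spread(n_per_class, repeats=200, seed=0):
    """Mean and standard deviation of the estimate over repeated samples."""
    rng = random.Random(seed)
    estimates = [loo_error(draw_sample(rng, n_per_class))
                 for _ in range(repeats)]
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (5, 10, 40):
        mean, sd = loo_spread(n)
        print(f"n per class = {n:2d}: mean LOO error = {mean:.3f}, sd = {sd:.3f}")
```

Under these assumptions, the spread of the estimate across repeated draws is typically much larger for the smallest sample sizes, which is the behavior the paragraph above warns about. A real study would replace the synthetic data with measured expression values and the nearest-mean rule with whatever classifier is under evaluation.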




Copyright information

© 2003 Kluwer Academic Publishers

About this chapter

Cite this chapter

Dougherty, E.R., Attoor, S.N. (2003). Design Issues and Comparison of Methods for Microarray-Based Classification. In: Zhang, W., Shmulevich, I. (eds) Computational and Statistical Approaches to Genomics. Springer, Boston, MA. https://doi.org/10.1007/0-306-47825-0_7

  • DOI: https://doi.org/10.1007/0-306-47825-0_7

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4020-7023-5

  • Online ISBN: 978-0-306-47825-3

  • eBook Packages: Springer Book Archive
