Conclusion

At a conceptual level, the task of concept learning can be divided into two subtasks: selecting a proper subset of features with which to describe the concept, and learning a hypothesis based on those features. This decomposition leads directly to a modular design of the learning algorithm, which allows explicit feature selection methods to be flexibly combined with model induction algorithms and sometimes yields powerful variants. Many recent works, however, take a more general view of feature selection as part of model selection and therefore integrate it more closely into the learning algorithm itself (e.g., the Bayesian feature selection methods). Feature selection for clustering remains a largely untouched problem, and there has been little theoretical characterization of the heuristic approaches described in this chapter. In summary, although no universal strategy can be prescribed, for the high-dimensional problems frequently encountered in microarray analysis, feature selection offers a promising suite of techniques for improving interpretability, performance, and computational efficiency in learning.
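
To make the modular view concrete, the following is a minimal sketch of the filter-style design: an explicit feature-scoring stage feeding an independent model-induction stage. The scikit-learn pipeline, the univariate F-test criterion, and the synthetic data are illustrative assumptions of this sketch, not methods prescribed by the chapter.

```python
# A minimal sketch of the modular "filter + induction" design: an
# explicit feature-scoring step (here a univariate F-test, one of many
# possible relevance criteria) combined with an off-the-shelf
# classifier. The library, score function, and data are assumptions
# for illustration only.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Simulate a high-dimensional, small-sample problem typical of
# microarray data: many features (genes), few of them informative.
X, y = make_classification(
    n_samples=100, n_features=2000, n_informative=20,
    n_redundant=0, random_state=0,
)

# Modular design: the feature-selection stage and the model-induction
# stage are independent components that can be swapped freely.
model = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=50)),  # filter stage
    ("learn", LogisticRegression(max_iter=1000)),         # induction stage
])

scores = cross_val_score(model, X, y, cv=5)
print(f"5-fold CV accuracy with top-50 features: {scores.mean():.3f}")
```

Because the two stages are decoupled, either component can be replaced (a different relevance score, a different classifier) without touching the other, which is exactly the flexibility the modular view affords.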

Copyright information

© 2003 Kluwer Academic Publishers

About this chapter

Xing, E.P. (2003). Feature Selection in Microarray Analysis. In: Berrar, D.P., Dubitzky, W., Granzow, M. (eds) A Practical Approach to Microarray Data Analysis. Springer, Boston, MA. https://doi.org/10.1007/0-306-47815-3_6

  • DOI: https://doi.org/10.1007/0-306-47815-3_6

  • Print ISBN: 978-1-4020-7260-4

  • Online ISBN: 978-0-306-47815-4