Informative Gene Selection and Tumor Classification by Null Space LDA for Microarray Data

  • Conference paper
Combinatorics, Algorithms, Probabilistic and Experimental Methodologies (ESCAPE 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4614)

Abstract

DNA microarray technology can monitor thousands of genes in a single experiment. One important application of such high-throughput gene expression data is classifying samples into known categories. Because the number of genes often exceeds the number of samples, classical classification methods perform poorly in this setting. Furthermore, many genes are irrelevant or redundant and reduce classification accuracy, so a gene selection step is necessary; classification using the selected genes is expected to be more accurate. This paper proposes a novel method for informative gene selection and sample classification on gene expression data. The method is based on Linear Discriminant Analysis (LDA) in the regular space and in the null space of the within-class scatter matrix. By recursively filtering out genes that have small coefficients in the optimal projection basis vectors, the method retains a progressively more informative subset of genes. Experiments on the leukemia and colon datasets show that the genes in this subset are much less correlated and have more discriminative power than those selected by classical methods.
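
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the usual null-space LDA construction (find directions in which the within-class scatter vanishes, then maximize between-class scatter among them) and covers only the null-space case, not LDA in the regular space. The function names, the scoring rule (sum of absolute coefficients over the basis vectors), and the `drop_frac` elimination schedule are all hypothetical choices made for illustration.

```python
import numpy as np


def null_space_lda(X, y, tol=1e-10):
    """Projection (genes x at most classes-1) from the null space of S_w.

    X is samples x genes, y holds class labels. Illustrative sketch only.
    """
    classes = np.unique(y)
    mean_all = X.mean(axis=0)

    # S_w = Hw.T @ Hw: samples centred on their own class mean.
    Hw = np.vstack([X[y == c] - X[y == c].mean(axis=0) for c in classes])
    # S_b = Hb.T @ Hb: class means centred on the global mean, weighted by size.
    Hb = np.vstack([np.sqrt((y == c).sum()) * (X[y == c].mean(axis=0) - mean_all)
                    for c in classes])

    # Step 1: orthonormal basis V for the range of the total scatter matrix.
    _, s, Vt = np.linalg.svd(X - mean_all, full_matrices=False)
    V = Vt[s > tol * s.max()].T                      # genes x r

    # Step 2: null space of S_w restricted to that range.
    _, sw, Vwt = np.linalg.svd(Hw @ V, full_matrices=True)
    rank_w = int((sw > tol * sw.max()).sum())
    N = Vwt[rank_w:].T                               # r x k
    if N.shape[1] == 0:
        raise ValueError("null space of S_w is empty; this sketch does not "
                         "cover the regular-space case")

    # Step 3: directions in the null space with non-zero between-class scatter.
    _, sb, Vbt = np.linalg.svd(Hb @ V @ N, full_matrices=False)
    Q = Vbt[sb > tol * sb.max()].T                   # k x q
    return V @ N @ Q                                 # genes x q


def recursive_gene_selection(X, y, n_keep, drop_frac=0.5):
    """Repeatedly drop the genes with the smallest projection coefficients.

    Assumed scoring rule: sum of |coefficient| over all basis vectors.
    Returns the sorted column indices of the retained genes.
    """
    keep = np.arange(X.shape[1])
    while keep.size > n_keep:
        W = null_space_lda(X[:, keep], y)
        score = np.abs(W).sum(axis=1)
        n_next = max(n_keep, int(keep.size * (1 - drop_frac)))
        best = np.argsort(score)[::-1][:n_next]      # highest scores first
        keep = np.sort(keep[best])
    return keep
```

Calling `recursive_gene_selection(X, y, n_keep=30)` on a samples-by-genes matrix returns the indices of the retained genes. Because every direction returned by `null_space_lda` lies in the null space of the within-class scatter matrix, all training samples of a class project onto the same point, which is the property that makes the construction usable when genes far outnumber samples. The sketch raises an error once so few genes remain that this null space is empty.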

Editor information

Bo Chen, Mike Paterson, Guochuan Zhang

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yue, F., Wang, K., Zuo, W. (2007). Informative Gene Selection and Tumor Classification by Null Space LDA for Microarray Data. In: Chen, B., Paterson, M., Zhang, G. (eds) Combinatorics, Algorithms, Probabilistic and Experimental Methodologies. ESCAPE 2007. Lecture Notes in Computer Science, vol 4614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74450-4_39

  • DOI: https://doi.org/10.1007/978-3-540-74450-4_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74449-8

  • Online ISBN: 978-3-540-74450-4

  • eBook Packages: Computer Science, Computer Science (R0)