Neighborhood Rough Set Model Based Gene Selection for Multi-subtype Tumor Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5226)

Abstract

Multi-subtype tumor diagnosis based on gene expression profiles is promising for clinical medicine. Consequently, a great deal of research on tumor classification from gene expression profiles has been carried out, in which various machine learning approaches have been applied to construct tumor classification models with the best possible classification performance. To achieve this goal, extracting features or finding informative genes with good classification ability is crucial. We propose a novel gene selection approach that uses the Kruskal-Wallis rank sum test to rank all genes and then applies an algorithm based on the neighborhood rough set model for gene reduction, yielding gene subsets with fewer genes and greater classification ability. Experiments on a small round blue cell tumor (SRBCT) dataset show that our approach achieves very high classification accuracy with only three or four genes, as evaluated by three classifiers: support vector machines, K-nearest neighbor, and the neighborhood classifier.
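
The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. The full text is not shown here, so the details are assumptions rather than the authors' exact algorithm: the sketch ranks genes by the Kruskal-Wallis H statistic (without tie correction), then runs a forward greedy search that maximizes a neighborhood dependency measure, a common formulation of neighborhood rough set reduction. The function names, the neighborhood radius `delta`, the cutoff `top_k`, and the toy data are all hypothetical, and expression values are assumed to be scaled to [0, 1].

```python
from collections import defaultdict
from math import dist


def kruskal_wallis_h(values, labels):
    """Kruskal-Wallis H statistic for one gene (no tie correction)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a block of tied values so they share an average rank.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sums, counts = defaultdict(float), defaultdict(int)
    for r, c in zip(ranks, labels):
        rank_sums[c] += r
        counts[c] += 1
    between = sum(rank_sums[c] ** 2 / counts[c] for c in counts)
    return 12 / (n * (n + 1)) * between - 3 * (n + 1)


def dependency(X, y, genes, delta):
    """Fraction of samples whose delta-neighborhood (Euclidean distance in
    the subspace spanned by `genes`) contains only samples of their class."""
    pts = [[row[g] for g in genes] for row in X]
    pure = 0
    for i, p in enumerate(pts):
        if all(y[j] == y[i] for j, q in enumerate(pts) if dist(p, q) <= delta):
            pure += 1
    return pure / len(X)


def select_genes(X, y, top_k=10, delta=0.2):
    """Stage 1: keep the top_k genes by H. Stage 2: greedily add the gene
    that most increases dependency; stop when no gene improves it."""
    n_genes = len(X[0])
    ranked = sorted(
        range(n_genes),
        key=lambda g: kruskal_wallis_h([row[g] for row in X], y),
        reverse=True,
    )[:top_k]
    selected, best = [], 0.0
    while True:
        gains = [(dependency(X, y, selected + [g], delta), g)
                 for g in ranked if g not in selected]
        if not gains:
            break
        # max() returns the first maximum, so ties favor the higher-ranked gene.
        score, gene = max(gains, key=lambda t: t[0])
        if score <= best:
            break
        selected.append(gene)
        best = score
    return selected, best


# Toy data (hypothetical): 6 samples, 3 subtypes, 3 genes scaled to [0, 1].
X = [
    [0.0, 0.5, 0.10],
    [0.1, 0.1, 0.90],
    [0.5, 0.9, 0.20],
    [0.6, 0.3, 0.80],
    [1.0, 0.7, 0.15],
    [0.9, 0.2, 0.85],
]
y = ["A", "A", "B", "B", "C", "C"]
selected, score = select_genes(X, y, top_k=3, delta=0.2)
print(selected, score)
```

On this toy data only the first gene separates the three subtypes, so the greedy search stops after selecting it. On real microarray data, the choice of `delta` and `top_k` would need tuning, and the selected subset would then be passed to a classifier such as an SVM or K-nearest neighbor for evaluation.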




Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, S., Li, X., Zhang, S. (2008). Neighborhood Rough Set Model Based Gene Selection for Multi-subtype Tumor Classification. In: Huang, DS., Wunsch, D.C., Levine, D.S., Jo, KH. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues. ICIC 2008. Lecture Notes in Computer Science, vol 5226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87442-3_20


  • DOI: https://doi.org/10.1007/978-3-540-87442-3_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87440-9

  • Online ISBN: 978-3-540-87442-3

  • eBook Packages: Computer Science, Computer Science (R0)
