Ensemble of Dissimilarity Based Classifiers for Cancerous Samples Classification

Blanco, Ángela; Martín-Merino, Manuel; de las Rivas, Javier

doi:10.1007/978-3-540-75286-8_18


Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4774)

Included in the following conference series:

IAPR International Workshop on Pattern Recognition in Bioinformatics


Abstract

DNA microarray technology allows us to identify cancerous tissues by considering gene expression levels across a collection of related samples.

Several classifiers, such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) and Diagonal Linear Discriminant Analysis (DLDA), have been applied to this problem. However, they are usually based on Euclidean distances, which fail to reflect the sample proximities accurately. Several classifiers have been extended to work with non-Euclidean dissimilarities, although none outperforms the others because each misclassifies a different set of patterns.

In this paper, we combine different kinds of dissimilarity-based classifiers to reduce the misclassification errors. Diversity among the classifiers is induced by considering a set of complementary dissimilarities for three different types of models. The experimental results suggest that the proposed algorithm improves on classifiers based on a single dissimilarity and on a widely used combination strategy such as Bagging.
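As a rough illustration of the idea of combining classifiers built on different dissimilarities, the sketch below trains one k-NN classifier per dissimilarity and merges their predictions by majority vote. It is not the chapter's algorithm: the choice of Euclidean, cosine and correlation dissimilarities, the use of k-NN as the only base model, the majority-vote rule, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def euclidean(A, B):
    # Pairwise Euclidean distances between the rows of A and the rows of B.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.sqrt(np.maximum(sq, 0.0))

def cosine(A, B):
    # 1 - cosine similarity between rows.
    An = A / np.linalg.norm(A, axis=1, keepdims=True)
    Bn = B / np.linalg.norm(B, axis=1, keepdims=True)
    return 1.0 - An @ Bn.T

def correlation(A, B):
    # 1 - Pearson correlation: cosine dissimilarity after centring each row.
    return cosine(A - A.mean(1, keepdims=True), B - B.mean(1, keepdims=True))

def knn_predict(D, y_train, k=3):
    # D has shape (n_test, n_train); labels are non-negative integers.
    nearest = np.argsort(D, axis=1)[:, :k]
    return np.array([np.bincount(row).argmax() for row in y_train[nearest]])

def ensemble_predict(X_train, y_train, X_test, dissimilarities, k=3):
    # One k-NN classifier per dissimilarity, combined by majority vote.
    preds = np.stack([knn_predict(d(X_test, X_train), y_train, k)
                      for d in dissimilarities])
    return np.array([np.bincount(col).argmax() for col in preds.T])

# Synthetic two-class "expression" data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
pattern = np.r_[np.full(10, 2.0), np.full(10, -2.0)]

def make(n):
    X0 = pattern + rng.normal(size=(n, 20))    # class 0
    X1 = -pattern + rng.normal(size=(n, 20))   # class 1
    return np.vstack([X0, X1]), np.repeat([0, 1], n)

X_train, y_train = make(20)
X_test, y_test = make(10)
pred = ensemble_predict(X_train, y_train, X_test,
                        [euclidean, cosine, correlation])
print("ensemble accuracy:", (pred == y_test).mean())
```

Using an odd number of base classifiers avoids tied votes in the two-class case. In practice the benefit of such an ensemble depends on the base classifiers making different errors, which this toy data set does not attempt to reproduce.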



Author information

Authors and Affiliations

Universidad Pontificia de Salamanca, C/Compañía 5, 37002, Salamanca, Spain
Ángela Blanco & Manuel Martín-Merino
Cancer Research Center of Salamanca (CIC), Salamanca, Spain
Javier de las Rivas


Editor information

Jagath C. Rajapakse, Bertil Schmidt, Gwenn Volkert


Cite this paper

Blanco, Á., Martín-Merino, M., de las Rivas, J. (2007). Ensemble of Dissimilarity Based Classifiers for Cancerous Samples Classification. In: Rajapakse, J.C., Schmidt, B., Volkert, G. (eds) Pattern Recognition in Bioinformatics. PRIB 2007. Lecture Notes in Computer Science, vol 4774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75286-8_18


DOI: https://doi.org/10.1007/978-3-540-75286-8_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75285-1
Online ISBN: 978-3-540-75286-8
eBook Packages: Computer Science, Computer Science (R0)
