Abstract
Mixture models represent the results of gene expression cluster analysis more naturally than ‘hard’ partitions. The same holds for gene labels such as functional annotations, where a gene is often assigned to more than one annotation term. Another important characteristic of functional annotations is that they are more detailed than groups of co-expressed genes. In other words, genes with similar function should be grouped together, but the converse does not hold. Both facts, however, have been neglected by the validation studies of gene expression analysis presented so far. To overcome the first problem, we propose an external index that extends the corrected Rand index to the comparison of two mixtures. To address the second and more challenging problem, the difference in coarseness between the two mixtures to be compared, we cluster the terms of the functional annotation. We use simulated and biological data to show the usefulness of our proposals. The results show that distinct solutions can be differentiated only after applying the component clustering.
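For orientation, the corrected Rand index that the proposed index builds on can be sketched for the hard-partition case. The code below is a minimal illustration of the standard Hubert and Arabie formulation, not the mixture extension proposed in this chapter; the function name `corrected_rand` and the choice to return 1.0 in the degenerate case are assumptions made for the example.

```python
from collections import Counter
from math import comb


def corrected_rand(labels_u, labels_v):
    """Corrected (adjusted) Rand index between two hard partitions.

    labels_u and labels_v give one cluster label per object, in the
    same object order. Illustrative sketch for hard partitions only.
    """
    if len(labels_u) != len(labels_v):
        raise ValueError("both labelings must cover the same objects")
    n = len(labels_u)
    if n < 2:
        raise ValueError("at least two objects are required")

    # Contingency table cells n_ij, stored sparsely.
    cells = Counter(zip(labels_u, labels_v))
    # Object pairs placed together in both partitions.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    # Object pairs placed together in each partition separately.
    sum_rows = sum(comb(c, 2) for c in Counter(labels_u).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_v).values())

    # Expected agreement under random labeling with fixed cluster sizes.
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        # Degenerate case, e.g. both partitions are a single cluster
        # (assumption: treat as perfect agreement).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Same grouping under different label names.
    print(corrected_rand([0, 0, 1, 1], ["b", "b", "a", "a"]))
    # Groupings that cut across each other.
    print(corrected_rand([0, 0, 1, 1], [0, 1, 0, 1]))
```

The index depends only on which objects are grouped together, not on the label names, which is why it is suited to comparing a clustering against an external labeling.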
Copyright information
© 2006 Springer Berlin · Heidelberg
Cite this paper
Costa, I.G., Schliep, A. (2006). On External Indices for Mixtures: Validating Mixtures of Genes. In: Spiliopoulou, M., Kruse, R., Borgelt, C., Nürnberger, A., Gaul, W. (eds) From Data and Information Analysis to Knowledge Engineering. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31314-1_81
DOI: https://doi.org/10.1007/3-540-31314-1_81
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31313-7
Online ISBN: 978-3-540-31314-4
eBook Packages: Mathematics and Statistics (R0)