A New Clustering Approach, Based on the Estimation of the Probability Density Function, for Gene Expression Data
Many techniques have already been proposed for handling and analyzing the large, high-dimensional data sets produced by recently developed gene expression experiments. These techniques include supervised classification and unsupervised agglomerative or hierarchical clustering. Here, we present an alternative approach that makes no assumptions about the shape, size, or volume of the clusters. The technique is based on estimating the probability density function (pdf). Once the pdf has been estimated with the Parzen technique, using an appropriate amount of smoothing, the parameter space is partitioned with methods inherited from image processing, namely the skeleton by influence zones and the watershed. We show some advantages of the suggested approach.
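The two steps described above (Parzen estimation of the pdf, then partition of the parameter space) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the data are already reduced to two dimensions and scaled to the unit square, the kernel is Gaussian with a hand-picked bandwidth `h`, and a simple steepest-ascent labeling stands in for the watershed and the skeleton by influence zones. Every grid cell is assigned to the local maximum of the pdf that it climbs to, which approximates the catchment basins a watershed would find on the inverted pdf. The names `parzen_pdf` and `partition_by_ascent`, the grid size, and the toy points are all hypothetical.

```python
import math

def parzen_pdf(points, grid_size=40, h=0.08):
    """Gaussian Parzen estimate of a 2-D pdf on a grid over [0, 1]^2.

    h is the smoothing bandwidth (an assumed value, chosen by hand here).
    """
    norm = 1.0 / (len(points) * 2.0 * math.pi * h * h)
    pdf = [[0.0] * grid_size for _ in range(grid_size)]
    for i in range(grid_size):
        y = i / (grid_size - 1)
        for j in range(grid_size):
            x = j / (grid_size - 1)
            total = 0.0
            for px, py in points:
                d2 = (x - px) ** 2 + (y - py) ** 2
                total += math.exp(-d2 / (2.0 * h * h))
            pdf[i][j] = norm * total
    return pdf

def partition_by_ascent(pdf):
    """Label each grid cell by the local maximum reached by steepest ascent.

    This is a simplified stand-in for a watershed of the inverted pdf:
    no cluster shape is assumed, only the modes of the estimated density.
    """
    rows, cols = len(pdf), len(pdf[0])
    mode_ids = {}  # maps (row, col) of a local maximum to a cluster label

    def climb(i, j):
        while True:
            best_val, bi, bj = pdf[i][j], i, j
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    a, b = i + di, j + dj
                    if 0 <= a < rows and 0 <= b < cols and pdf[a][b] > best_val:
                        best_val, bi, bj = pdf[a][b], a, b
            if (bi, bj) == (i, j):  # no strictly higher neighbor: a mode
                return i, j
            i, j = bi, bj

    labels = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            mode = climb(i, j)
            labels[i][j] = mode_ids.setdefault(mode, len(mode_ids) + 1)
    return labels, mode_ids

# Toy data: two tight groups of points, standing in for expression profiles
# already projected onto two axes (an assumption of this sketch).
offsets = (-0.02, 0.0, 0.02)
points = [(0.25 + dx, 0.25 + dy) for dx in offsets for dy in offsets]
points += [(0.75 + dx, 0.75 + dy) for dx in offsets for dy in offsets]

pdf = parzen_pdf(points)
labels, modes = partition_by_ascent(pdf)
print("number of density modes found:", len(modes))
```

The bandwidth plays the role of the "amount of smoothing" mentioned above: too small a value tends to fragment the density into many spurious modes, while too large a value merges distinct groups into one.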
Keywords: Support Vector Machine, Probability Density Function, Dimensionality Reduction, Factorial Axis, Influence Zone