Abstract
Our research takes place in a bioinformatics team embedded in a biology unit whose biologists use pangenomic cDNA chips to measure the expression levels of thousands of genes at a time. The goal of our research is to systematically categorize the relations between gene expression levels (1) and biomedical values, in order to support the search for candidate genes that allow a better diagnosis of obesities and related diseases (2). A key issue in the analysis of cDNA chips is that the number of expression levels per chip is very high compared to the number of chips: we work with 40 cDNA chips of approximately 40,000 spots each and with 2 biomedical parameters. One way biologists discover relationships between these types of data is to compute correlations for a small number of them, chosen on the basis of their biological knowledge. To go beyond such a biased, manual selection, we propose to automatically explore the combinations of all available bioclinical parameters with all gene expressions. The resulting data need to be classified to identify significant Linear Correlation Discoveries (3). Our method, DISCOCLINI, uses abstraction operators to remove outliers, approximation to define correlations, and reformulation to describe correlations and cluster them by variation patterns.
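The exhaustive exploration described above can be illustrated with a minimal sketch that computes the Pearson correlation between every gene expression profile and every bioclinical parameter. This is not the DISCOCLINI implementation: the function name `correlate_all`, the synthetic data, the reduced number of genes, and the 0.8 cut-off are all assumptions made for illustration, and the outlier removal and clustering steps are not shown.

```python
import numpy as np


def correlate_all(expr, clin):
    """Pearson r between every gene and every bioclinical parameter.

    expr: array of shape (n_chips, n_genes), one column per gene.
    clin: array of shape (n_chips, n_params), one column per parameter.
    Returns an array of shape (n_genes, n_params).
    Columns with zero variance yield NaN (assumed absent here).
    """
    expr = np.asarray(expr, dtype=float)
    clin = np.asarray(clin, dtype=float)
    # Standardize each column (population standard deviation, ddof=0).
    expr_z = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    clin_z = (clin - clin.mean(axis=0)) / clin.std(axis=0)
    # The mean of the products of z-scores is the Pearson coefficient.
    return expr_z.T @ clin_z / expr.shape[0]


# Synthetic stand-in for the real data: 40 chips, 2 parameters,
# and only 500 genes (hypothetical, far fewer than a real chip).
rng = np.random.default_rng(0)
clin = rng.normal(size=(40, 2))
expr = rng.normal(size=(40, 500))
expr[:, 0] = 2.0 * clin[:, 0] + 1.0  # plant one exactly linear relation

r = correlate_all(expr, clin)

# Hypothetical cut-off: keep (gene, parameter) pairs with |r| >= 0.8.
candidates = np.argwhere(np.abs(r) >= 0.8)
```

With roughly 40,000 spots and only 40 chips, many pairs can exceed a fixed cut-off by chance alone, so a threshold like this one is only a starting point for the categorization step, not a significance test.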
References
Schena, M., Shalon, D., Davis, R.W., Brown, P.O.: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467–470 (1995)
Clement, K., Boutin, P., Froguel, P.: Genetics of obesity. Am. J. Pharmacogenomics 2, 177–187 (2002)
Chiang, R.H.L., Cecil, C.E.H., Lim, E.-P.: Linear correlation discovery in databases: a data mining approach. Data and Knowledge Engineering 53, 311–337 (2005)
Zucker, J.D.: A grounded theory of abstraction in artificial intelligence. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 358, 1293–1309 (2003)
Kaufman, L., Rousseeuw, P.: Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York (1990)
Viguerie, N., Clement, K., Barbe, P., Courtine, M., Benis, A., Larrouy, D., Hanczar, B., Pelloux, V., Poitou, C., Khalfallah, Y., Barsh, G.S., Thalamas, C., Zucker, J.D., Langin, D.: In vivo epinephrine-mediated regulation of gene expression in human skeletal muscle. J. Clin. Endocrinol. Metab. 89, 2000–2014 (2004)
Taleb, S., Lacasa, D., Bastard, J.P., Poitou, C., Cancello, R., Pelloux, V., Viguerie, N., Benis, A., Zucker, J.D., Bouillot, J.L., Coussieu, C., Basdevant, A., Langin, D.: Cathepsin S, a novel biomarker of adiposity: relevance to atherogenesis. FASEB Journal (2005) (in press)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Benis, A. (2005). Categorizing Gene Expression Correlations with Bioclinical Data: An Abstraction Based Approach. In: Zucker, JD., Saitta, L. (eds) Abstraction, Reformulation and Approximation. SARA 2005. Lecture Notes in Computer Science, vol 3607. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527862_29
DOI: https://doi.org/10.1007/11527862_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27872-6
Online ISBN: 978-3-540-31882-8
eBook Packages: Computer Science, Computer Science (R0)