Mining Association Rule Bases from Integrated Genomic Data and Annotations

Martinez, Ricardo; Pasquier, Nicolas; Pasquier, Claude

doi:10.1007/978-3-642-02504-4_7

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5488)

Abstract

During the last decade, several clustering and association rule mining techniques have been applied to highlight groups of co-regulated genes in gene expression data. Integrating these data and biological knowledge into a single framework has now become a major challenge, both to improve the relevance of mined patterns and to simplify their interpretation by biologists. GenMiner was developed for mining association rules from such integrated datasets. It combines a new normalized discretization method, called NorDi, with the JClose algorithm to extract condensed representations of association rules. Experimental results show that GenMiner requires less memory than Apriori-based approaches and that it improves the relevance of the extracted rules. Moreover, the association rules obtained reveal significant co-annotated and co-expressed gene patterns showing important biological relationships supported by recent biological literature.
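To make the vocabulary of the abstract concrete, the following minimal sketch computes the support and confidence of an association rule over a tiny integrated dataset, together with the closure of an itemset, the notion on which closed-itemset approaches rely. It is not GenMiner, NorDi or JClose: the rows, the item names (such as `t1_up` and `GO:ribosome`) and the function names are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy dataset (not from the paper): one row per gene, mixing
# discretized expression items ("t1_up") with annotation items ("GO:ribosome").
ROWS: List[FrozenSet[str]] = [
    frozenset({"t1_up", "t2_up", "GO:ribosome"}),
    frozenset({"t1_up", "t2_up", "GO:ribosome"}),
    frozenset({"t1_up", "GO:transport"}),
    frozenset({"t2_down", "GO:transport"}),
]


def support(itemset: Iterable[str], rows: List[FrozenSet[str]] = ROWS) -> float:
    """Fraction of rows that contain every item of the itemset."""
    items = frozenset(itemset)
    return sum(items <= row for row in rows) / len(rows)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               rows: List[FrozenSet[str]] = ROWS) -> float:
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    s = support(a, rows)
    if s == 0:
        raise ValueError("antecedent never occurs in the dataset")
    return support(a | frozenset(consequent), rows) / s


def closure(itemset: Iterable[str],
            rows: List[FrozenSet[str]] = ROWS) -> FrozenSet[str]:
    """Items shared by every row that contains the itemset."""
    items = frozenset(itemset)
    covering = [row for row in rows if items <= row]
    if not covering:
        # Intersection over an empty family: by convention, all known items.
        return frozenset().union(*rows)
    return frozenset.intersection(*covering)


if __name__ == "__main__":
    print(support({"GO:ribosome"}))
    print(confidence({"GO:ribosome"}, {"t1_up"}))
    print(sorted(closure({"t2_up"})))
```

An itemset is closed when it equals its own closure. Keeping only closed itemsets, and the rules derived from them, is the general idea behind a condensed representation: rules that carry the same information as a retained rule do not need to be listed separately.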

Author information

Authors and Affiliations

Laboratoire I3S, Université de Nice Sophia-Antipolis/CNRS UMR-6070, 06903 Sophia Antipolis, France
Ricardo Martinez & Nicolas Pasquier
IDBC, Université de Nice Sophia-Antipolis/CNRS UMR-6543, Parc Valrose, 06108, Nice, France
Claude Pasquier

Editor information

Editors and Affiliations

DISI - Dipartimento di Informatica e Scienze dell’Informazione, Università di Genova, Via Dodecaneso 35, 16146, Genova, Italy
Francesco Masulli
DMI, Dipartimento di Matematica ed Informatica, Università di Salerno, Via Ponte don Melillo, 84084, Fisciano (Sa), Italy
Roberto Tagliaferri
Department of Pharmaceutical Chemistry, School of Pharmacy, The University of Kansas, 2095 Constant Ave, Lawrence, 66047, Kansas, USA
Gennady M. Verkhivker

About this paper

Cite this paper

Martinez, R., Pasquier, N., Pasquier, C. (2009). Mining Association Rule Bases from Integrated Genomic Data and Annotations. In: Masulli, F., Tagliaferri, R., Verkhivker, G.M. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2008. Lecture Notes in Computer Science, vol 5488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02504-4_7

DOI: https://doi.org/10.1007/978-3-642-02504-4_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02503-7
Online ISBN: 978-3-642-02504-4
eBook Packages: Computer Science, Computer Science (R0)