On the Use of Statistics in Genomics and Bioinformatics
The Human Genome Project and other genome projects provide rich sources of data that invite many new forms of statistical analysis. The nature of these data often differs from that in many other areas of science, and this has led to novel forms of data analysis not found in the classical statistical literature. The purpose of this paper is to describe some of these new forms, focusing on cases where the biology drives the questions asked and where the statistical analysis both presents new features and raises further challenges.
Key-words: BLAST; motifs; microarrays and the multiple testing problem; the false discovery rate concept; evolutionary models
AMS Subject Classification: 62P12, 60G70, 60G60, 60J20