Abstract
We present a new computational approach to infer DNA function from eukaryotic DNA sequence information. It is based on the fact that exons, regulatory regions, and non-coding non-regulatory DNA exhibit different statistical patterns. We suggest capturing and measuring these patterns with the following suite of statistical tools: (1) the ‘fluffy-tail’ test, a bootstrap procedure to recognize statistically significant abundant similar words in regulatory DNA; (2) an algorithm to assess the density of patches of low entropy as a new measure of homogeneity, which can be used to distinguish coding from non-coding and regulatory regions; and (3) an adaptive window technique applied to rescaled range analysis and entropy measurements. The third tool is an optimization technique that segments DNA into homogeneous parts (which are therefore likely to be coding); its outcomes are independent of the size of the sliding window, so it avoids averaging. Applying our methods to several annotated data sets from six eukaryotic species enables a clear separation of coding, regulatory, and non-coding non-regulatory DNA. We propose that established computational methods, complemented by our new statistical tests and augmented with the novel optimization technique for sliding windows, create a powerful tool for the characterization and annotation of DNA sequences. The software is available from the authors on request.
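The abstract does not specify how the density of low-entropy patches is computed, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes a fixed-size sliding window, Shannon entropy over k-mer frequencies, and a hard entropy threshold; the function names and the parameters `window`, `step`, and `threshold` are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(seq: str, k: int = 1) -> float:
    """Shannon entropy (in bits) of the k-mer frequency distribution of seq."""
    words = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not words:
        return 0.0
    counts = Counter(words)
    total = len(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def low_entropy_density(seq: str, window: int = 20, step: int = 5,
                        threshold: float = 1.5) -> float:
    """Fraction of sliding windows whose entropy is below `threshold`.

    Assumption: a "patch of low entropy" is taken to be any window whose
    single-nucleotide entropy falls under a fixed cut-off. The source does
    not define the patch criterion, so this is illustrative only.
    """
    starts = range(0, len(seq) - window + 1, step)
    entropies = [shannon_entropy(seq[s:s + window]) for s in starts]
    if not entropies:
        return 0.0  # sequence shorter than one window
    return sum(e < threshold for e in entropies) / len(entropies)


if __name__ == "__main__":
    # Toy sequences, not real annotated data.
    print(low_entropy_density("A" * 40))      # homopolymer: every window is low entropy
    print(low_entropy_density("ACGT" * 10))   # balanced composition: no window is low entropy
```

Note that this sketch uses a fixed window, which is precisely the dependence the adaptive window technique in the abstract is meant to remove; it illustrates the entropy measure only.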
Copyright information
© 2006 Springer Science+Business Media, Inc.
About this chapter
Cite this chapter
Abnizova, I., te Boekhorst, R., Walter, K., Gilks, W.R. (2006). New Methods to Infer DNA Function from Sequence Information. In: Kolchanov, N., Hofestaedt, R., Milanesi, L. (eds) Bioinformatics of Genome Regulation and Structure II. Springer, Boston, MA. https://doi.org/10.1007/0-387-29455-4_18
DOI: https://doi.org/10.1007/0-387-29455-4_18
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-29450-6
Online ISBN: 978-0-387-29455-1
eBook Packages: Biomedical and Life Sciences (R0)