Abstract
This work proposes a novel multi-purpose data standardization method inspired by gene-centric clustering approaches. The clustering is performed via template matching of expression profiles, employing the Dynamic Time Warping (DTW) alignment algorithm to measure the similarity between the profiles. In this way, for each gene profile, a cluster consisting of a varying number of neighboring gene profiles (determined by the degree of similarity) is identified for use in the subsequent standardization phase. The standardized profiles are extracted via a recursive aggregation algorithm, which reduces each cluster of neighboring expression profiles to a single profile. The proposed data standardization method is validated on gene expression time series data from a study examining the global cell-cycle control of gene expression in the fission yeast Schizosaccharomyces pombe.
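The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the absolute-difference local cost, the fixed distance threshold, and the equal-length profiles are all assumptions made for illustration. In particular, the recursive aggregation algorithm is not specified in the abstract, so a plain pointwise mean is used here as a stand-in for the reduction step.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW cost between two 1-D profiles.

    Local cost is the absolute difference (an assumption; the abstract
    does not state which local cost is used).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def neighbor_cluster(profiles, index, threshold):
    """Indices of profiles within `threshold` DTW distance of profiles[index].

    The template itself is always included (its distance is zero).
    `threshold` is a hypothetical parameter standing in for the
    "degree of similarity" mentioned in the abstract.
    """
    template = profiles[index]
    return [k for k, p in enumerate(profiles)
            if dtw_distance(template, p) <= threshold]

def standardize(profiles, threshold):
    """Reduce each profile's neighbor cluster to a single profile.

    Stand-in reduction: pointwise mean of the cluster members, NOT the
    recursive aggregation algorithm of the chapter. Assumes all profiles
    have the same length.
    """
    profiles = np.asarray(profiles, dtype=float)
    reduced = []
    for i in range(len(profiles)):
        members = neighbor_cluster(profiles, i, threshold)
        reduced.append(profiles[members].mean(axis=0))
    return np.array(reduced)

if __name__ == "__main__":
    # Toy data: two similar, time-shifted profiles and one unrelated profile.
    toy = [[0, 1, 2, 1], [0, 0, 1, 2], [5, 5, 5, 5]]
    print(standardize(toy, threshold=1.5))
```

Because cluster membership depends on the chosen threshold, each profile can end up with a different number of neighbors, which mirrors the varying cluster sizes described above.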
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Boeva, V., Tsiporkova, E. (2010). A Multi-purpose Time Series Data Standardization Method. In: Sgurev, V., Hadjiski, M., Kacprzyk, J. (eds) Intelligent Systems: From Theory to Practice. Studies in Computational Intelligence, vol 299. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13428-9_22
DOI: https://doi.org/10.1007/978-3-642-13428-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13427-2
Online ISBN: 978-3-642-13428-9
eBook Packages: Engineering, Engineering (R0)