Abstract
Many applications, particularly in biology, require the segmentation of very long binary sequences. Such problems arise, for example, in DNA analysis: some properties of a DNA sequence can be coded as a binary sequence, which must then be split into homogeneous segments. In this paper, we propose a new approach to the segmentation of long binary sequences. Our approach is based on transforming the initial sequence into a sequence of real numbers, which we call a diagnostic sequence. For sequences generated by stochastic mechanisms, we then propose applying the nonparametric change-point detection algorithm of Brodsky-Darkhovsky to the diagnostic sequence. If the type of generating mechanism is unknown, we propose using our theory of \(\varepsilon\)-complexity to create new diagnostic sequences of \(\varepsilon\)-complexity coefficients. The change-point detection algorithm of Brodsky-Darkhovsky is then applied to these diagnostic sequences. We verify the performance of the proposed methods in simulations.
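The two-stage idea in the abstract (transform the binary sequence into a real-valued diagnostic sequence, then look for a change point in it) can be illustrated with a minimal Python sketch. The abstract does not specify the transformation, so the non-overlapping window mean used here is an assumption, as are the function names `diagnostic_sequence`, `weighted_mean_difference`, and `estimate_change_point`. The statistic is a weighted difference of means before and after each candidate split, a form commonly associated with nonparametric change-point detection; whether it matches the chapter's exact procedure is also an assumption. The sketch handles a single change point, omits any threshold test, and does not cover the \(\varepsilon\)-complexity variant.

```python
import random
from itertools import accumulate


def diagnostic_sequence(bits, window):
    """Map a 0/1 sequence to means over non-overlapping windows.

    Assumption: the window mean stands in for the chapter's transformation,
    which the abstract does not describe. Trailing bits that do not fill a
    whole window are dropped.
    """
    n_windows = len(bits) // window
    return [sum(bits[i * window:(i + 1) * window]) / window
            for i in range(n_windows)]


def weighted_mean_difference(y, delta=0.5):
    """Weighted difference of left and right means for splits n = 1..N-1.

    delta in [0, 1] controls how strongly splits near the ends are damped.
    """
    N = len(y)
    prefix = list(accumulate(y))  # prefix[k] = y[0] + ... + y[k]
    total = prefix[-1]
    stats = []
    for n in range(1, N):
        left_mean = prefix[n - 1] / n
        right_mean = (total - prefix[n - 1]) / (N - n)
        weight = ((n / N) * (1 - n / N)) ** delta
        stats.append(weight * (left_mean - right_mean))
    return stats


def estimate_change_point(y, delta=0.5):
    """Return the split size n that maximises the absolute statistic."""
    stats = weighted_mean_difference(y, delta)
    best = max(range(len(stats)), key=lambda i: abs(stats[i]))
    return best + 1  # number of diagnostic points in the left segment


if __name__ == "__main__":
    # Hypothetical data: Bernoulli(0.2) followed by Bernoulli(0.7).
    rng = random.Random(0)
    bits = [int(rng.random() < 0.2) for _ in range(5000)]
    bits += [int(rng.random() < 0.7) for _ in range(5000)]
    window = 50
    y = diagnostic_sequence(bits, window)
    n_hat = estimate_change_point(y)
    print("estimated change near bit index", n_hat * window)
```

The window length trades resolution against noise: longer windows give a smoother diagnostic sequence but locate the change only to within one window.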
References
Antoch, J., Jarušková, D.: Testing for multiple change points. Comput. Stat. 28(5), 2161–2183 (2013)
Bai, J., Perron, P.: Estimating and testing linear models with multiple structural changes. Econometrica 47–78 (1998)
Billingsley, P.: Convergence of Probability Measures. Wiley, New York (2013)
Braun, J.V., Müller, H.G.: Statistical methods for DNA sequence segmentation. Stat. Sci. 142–162 (1998)
Braun, J.V., Braun, R., Müller, H.G.: Multiple changepoint fitting via quasilikelihood, with application to DNA sequence segmentation. Biometrika 87(2), 301–314 (2000)
Brodsky, B., Darkhovsky, B.: Non-parametric Statistical Diagnosis. Mathematics and Its Applications, vol. 509 (2000)
Darkhovskii, B., Brodskii, B.: An identification of the “disorder” time of the random sequence. IFAC Proc. Vol. 12(8), 373–379 (1979)
Darkhovsky, B., Piryatinska, A.: New approach to the segmentation problem for time series of arbitrary nature. Proc. Steklov Inst. Math. 287(1), 54–67 (2014)
Fryzlewicz, P., et al.: Wild binary segmentation for multiple change-point detection. Ann. Stat. 42(6), 2243–2281 (2014)
Horváth, L., Serbinowska, M.: Testing for changes in multinomial observations: the Lindisfarne scribes problem. Scandinavian J. Stat. 371–384 (1995)
Hudecová, Š.: Structural changes in autoregressive models for binary time series. J. Stat. Plan. Inference 143(10), 1744–1752 (2013)
Yang, T.Y., Kim, J.: Binary segmentation procedure for detecting change points in a DNA sequence. Commun. Stat. Appl. Methods 12(1), 139–147 (2005)
Acknowledgements
Boris Darkhovsky gratefully acknowledges the partial support of this study by the Russian Foundation for Basic Research (project no. 17-29-02115). We would like to thank the anonymous reviewers for their suggestions and comments.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Darkhovsky, B., Piryatinska, A. (2019). Detection of Changes in Binary Sequences. In: Steland, A., Rafajłowicz, E., Okhrin, O. (eds) Stochastic Models, Statistics and Their Applications. SMSA 2019. Springer Proceedings in Mathematics & Statistics, vol 294. Springer, Cham. https://doi.org/10.1007/978-3-030-28665-1_12
DOI: https://doi.org/10.1007/978-3-030-28665-1_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28664-4
Online ISBN: 978-3-030-28665-1
eBook Packages: Mathematics and Statistics (R0)