Stochastic Distance Between Burkitt Lymphoma/Leukemia Strains
Quantifying the proximity between N-grams makes it possible to establish criteria for comparing them. A consistent distance d for this purpose was recently proposed; see García JE, González-López VA. Detecting regime changes in Markov models. In New trends in stochastic modeling and data analysis (chapter 2, page 103), 2015. This distance exploits a model structure for Markov processes on finite alphabets with finite memory, called Partition Markov Models; see García JE, González-López VA. Entropy 19:160, 2017. In this work we explore the performance of d on a real problem, using it to establish a natural notion of proximity between DNA sequences from patients who share the same diagnosis: Burkitt lymphoma/leukemia. We also present a robust estimation strategy to identify the stochastic law that governs most of the sequences considered, thereby outlining a profile common to all these patients via their DNA sequences.
Keywords: Partition Markov models; Bayesian information criterion; Robust estimation in stochastic processes
- García, J. E., & González-López, V. A. (2015). Detecting regime changes in Markov models. In R. Manca, S. McClean, & C. H. Skiadas (Eds.), New trends in stochastic modeling and data analysis (Chapter 2, p. 103). ISAST, Athens, Greece. ISBN: 978-618-5180-06-5.