Handbook of Scan Statistics, pp 1-20

# On the Exact Distributions of Pattern Statistics for a Sequence of Binary Trials: A Combinatorial Approach

## Abstract

Consider a sequence of exchangeable or Markov-dependent binary (zero-one) trials. A sequence of independent and identically distributed binary trials is a particular case of both models. For counting and waiting-time pattern statistics defined on such model sequences, we show how their exact probability distributions can be established using enumerative combinatorics. The expressions for the distributions combine probabilities that depend on the internal structure of the model sequence with combinatorial numbers that denote set cardinalities. These numbers depend on the pattern statistic considered and on the number of ones, for an exchangeable sequence, as well as on the number of runs of ones, for a Markov-dependent sequence. They take explicit form once specific patterns and enumerative schemes are studied on the model sequences. To exemplify the approach, exact distributions are provided, in terms of suitable combinatorial numbers, for statistics connected to patterns of limited length and to certain runs and scans.
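The structure described above can be illustrated for the exchangeable case. In an exchangeable sequence of length n, every arrangement with y ones has the same probability, so the distribution of a statistic X can be written as P(X = x) = Σ_y p_n(y) · N(x, y), where N(x, y) counts the binary sequences with y ones on which X equals x. The sketch below is only an illustration under stated assumptions: it obtains N(x, y) by brute-force enumeration rather than by the closed-form combinatorial numbers the chapter derives, it uses the longest run of ones as an example statistic, and all function names (`count_table`, `exact_pmf`, `iid_seq_prob`) are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def longest_run_of_ones(seq):
    """Length of the longest run of consecutive ones in a 0/1 sequence."""
    best = current = 0
    for bit in seq:
        current = current + 1 if bit == 1 else 0
        best = max(best, current)
    return best


def count_table(n, statistic):
    """Brute-force N(x, y): number of length-n binary sequences
    with y ones on which the statistic takes the value x."""
    counts = Counter()
    for seq in product((0, 1), repeat=n):
        counts[(statistic(seq), sum(seq))] += 1
    return counts


def exact_pmf(n, statistic, seq_prob):
    """P(X = x) = sum over y of seq_prob(y) * N(x, y).

    seq_prob(y) is the probability of any single sequence with y ones,
    which is well defined because the sequence is assumed exchangeable.
    """
    pmf = Counter()
    for (x, y), count in count_table(n, statistic).items():
        pmf[x] += seq_prob(y) * count
    return dict(pmf)


def iid_seq_prob(n, p):
    """i.i.d. special case: a sequence with y ones has probability p^y (1-p)^(n-y)."""
    return lambda y: p**y * (1 - p) ** (n - y)


# Illustrative parameters (not taken from the chapter).
n = 5
p = Fraction(1, 2)
pmf = exact_pmf(n, longest_run_of_ones, iid_seq_prob(n, p))
for x in sorted(pmf):
    print(x, pmf[x])
```

Any other exchangeable model can be plugged in by replacing `iid_seq_prob` with a function that returns the common probability of a sequence with y ones. Exhaustive enumeration is feasible only for small n, which is where closed-form counts become useful.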

## Keywords

Exact distributions; Enumerative combinatorics; Runs, scans, and patterns; Binary trials

## Notes

### Acknowledgements

The authors wish to thank the anonymous referee for the thorough reading and useful comments and suggestions which helped to improve the paper.
