Scan Statistics, pp. 1–25

# Joseph Naus: Father of the Scan Statistic

## Abstract

The literature on the scan statistic is now vast, growing exponentially in diverse directions, with contributions by many researchers and groups. As time goes on, the early history of the problem bears telling. Joseph Naus, the father of the scan statistic, originated the modern work on the topic. The process took almost twenty years to reach maturity; I have chosen Naus (1982) as the marker of this maturity. The very name “scan statistic” does not appear to have become attached to the problem for fifteen years, and the interconnections within what is now regarded as one problem, in both the statement of the problem and the common methods of solution, were far from obvious originally. This chapter will not attempt a full review of all of Naus’s statistical contributions, or even a full review of his contributions as they concern the scan statistic. Instead, it will focus on a few themes that had already originated in Naus’s first twenty years of written research (1962–1982), and briefly follow those threads to the present. Since these early themes include such general issues as applications of the scan statistic, mentoring graduate students, and specific methodological issues, the review will encompass a significant portion of Dr. Naus’s research, without claiming to be exhaustive regarding either his research or the much broader body of research on the scan statistic that he influenced.

## Keywords

Exact Probability, Multiple Coverage, Poisson Approximation, Generalized Likelihood Ratio Test, Fixed Grid

## References

1. Balakrishnan, N. and Koutras, M.V. (2002). *Runs and Scans with Applications*, Wiley, New York.
2. Barton, D.E. and Mallows, C.L. (1965). Some aspects of the random sequence, *Annals of Mathematical Statistics*, **36**, 236–260.
3. Berg, W. (1945). Aggregates in one- and two-dimensional random distributions, *The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science*, **36**, 319–336.
4. Burnside, W. (1928). *Theory of Probability*, Cambridge University Press, Cambridge.
5. Cressie, N. (1977). On some properties of the scan statistic on the circle and the line, *Journal of Applied Probability*, **14**, 272–283.
6. Cressie, N. (1979). An optimal statistic based on higher order gaps, *Biometrika*, **66**, 619–627.
7. Ederer, F., Myers, M.H., and Mantel, N. (1964). A statistical problem in space and time: Do leukemia cases come in clusters? *Biometrics*, **20**, 626–636.
8. Elteren, Van P.H. and Gerrits, H.J.M. (1961). Een wachtprobleem voorkomende bij drempelwaardemetingen aan het oog, *Statistica Neerlandica*, **15**, 385–401.
9. Erdös, P. and Rényi, A. (1970). On a new law of large numbers, *Journal d’Analyse Mathématique*, **23**, 103–111.
10. Feller, W. (1958). *An Introduction to Probability Theory and its Applications*, Vol. I, 2nd Edition, John Wiley & Sons, New York.
11. Fisher, R.A. (1959). *Statistical Methods and Scientific Inference*, Hafner, New York.
12. Fu, J.C. and Lou, W.Y.W. (2003). *Distribution Theory of Runs and Patterns and Its Applications*, World Scientific, Singapore.
13. Glaz, J. (1978). *Multiple Coverage and Clusters on the Line*, Ph.D. thesis, Rutgers University, New Brunswick, NJ.
14. Glaz, J. and Balakrishnan, N., Editors (1999). *Scan Statistics and Applications*, Birkhäuser, Boston, MA.
15. Glaz, J. and Naus, J. (1979). Multiple coverage of the line, *Annals of Probability*, **7**, 900–906.
16. Glaz, J. and Naus, J. (1983). Multiple clusters on the line, *Communications in Statistics - Theory and Methods*, **12**, 1961–1986.
17. Glaz, J. and Naus, J. (1986). Approximating probabilities of first passage in a particular Gaussian process, *Communications in Statistics*, **15**, 1709–1722.
18. Glaz, J. and Naus, J. (1991). Tight bounds and approximations for scan statistic probabilities for discrete data, *Annals of Applied Probability*, **1**, 306–318.
19. Glaz, J. and Naus, J. (2005). Scan Statistics and Applications, *Encyclopedia of Statistical Sciences*, 2nd Edition, S. Kotz, N. Balakrishnan, C.B. Read and B. Vidakovic, eds., 7463–7471, Wiley, New York.
20. Glaz, J., Naus, J., Roos, M., and Wallenstein, S. (1994). Poisson approximations for the distribution and moments of ordered *m*-spacings, *Journal of Applied Probability*, **31A**, 271–281.
21. Glaz, J., Naus, J., and Wallenstein, S. (2001). *Scan Statistics*, Springer-Verlag, New York.
22. Greenberg, M., Naus, J., Schneider, D., and Wartenberg, D. (1991). Temporal clustering of homicide and suicide among 15–24 year old white and black Americans, *Ethnicity and Disease*, **1**, 342–350.
23. Huntington, R. and Naus, J.I. (1975). A simpler expression for *k*th nearest neighbor coincidence probabilities, *Annals of Probability*, **3**, 894–896.
24. Hwang, F.K. (1977). A generalization of the Karlin-McGregor theorem on coincidence probabilities and an application to clustering, *Annals of Probability*, **5**, 814–817.
25. Ikeda, S. (1965). On Bouman-Velden-Yamamoto’s asymptotic evaluation formula for the probability of visual response in a certain experimental research in quantum biophysics of vision, *Annals of the Institute of Statistical Mathematics*, **17**, 295–310.
26. Karlin, S. and McGregor, G. (1959). Coincidence probabilities, *Pacific Journal of Mathematics*, **9**, 1141–1164.
27. Karwe, V.V. and Naus, J. (1997). New recursive methods for scan statistic probabilities, *Computational Statistics and Data Analysis*, **23**, 389–402.
28. Kulldorff, M. (1997). A spatial scan statistic, *Communications in Statistics, A - Theory and Methods*, **26**, 1481–1496.
29. Kulldorff, M. (2001). Prospective time-periodic geographical disease surveillance using a scan statistic, *Journal of the Royal Statistical Society A*, **164**, 61–72.
30. Kulldorff, M. and Williams, G. (1997). *SaTScan v. 1.0, Software for the Space and Space-Time Scan Statistics*, National Cancer Institute, Bethesda, MD.
31. Loader, C. (1991). Large deviation approximations to the distribution of scan statistics, *Advances in Applied Probability*, **23**, 751–771.
32. Mack, C. (1948). An exact formula for $Q_k(n)$, the probable number of *k*-aggregates in a random distribution of *n* points, *The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science*, **39**, 778–790.
33. Mack, C. (1950). The expected number of aggregates in a random distribution of *n* points, *Proceedings of the Cambridge Philosophical Society*, **46**, 285–292.
34. Menon, M.V. (1964). Clusters in a Poisson process [abstract], *Annals of Mathematical Statistics*, **35**, 1395.
35. Naus, J. (1962). The distribution of the maximum number of points on the line, *ASD Paper 8*.
36. Naus, J. (1963). *Clustering of Random Points in the Line and Plane*, Ph.D. thesis, Rutgers University, New Brunswick, NJ.
37. Naus, J. (1965a). The distribution of the size of the maximum cluster of points on a line, *Journal of the American Statistical Association*, **60**, 532–538.
38. Naus, J. (1965b). Clustering of random points in two dimensions, *Biometrika*, **52**, 263–267.
39. Naus, J. (1966a). A power comparison of two tests of non-random clustering, *Technometrics*, **8**, 493–517.
40. Naus, J. (1966b). Some probabilities, expectations, and variances for the size of largest clusters, and smallest intervals, *Journal of the American Statistical Association*, **61**, 1191–1199.
41. Naus, J. (1968). An extension of the birthday problem, *American Statistician*, **22**, 27–29.
42. Naus, J. (1974). Probabilities for a generalized birthday problem, *Journal of the American Statistical Association*, **69**, 810–815.
43. Naus, J. (1979). An indexed bibliography of clusters, clumps and coincidences, *International Statistical Review*, **47**, 47–78.
44. Naus, J. (1982). Approximations for distributions of scan statistics, *Journal of the American Statistical Association*, **77**, 177–183.
45. Naus, J. (1988). Scan statistics, *Encyclopedia of Statistical Sciences*, Vol. 8, 281–284, N.L. Johnson and S. Kotz, eds., Wiley, New York.
46. Naus, J. (2006). Scan Statistics, *Handbook of Engineering Statistics*, H. Pham, ed., Chapter 43, 775–790, Springer-Verlag, New York.
47. Naus, J. and Sheng, K.N. (1996). Screening for unusual matched segments in multiple protein sequences, *Communications in Statistics: Simulation and Computation*, **25**, 937–952.
48. Naus, J. and Sheng, K.N. (1997). Matching among multiple random sequences, *Bulletin of Mathematical Biology*, **59**, 483–496.
49. Naus, J. and Wartenberg, D. (1997). A double scan statistic for clusters of two types of events, *Journal of the American Statistical Association*, **92**, 1105–1113.
50. Naus, J. and Wallenstein, S. (2004). Simultaneously testing for a range of cluster or scanning window sizes, *Methodology and Computing in Applied Probability*, **6**, 389–400.
51. Naus, J. and Wallenstein, S. (2006). Temporal surveillance using scan statistics, *Statistics in Medicine*, **25**, 311–324.
52. Neff, N. and Naus, J. (1980). The distribution of the size of the maximum cluster of points on a line, *IMS Series of Selected Tables in Mathematical Statistics*, Vol. VI, AMS, Providence, RI.
53. Newell, G.F. (1963). Distribution for the smallest distance between any pair of *k*th nearest-neighbor random points on a line, *Time series analysis, Proceedings of a conference held at Brown University*, M. Rosenblatt, editor, pp. 89–103, John Wiley & Sons, New York.
54. Ozols, V. (1956). Generalization of the theorem of Gnedenko-Korolyuk to three samples in the case of two one-sided boundaries, *Latvijas PSR Zinatnu Akad. Vestis*, **10**(111), 141–152.
55. Parzen, E. (1960). *Modern Probability Theory and its Applications*, John Wiley & Sons, New York.
56. Rabinowitz, L. and Naus, J. (1975). The expectation and variance of the number of components in random linear graphs, *Annals of Probability*, **3**, 159–161.
57. Samuel-Cahn, E. (1983). Simple approximations to the expected waiting time for a cluster of any given size for point processes, *Advances in Applied Probability*, **15**, 21–38.
58. Saperstein, B. (1972). The generalized birthday problem, *Journal of the American Statistical Association*, **67**, 425–428.
59. Sheng, K.N. and Naus, J. (1994). Pattern matching between two non-aligned random sequences, *Bulletin of Mathematical Biology*, **56**, 1143–1162.
60. Sheng, K.N. and Naus, J. (1996). Matching fixed rectangles in 2-dimensions, *Statistics and Probability Letters*, **26**, 83–90.
61. Silberstein, L. (1945). The probable number of aggregates in random distributions of points, *The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science*, **36**, 319–336.
62. Takacs, L. (1961). On a coincidence problem concerning particle counters, *Annals of Mathematical Statistics*, **32**, 739–756.
63. Wallenstein, S.R. and Naus, J. (1973). Probabilities for the *k*th nearest neighbor problem on the line, *Annals of Probability*, **1**, 188–190.
64. Wallenstein, S. and Naus, J. (1974). Probabilities for the size of largest clusters and smallest intervals, *Journal of the American Statistical Association*, **69**, 690–697.
65. Wallenstein, S., Naus, J., and Glaz, J. (1993). Power of the scan statistic for the detection of clustering, *Statistics in Medicine*, **12**, 1829–1843.
66. Wallenstein, S. and Neff, N. (1987). An approximation for the distribution of the scan statistic, *Statistics in Medicine*, **6**, 197–207.
67. Wolf, E. and Naus, J. (1973). Tables of critical values for a *k*-sample Kolmogorov-Smirnov test statistic, *Journal of the American Statistical Association*, **68**, 994–997.