Joseph Naus: Father of the Scan Statistic

  • Sylvan Wallenstein
Part of the Statistics for Industry and Technology book series (SIT)


The literature on the scan statistic is now vast, growing exponentially in diverse directions, with contributions by many researchers and groups. As time goes on, the early history of the problem bears telling. Joseph Naus, the father of the scan statistic, originated the modern work on the topic. The process took almost twenty years to reach maturity; I have chosen Naus (1982) as the marker of this maturity. The very name “scan statistic” does not appear to have become attached to the problem for fifteen years, and the interconnections within what is now regarded as one problem, in both the statement of the problem and the common methods of solution, were far from obvious originally. This chapter will not attempt a full review of all of Naus’s statistical contributions, or even a full review of his contributions as they concern the scan statistic. Instead, it will focus on a few themes that had already originated in Naus’s first twenty years of written research (1962–1982), and briefly follow those threads to the present. Since these early themes include such general issues as applications of the scan statistic, mentoring graduate students, and specific methodological issues, the review will encompass a significant portion of Dr. Naus’s research, without claiming to be exhaustive regarding either his research or the much broader body of research on the scan statistic that he influenced.


Keywords: Exact Probability, Multiple Coverage, Poisson Approximation, Generalized Likelihood Ratio Test, Fixed Grid


  1. Balakrishnan, N. and Koutras, M.V. (2002). Runs and Scans with Applications, Wiley, New York.
  2. Barton, D.E. and Mallows, C.L. (1965). Some aspects of the random sequence, Annals of Mathematical Statistics, 36, 236–260.
  3. Berg, W. (1945). Aggregates in one- and two-dimensional random distributions, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 36, 319–336.
  4. Burnside, W. (1928). Theory of Probability, Cambridge University Press, Cambridge.
  5. Cressie, N. (1977). On some properties of the scan statistic on the circle and the line, Journal of Applied Probability, 14, 272–283.
  6. Cressie, N. (1979). An optimal statistic based on higher order gaps, Biometrika, 66, 619–627.
  7. Ederer, F., Myers, M.H., and Mantel, N. (1964). A statistical problem in space and time: Do leukemia cases come in clusters? Biometrics, 20, 626–636.
  8. Elteren, Van P.H. and Gerrits, H.J.M. (1961). Een wachtprobleem voorkomende bij drempelwaardemetingen aan het oog, Statistica Neerlandica, 15, 385–401.
  9. Erdös, P. and Rényi, A. (1970). On a new law of large numbers, Journal d’Analyse Mathématique, 23, 103–111.
  10. Feller, W. (1958). An Introduction to Probability Theory and its Applications, Vol. I, 2nd Edition, John Wiley & Sons, New York.
  11. Fisher, R.A. (1959). Statistical Methods and Scientific Inference, Hafner, New York.
  12. Fu, J.C. and Lou, W.Y.W. (2003). Distribution Theory of Runs and Patterns and Its Applications, World Scientific, Singapore.
  13. Glaz, J. (1978). Multiple Coverage and Clusters on the Line, Ph.D. thesis, Rutgers University, New Brunswick, NJ.
  14. Glaz, J. and Balakrishnan, N., Editors (1999). Scan Statistics and Applications, Birkhäuser, Boston, MA.
  15. Glaz, J. and Naus, J. (1979). Multiple coverage of the line, Annals of Probability, 7, 900–906.
  16. Glaz, J. and Naus, J. (1983). Multiple clusters on the line, Communications in Statistics: Theory and Methods, 12, 1961–1986.
  17. Glaz, J. and Naus, J. (1986). Approximating probabilities of first passage in a particular Gaussian process, Communications in Statistics, 15, 1709–1722.
  18. Glaz, J. and Naus, J. (1991). Tight bounds and approximations for scan statistic probabilities for discrete data, Annals of Applied Probability, 1, 306–318.
  19. Glaz, J. and Naus, J. (2005). Scan Statistics and Applications, Encyclopedia of Statistical Sciences, 2nd Edition, S. Kotz, N. Balakrishnan, C.B. Read and B. Vidacovic, eds., 7463–7471, Wiley, New York.
  20. Glaz, J., Naus, J., Roos, M., and Wallenstein, S. (1994). Poisson approximations for the distribution and moments of ordered m-spacings, Journal of Applied Probability, 31A, 271–281.
  21. Glaz, J., Naus, J., and Wallenstein, S. (2001). Scan Statistics, Springer-Verlag, New York.
  22. Greenberg, M., Naus, J., Schneider, D., and Wartenberg, D. (1991). Temporal clustering of homicide and suicide among 15–24 year old white and black Americans, Ethnicity and Disease, 1, 342–350.
  23. Huntington, R. and Naus, J.I. (1975). A simpler expression for kth nearest neighbor coincidence probabilities, Annals of Probability, 3, 894–896.
  24. Hwang, F.K. (1977). A generalization of the Karlin-McGregor theorem on coincidence probabilities and an application to clustering, Annals of Probability, 5, 814–817.
  25. Ikeda, S. (1965). On Bouman-Velden-Yamamoto’s asymptotic evaluation formula for the probability of visual response in a certain experimental research in quantum biophysics of vision, Annals of the Institute of Statistics and Mathematics, 17, 295–310.
  26. Karlin, S. and McGregor, G. (1959). Coincidence probabilities, Pacific Journal of Mathematics, 9, 1141–1164.
  27. Karwe, V.V. and Naus, J. (1997). New recursive methods for scan statistic probabilities, Computational Statistics and Data Analysis, 23, 389–402.
  28. Kulldorff, M. (1997). A spatial scan statistic, Communications in Statistics: Theory and Methods, 26, 1481–1496.
  29. Kulldorff, M. (2001). Prospective time-periodic geographical disease surveillance using a scan statistic, Journal of the Royal Statistical Society A, 164, 61–72.
  30. Kulldorff, M. and Williams, G. (1997). SaTScan v. 1.0, Software for the Space and Space-Time Scan Statistics, National Cancer Institute, Bethesda, MD.
  31. Loader, C. (1991). Large deviation approximations to the distribution of scan statistics, Advances in Applied Probability, 23, 751–771.
  32. Mack, C. (1948). An exact formula for Q_k(n), the probable number of k-aggregates in a random distribution of n points, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 39, 778–790.
  33. Mack, C. (1950). The expected number of aggregates in a random distribution of n points, Proceedings of the Cambridge Philosophical Society, 46, 285–292.
  34. Menon, M.V. (1964). Clusters in a Poisson process [abstract], Annals of Mathematical Statistics, 35, 1395.
  35. Naus, J. (1962). The distribution of the maximum number of points on the line, ASD Paper 8.
  36. Naus, J. (1963). Clustering of Random Points in the Line and Plane, Ph.D. thesis, Rutgers University, New Brunswick, NJ.
  37. Naus, J. (1965a). The distribution of the size of the maximum cluster of points on a line, Journal of the American Statistical Association, 60, 532–538.
  38. Naus, J. (1965b). Clustering of random points in two dimensions, Biometrika, 52, 263–267.
  39. Naus, J. (1966a). A power comparison of two tests of non-random clustering, Technometrics, 8, 493–517.
  40. Naus, J. (1966b). Some probabilities, expectations, and variances for the size of largest clusters, and smallest intervals, Journal of the American Statistical Association, 61, 1191–1199.
  41. Naus, J. (1968). An extension of the birthday problem, American Statistician, 22, 27–29.
  42. Naus, J. (1974). Probabilities for a generalized birthday problem, Journal of the American Statistical Association, 69, 810–815.
  43. Naus, J. (1979). An indexed bibliography of clusters, clumps and coincidences, International Statistical Review, 47, 47–78.
  44. Naus, J. (1982). Approximations for distributions of scan statistics, Journal of the American Statistical Association, 77, 177–183.
  45. Naus, J. (1988). Scan statistics, Encyclopedia of Statistical Sciences, Vol. 8, 281–284, N.L. Johnson and S. Kotz, eds., Wiley, New York.
  46. Naus, J. (2006). Scan Statistics, Handbook of Engineering Statistics, H. Pham, ed., Chapter 43, 775–790, Springer-Verlag, New York.
  47. Naus, J. and Sheng, K.N. (1996). Screening for unusual matched segments in multiple protein sequences, Communications in Statistics: Simulation and Computation, 25, 937–952.
  48. Naus, J. and Sheng, K.N. (1997). Matching among multiple random sequences, Bulletin of Mathematical Biology, 59, 483–496.
  49. Naus, J. and Wartenberg, D. (1997). A double scan statistic for clusters of two types of events, Journal of the American Statistical Association, 92, 1105–1113.
  50. Naus, J. and Wallenstein, S. (2004). Simultaneously testing for a range of cluster or scanning window sizes, Methodology and Computing in Applied Probability, 6, 389–400.
  51. Naus, J. and Wallenstein, S. (2006). Temporal surveillance using scan statistics, Statistics in Medicine, 25, 311–324.
  52. Neff, N. and Naus, J. (1980). The distribution of the size of the maximum cluster of points on a line, IMS Series of Selected Tables in Mathematical Statistics, Vol. VI, AMS, Providence, RI.
  53. Newell, G.F. (1963). Distribution for the smallest distance between any pair of kth nearest-neighbor random points on a line, Time series analysis, Proceedings of a conference held at Brown University, M. Rosenblatt editor, pp. 89–103, John Wiley & Sons, New York.
  54. Ozols, V. (1956). Generalization of the theorem of Gnedenko-Korolyuk to three samples in the case of two one-sided boundaries, Latvijas PSR Zinatnu Akad. Vestis, 10 (111), 141–152.
  55. Parzen, E. (1960). Modern Probability Theory and its Applications, John Wiley & Sons, New York.
  56. Rabinowitz, L. and Naus, J. (1975). The expectation and variance of the number of components in random linear graphs, Annals of Probability, 3, 159–161.
  57. Samuel-Cahn, E. (1983). Simple approximations to the expected waiting time for a cluster of any given size for point processes, Advances in Applied Probability, 15, 21–38.
  58. Saperstein, B. (1972). The generalized birthday problem, Journal of the American Statistical Association, 67, 425–428.
  59. Sheng, K.N. and Naus, J. (1994). Pattern matching between two non-aligned random sequences, Bulletin of Mathematical Biology, 56, 1143–1162.
  60. Sheng, K.N. and Naus, J. (1996). Matching fixed rectangles in 2-dimensions, Statistics and Probability Letters, 26, 83–90.
  61. Silberstein, L. (1945). The probable number of aggregates in random distributions of points, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 36, 319–336.
  62. Takacs, L. (1961). On a coincidence problem concerning particle counters, Annals of Mathematical Statistics, 32, 739–756.
  63. Wallenstein, S.R. and Naus, J. (1973). Probabilities for the kth nearest neighbor problem on the line, Annals of Probability, 1, 188–190.
  64. Wallenstein, S. and Naus, J. (1974). Probabilities for the size of largest clusters and smallest intervals, Journal of the American Statistical Association, 69, 690–697.
  65. Wallenstein, S., Naus, J., and Glaz, J. (1993). Power of the scan statistic for the detection of clustering, Statistics in Medicine, 12, 1829–1843.
  66. Wallenstein, S. and Neff, N. (1987). An approximation for the distribution of the scan statistic, Statistics in Medicine, 6, 197–207.
  67. Wolf, E. and Naus, J. (1973). Tables of critical values for a k-sample Kolmogorov-Smirnov test statistic, Journal of the American Statistical Association, 68, 994–997.

Copyright information

© Birkhäuser Boston, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Sylvan Wallenstein (1)
  1. Department of Community and Preventive Medicine, New York, USA
