Skip to main content

Scan Statistics on Graphs and Networks

  • Living reference work entry
  • First Online:
Book cover Handbook of Scan Statistics
  • 383 Accesses

Abstract

This article summarizes modern research of scan statistics on graphs and networks. These statistics arise naturally in the scanning of time and space looking for clusters of anomalous entities or events. We review theories and methodologies of constructing scan statistics for both static and dynamic graphs, in both purely spatial and spatio temporal frameworks. Computation of graph-structured scan statistics is challenging, and usually leads to NP-hard problems. We also review several popular convex approximation algorithms for computing scan statistics in this article.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Notes

  1. 1.

    The chromatic number is the smallest number of colors needed to color a graph such that no two adjacent vertices share the same color.

  2. 2.

    The likelihood ratio test is known to be the uniformly most powerful test according to the Neyman-Pearson Lemma.

References

  • Alon N, Krivelevich M (1997) The concentration of the chromatic number of random graphs. Combinatorica 17:303–313

    Article  MathSciNet  MATH  Google Scholar 

  • Appel M, Russo R (1997) The maximum vertex degree of a graph on uniform points in [0, 1]d. Adv Appl Probab 29:567–581

    MathSciNet  MATH  Google Scholar 

  • Arias-Castro E, Grimmett G (2013) Cluster detection in networks using percolation. Bernoulli 19:676–719

    Article  MathSciNet  MATH  Google Scholar 

  • Arias-Castro E, Donoho D, Huo X (2005) Near-optimal detection of geometric objects by fast multiscale methods. IEEE Trans Inf Theory 51:2402–2425

    Article  MathSciNet  MATH  Google Scholar 

  • Arias-Castro E, Donoho D, Huo X (2006) Adaptive multiscale detection of filamentary structures in a background of uniform random points. Ann Stat 34:326–349

    Article  MathSciNet  MATH  Google Scholar 

  • Arias-Castro E, Candés E, Helgason H, Zeitouni O (2008) Searching for a trail of evidence in a maze. Ann Stat 36:1726–1757

    Article  MathSciNet  MATH  Google Scholar 

  • Arias-Castro E, Candés E, Durand A (2011) Detection of an anomalous cluster in a network. Ann Stat 39:278–304

    Article  MathSciNet  MATH  Google Scholar 

  • Arias-Castro E, Candés E, Durand A (2011) Supplement to “Detection of an Anomalous Cluster in a Network.” https://doi.org/10.1214/10-AOS839SUPP

  • Arora A, Dutta P, Bapat S, Kulathumani V, Zhang H, Naik V, Mittal V, Cao H, Demirbas M, Gouda M, Choi Y, Herman T, Kulkarni S, Arumugam U, Nesterenko M, Vora A, Miyashita M (2004) A Line in the sand: a wireless sensor network for target detection, classification, and tracking. Comput Netw 46:605–634

    Article  Google Scholar 

  • Arratia R, Goldstein L, Gordon L (1989) Two moments suffice for poisson approximations: the Chen-Stein method. Ann Probab 17:9–25

    Article  MathSciNet  MATH  Google Scholar 

  • Auer P, Hornik K (1994) On the number of points of a homogeneous poisson process. J Multivar Anal 48:115–156.

    Article  MathSciNet  MATH  Google Scholar 

  • Auer P, Hornik K, Révész P (1991) Some limit theorems for the homogeneous poisson process. Stat Probab Lett 12:91–96

    Article  MathSciNet  MATH  Google Scholar 

  • Barbour A, Månsson M (2000) Compound poisson approximation and the clustering of random points. Adv Appl Probab 32:19–38

    Article  MathSciNet  MATH  Google Scholar 

  • Balakrishnan N, Koutras MV (2002) Runs and scans with applications. Wiley, New York

    MATH  Google Scholar 

  • Berge C (1989) Hypergraphs: combinatorics of finite sets. North-Holland mathematical library, vol 45. North-Holland, Amsterdam

    Google Scholar 

  • Berk R, Jones D (1979) Goodness-of-fit test statistics that dominate the Kolmogorov statistics. Z Wahrsch Verw Gebiete 47:47–59

    Article  MathSciNet  MATH  Google Scholar 

  • Bondy J, Murty U (1976) Graph theory with applications. American Elsevier Publishing Co Inc, New York

    Book  MATH  Google Scholar 

  • Bondy J, Murty U (2008) Graph theory. Graduate texts in mathematics, vol 244. Springer, New York

    Google Scholar 

  • Candés E, Charlton P, Helgason H (2008) Detecting highly oscillatory signals by chirplet path pursuit. Appl Comput Harmon Anal 24:14–40

    Article  MathSciNet  MATH  Google Scholar 

  • Chen F, Neil D (2014) Non-parametric scan statistics for event detection and forecasting in heterogeneous social media graphs. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, New York, pp 1166–1175

    Google Scholar 

  • Cressie N (1977) On some properties of the scan statistic on the circle and the line. J Appl Probab 14:272–283

    Article  MathSciNet  MATH  Google Scholar 

  • Cressie N (1993) Statistics for spatial data. Wiley series in probability and mathematical statistics: applied probability and statistics. Wiley, New York

    Google Scholar 

  • Dembo A, Gandolfi A, Kesten H (2001) Greedy lattice animals: negative values and unconstrained maxima. Ann Probab 29:205–241

    Article  MathSciNet  MATH  Google Scholar 

  • Diestel R (2010) Graph theory. Graduate texts in mathematics, vol 173, 4th edn. Springer, Heidelberg

    Google Scholar 

  • Duczmal L, Kulldorff M, Huang L (2006) Evaluation of spatial statistics for irregularly shaped cluster. J Comput Graph Stat 15:428–442

    Article  MathSciNet  Google Scholar 

  • Erdös P, Rényi A (1959) On random graphs. I. Publ Math Debrecen 6:290–297

    MathSciNet  MATH  Google Scholar 

  • Fu, JC, Lou WYW (2003) Distribution theory of runs and patterns and its applications. World Scientific Publishing Co., Inc., River Edge

    Book  MATH  Google Scholar 

  • Gilbert E (1959) Random graphs. Ann Math Stat 30:1141–1144

    Article  MATH  Google Scholar 

  • Glaz J, Balakrishnan N (1999) Scan statistics and applications. Statistics for industry and technology. Birkhäuser, Boston

    Book  MATH  Google Scholar 

  • Glaz J, Naus J, Wallenstein S (2001) Scan Statistics. Springer series in statistics. Springer, New York

    Book  MATH  Google Scholar 

  • Glaz J, Pozdnyakov V, Wallenstein S (2009) Scan statistics: methods and applications. Statistics for industry and technology. Birkhäuser, Boston

    Book  MATH  Google Scholar 

  • Grothendieck J, Priebe C, Gorin A (2010) Statistical inference on attributed random graphs: fusion of graph features and content. Comput Stat Data Anal 54:1777–1790

    Article  MathSciNet  MATH  Google Scholar 

  • Guerriero M, Willett P, Glaz J (2009) Distributed target detection in sensor networks using scan statistics. IEEE Trans Signal Process 57:2629–2639

    Article  MathSciNet  Google Scholar 

  • Hall P, Jin J (2010) Innovated higher criticism for detecting sparse signals in correlated noise. Ann Stat 38:1686–1732

    Article  MathSciNet  MATH  Google Scholar 

  • Johannsen D, Marchette D (2012) Betti numbers of graphs with an application to anomaly detection. Stat Anal Data Min 5:235–242

    Article  MathSciNet  Google Scholar 

  • Kullback S, Leibler R (1951) On information and sufficiency. Ann Math Stat 22:79–86

    Article  MathSciNet  MATH  Google Scholar 

  • Kulldorff M (2001) Prospective time periodic geographical disease surveillance using a scan statistic. J R Stat Soc Ser A 164:61–72

    Article  MathSciNet  MATH  Google Scholar 

  • Li D, Wong K, Hu Y, Sayeed A (2002) Detection, classification and tracking of targets. IEEE Signal Process Mag 19:17–29

    Google Scholar 

  • Łuczak T (1991) A note on the sharp concentration of the chromatic number of random graphs. Combinatorica 11:295–297

    Article  MathSciNet  MATH  Google Scholar 

  • Månsson M (1999) Poisson approximation in connection with clustering of random points. Ann Appl Probab 9:465–492

    Article  MathSciNet  Google Scholar 

  • Marchette D (2012) Scan statistics on graphs. WIREs Comput Stat 4:466–473

    Article  Google Scholar 

  • Matula D (1972) The employee party problem. Not Am Math Soc 19:A-382

    Google Scholar 

  • Müller T (2008) Two-point concentration in random geometric graphs. Combinatorica 28:529–545

    Article  MathSciNet  MATH  Google Scholar 

  • Naus J (1963) Clustering of random points in the line and plane. Ph.D. Thesis, Harvard University, Cambridge

    Google Scholar 

  • Naus J (1965a) The distribution of the size of the maximum cluster of points on a line. J Am Stat Assoc 60:532–538

    Article  MathSciNet  Google Scholar 

  • Naus J (1965b) Clustering of random points in two dimensions. Biometrika 52:263–267

    Article  MathSciNet  MATH  Google Scholar 

  • Neil J (2011) Scan Statics for the online detection of locally anomalous subgraphs. Doctoral Dissertation, University of New Mexico, Albuquerque

    Google Scholar 

  • Neil J, Hash C, Brugh A, Fish M, Storlie C (2013) Scan statistics for the online detection of locally anomalous subgraphs. Technometrics 55:403–414

    Article  MathSciNet  Google Scholar 

  • Newman M (2003) The structures and function of complex networks. SIAM Rev 45:167–256

    Article  MathSciNet  MATH  Google Scholar 

  • Newman M, Barabási A, Watts D (2006) The structure and dynamics of networks. Princeton University Press, Princeton

    MATH  Google Scholar 

  • Pao H, Coppersmith G, Priebe C (2011) Statistical inference on random graphs: comparative power analyses via Monte Carlo. J Comput Graph Stat 20:395–461

    Article  MathSciNet  Google Scholar 

  • Park Y, Priebe C, Marchette D, Youssef A (2009) Anomaly detection using scan statistics on time series hypergraphs. In: Workshop on link analysis, SIAM data mining conference. Sparks, Nevada

    Google Scholar 

  • Penrose M (1997) The longest edge of the random minimal spanning tree. Ann Appl Probab 7:340–361

    Article  MathSciNet  MATH  Google Scholar 

  • Penrose M (2002) Focusing of the scan statistic and geometric clique number. Adv Appl Probab 34:739–753

    Article  MathSciNet  MATH  Google Scholar 

  • Penrose M (2003) Random geometric graphs. Oxford studies in probability, vol 5. Oxford University, Oxford

    Google Scholar 

  • Priebe C, Conroy J, Marchette D, Park Y (2005) Scan statistics on enron graphs. Comput Math Organ Theory 11:229–247

    Article  MATH  Google Scholar 

  • Priebe C, Park Y, Marchette D, Conroy J, Grothendieck J, Gorin A (2010) Statistical inference on attributed random graphs: fusion of graph features and content: an experiment on time series of Enron graphs. Comput Stat Data Anal 54:1766–1776

    Article  MathSciNet  MATH  Google Scholar 

  • Rukhin A (2011) Asymptotic analysis of various statistics for random graph inference. Doctoral Dissertation, Johns Hopkins University, Baltimore

    Google Scholar 

  • Salamatian K, Vaton S (2000) Hidden Markov modeling for network communication channels. ACM SIGMETRICS Perfom Eval Rev 29:92–101

    Article  Google Scholar 

  • Sharpnack J (2016) Detecting anomalous activity on networks with the graph fourier scan Statistic. IEEE Trans Signal Process 64:364–378

    Article  MathSciNet  Google Scholar 

  • Sharpnack J, Krishnamurthy A, Singh A (2013a) Detecting activations over graphs using spanning tree wavelet bases. J Mach Learn Res 31:536–544

    Google Scholar 

  • Sharpnack J, Krishnamurthy A, Singh A (2013b) Near-optimal anomaly detection in graphs using Lov’asz extended scan statistic. In: Proceedings of the 26th international conference on NIPS, Lake Tahoe, pp 1959–1967

    Google Scholar 

  • Sharpnack J, Rinaldo A, Singh A (2013c) Changepoint detection over graphs with the spectral scan statistics. In: Proceedings of AISTATS, Scottsdale, pp 545–553

    Google Scholar 

  • Song X, Willett P, Glaz J, Zhou S (2012) Active detection with a barrier sensor network using a scan statistic. IEEE J Oceanic Eng 37:66–74

    Article  Google Scholar 

  • Walther G (2010) Optimal and fast detection of spatial clusters with scan statistics. Ann Stat 38:1010–1033

    Article  MathSciNet  MATH  Google Scholar 

  • Wang T-C, Phoa F (2016) A scanning method for detecting clustering pattern of both attribute and structure in social networks. Physica A 445:295–309

    Article  Google Scholar 

  • Wang B, Phillips J, Schreiber R, Wilkinson D, Mishra N, Tarjan R (2008) Spatial scan statistics for graph clustering. In: Proceedings of the 2008 SIAM international conference on data mining, Atlanta, pp 727–738

    Google Scholar 

  • Wang H, Tang M, Park Y, Priebe C (2014) Locality statistics for anomaly detection in time series of graphs. IEEE Trans Signal Process 62:703–717

    Article  MathSciNet  Google Scholar 

  • Wang T-C, Phoa F, Hsu T-C (2015) Power-law distributions of attributes in community detection. Soc Netw Anal Min 5:45

    Article  Google Scholar 

  • Wong L, Pattison P, Robins G (2006) A spatial model for social networks. Physica A 360:99–120

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Panpan Zhang .

Editor information

Editors and Affiliations

Section Editor information

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Science+Business Media LLC

About this entry

Check for updates. Verify currency and authenticity via CrossMark

Cite this entry

Zhang, P., Glaz, J. (2018). Scan Statistics on Graphs and Networks. In: Glaz, J., Koutras, M. (eds) Handbook of Scan Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8414-1_43-1

Download citation

  • DOI: https://doi.org/10.1007/978-1-4614-8414-1_43-1

  • Received:

  • Accepted:

  • Published:

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-8414-1

  • Online ISBN: 978-1-4614-8414-1

  • eBook Packages: Springer Reference MathematicsReference Module Computer Science and Engineering

Publish with us

Policies and ethics