Randomized algorithms of maximum likelihood estimation with spatial autoregressive models for large-scale networks

  • Miaoqi Li
  • Emily L. Kang


The spatial autoregressive (SAR) model is a classical model in spatial econometrics and has become an important tool in network analysis. However, with large-scale networks, existing methods of likelihood-based inference for the SAR model become computationally infeasible. Here, we investigate maximum likelihood estimation for the SAR model with partially observed responses from large-scale networks. By taking advantage of recent developments in randomized numerical linear algebra, we derive efficient algorithms to estimate the spatial autocorrelation parameter in the SAR model. Results from extensive simulations and real data examples show that the estimator obtained by our method, called the randomized maximum likelihood estimator, outperforms the state of the art by giving smaller bias and standard error, especially for large-scale problems with moderate spatial autocorrelation. The theoretical properties of the estimator are explored, and consistency results are established.
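
To illustrate why randomization helps here: in the standard, fully observed SAR model y = ρWy + Xβ + ε, the Gaussian log-likelihood contains the term log det(I − ρW), and evaluating that term exactly is costly for large networks. The sketch below is not the authors' algorithm. It is a minimal, generic example of a randomized log-determinant estimate, in the spirit of the Monte Carlo approach of Barry and Pace (reference 4), that combines a truncated Taylor series with a Hutchinson-type trace estimator. The function names, the ring-lattice weight matrix, and all parameter values are assumptions chosen for illustration, and a dense matrix is used only to keep the example short.

```python
import numpy as np


def ring_lattice_weights(n):
    """Row-normalized weights for a ring: each node has two neighbours (toy example)."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx + 1) % n] = 0.5
    W[idx, (idx - 1) % n] = 0.5
    return W


def logdet_exact(W, rho):
    """Exact log det(I - rho * W) via a dense factorization (cubic cost in n)."""
    n = W.shape[0]
    _, value = np.linalg.slogdet(np.eye(n) - rho * W)
    return value


def logdet_randomized(W, rho, num_probes=500, num_terms=30, seed=0):
    """Randomized estimate of log det(I - rho * W).

    Uses log det(I - rho W) = -sum_{k>=1} rho^k tr(W^k) / k, which holds when
    the spectral radius of rho * W is below 1, and estimates each trace with
    Rademacher probe vectors z through tr(A) ~ mean(z^T A z).
    """
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    Z = rng.choice([-1.0, 1.0], size=(n, num_probes))  # probe vectors as columns
    V = Z.copy()
    total = 0.0
    for k in range(1, num_terms + 1):
        V = W @ V  # V now holds W^k Z; only matrix-vector products are needed
        trace_k = np.sum(Z * V) / num_probes  # Hutchinson estimate of tr(W^k)
        total -= rho**k * trace_k / k
    return total


if __name__ == "__main__":
    W = ring_lattice_weights(200)
    rho = 0.5
    print("exact      :", logdet_exact(W, rho))
    print("randomized :", logdet_randomized(W, rho))
```

The randomized version touches W only through products with the probe matrix, so with a sparse W its cost grows with the number of edges rather than with the cube of the number of nodes. The estimate is noisy, and its accuracy depends on the number of probes and on how close |ρ| is to 1.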


Keywords: Maximum likelihood estimation; Network; Randomized numerical linear algebra; Spatial autoregressive model



We would like to thank the associate editor and two reviewers of Statistics and Computing for their insightful comments that greatly improved this work. Li’s work is partially supported by the Henry Laws Fellowship Award and the Taft Research Center at the University of Cincinnati. Kang’s research is partially supported by the Simons Foundation Collaboration Award (#317298) and the Taft Research Center at the University of Cincinnati. This work was supported in part by an allocation of computing time from the Ohio Supercomputer Center (OSC 1987). We would like to thank Dr. Shan Ba, Dr. Won Chang, Dr. Noel Cressie, Dr. Alex B. Konomi, and Dr. Siva Sivaganesan for their helpful suggestions.


  1. Anselin, L., Bera, A.K.: Spatial dependence in linear regression models with an introduction to spatial econometrics. Stat. Textb. Monogr. 155, 237–290 (1998)
  2. Banerjee, S., Gelfand, A.E., Finley, A.O., Sang, H.: Gaussian predictive process models for large spatial data sets. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 70(4), 825–848 (2008)
  3. Banerjee, S., Carlin, B.P., Gelfand, A.E.: Hierarchical Modeling and Analysis for Spatial Data. CRC Press, Boca Raton (2014)
  4. Barry, R.P., Pace, R.K.: Monte Carlo estimates of the log determinant of large sparse matrices. Linear Algebra Appl. 289(1–3), 41–54 (1999)
  5. Beck, N., Gleditsch, K.S., Beardsley, K.: Space is more than geography: Using spatial econometrics in the study of political economy. Int. Stud. Q. 50(1), 27–44 (2006)
  6. Boutsidis, C., Drineas, P., Kambadur, P., Kontopoulou, E.M., Zouzias, A.: A randomized algorithm for approximating the log determinant of a symmetric positive definite matrix. arXiv preprint arXiv:1503.00374 (2015)
  7. Browne, K.: Snowball sampling: using social networks to research non-heterosexual women. Int. J. Soc. Res. Methodol. 8(1), 47–60 (2005)
  8. Burden, S., Cressie, N., Steel, D.G.: The SAR model for very large datasets: a reduced rank approach. Econometrics 3(2), 317–338 (2015)
  9. Chen, X., Chen, Y., Xiao, P.: The impact of sampling and network topology on the estimation of social intercorrelations. J. Market. Res. 50(1), 95–110 (2013)
  10. Cressie, N., Johannesson, G.: Fixed rank kriging for very large spatial data sets. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 70(1), 209–226 (2008)
  11. Darmofal, D.: Spatial Analysis for the Social Sciences. Cambridge University Press, Cambridge (2015)
  12. Doreian, P.: Estimating linear models with spatially distributed data. Sociol. Methodol. 12, 359–388 (1981)
  13. Doreian, P., Freeman, L., White, D., Romney, A.: Models of network effects on social actors. In: Research Methods in Social Network Analysis, pp. 295–317 (1989)
  14. Fujimoto, K., Chou, C.P., Valente, T.W.: The network autocorrelation model using two-mode data: affiliation exposure and potential bias in the autocorrelation parameter. Soc. Netw. 33(3), 231–243 (2011)
  15. Guruswami, V., Sinop, A.K.: Optimal column-based low-rank matrix reconstruction. In: Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, pp. 1207–1214 (2012)
  16. Haggett, P.: Hybridizing alternative models of an epidemic diffusion process. Econ. Geogr. 52(2), 136–146 (1976)
  17. Lee, L.F.: Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 72(6), 1899–1925 (2004)
  18. Lee, L.F., Liu, X.: Efficient GMM estimation of high order spatial autoregressive models with autoregressive disturbances. Econ. Theory 26(1), 187–230 (2010)
  19. Lee, L., Yu, J.: Estimation of spatial autoregressive panel data models with fixed effects. J. Econ. 154(2), 165–185 (2010)
  20. Lee, L.F., Liu, X., Lin, X.: Specification and estimation of social interaction models with network structures. Econ. J. 13(2), 145–176 (2010)
  21. Leenders, R.T.: Modeling social influence through network autocorrelation: constructing the weight matrix. Soc. Netw. 24(1), 21–47 (2002)
  22. LeSage, J., Pace, R.K.: Introduction to Spatial Econometrics. Chapman and Hall, Boca Raton (2009)
  23. LeSage, J.P., Pace, R.K.: Models for spatially dependent missing data. J. Real Estate Financ. Econ. 29(2), 233–254 (2004)
  24. Leskovec, J., Krevl, A.: SNAP Datasets: Stanford Large Network Dataset Collection (2014)
  25. Lichstein, J.W., Simons, T.R., Shriner, S.A., Franzreb, K.E.: Spatial autocorrelation and autoregressive models in ecology. Ecol. Monogr. 72(3), 445–463 (2002)
  26. Lin, X., Lee, L.F.: GMM estimation of spatial autoregressive models with unknown heteroskedasticity. J. Econ. 157(1), 34–52 (2010)
  27. Mahoney, M.W., et al.: Randomized algorithms for matrices and data. Found. Trends® Mach. Learn. 3(2), 123–224 (2011)
  28. O’Malley, A.J.: The analysis of social network data: an exciting frontier for statisticians. Stat. Med. 32(4), 539–555 (2013)
  29. Ord, K.: Estimation methods for models of spatial interaction. J. Am. Stat. Assoc. 70(349), 120–126 (1975)
  30. OSC: Ohio Supercomputer Center. Columbus, OH: Ohio Supercomputer Center. (1987). Accessed 21 Dec 2018
  31. Pace, R.K., Barry, R.: Sparse spatial autoregressions. Stat. Probab. Lett. 33(3), 291–297 (1997)
  32. Rasmussen, C., Williams, C.: Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge (2006)
  33. Robins, G.: A tutorial on methods for the modeling and analysis of social network data. J. Math. Psychol. 57(6), 261–274 (2013)
  34. Robins, G., Pattison, P., Elliott, P.: Network models for social influence processes. Psychometrika 66(2), 161–189 (2001)
  35. Shao, J.: Mathematical Statistics. Springer, New York (2003)
  36. Smirnov, O., Anselin, L.: Fast maximum likelihood estimation of very large spatial autoregressive models: a characteristic polynomial approach. Comput. Stat. Data Anal. 35(3), 301–319 (2001)
  37. Smirnov, O.A.: Computation of the information matrix for models with spatial interaction on a lattice. J. Comput. Graph. Stat. 14(4), 910–927 (2005)
  38. Stewart, G.: Four algorithms for the efficient computation of truncated pivoted QR approximations to a sparse matrix. Numer. Math. 83(2), 313–323 (1999)
  39. Suesse, T.: Estimation of spatial autoregressive models with measurement error for large data sets. Comput. Stat. 33(4), 1627–1648 (2018)
  40. Suesse, T.: Marginal maximum likelihood estimation of SAR models with missing data. Comput. Stat. Data Anal. 120, 98–110 (2018)
  41. Suesse, T., Chambers, R.: Using social network information for survey estimation. J. Off. Stat. 34(1), 181–209 (2018)
  42. Suesse, T., Zammit-Mangion, A.: Computational aspects of the EM algorithm for spatial econometric models with missing data. J. Stat. Comput. Simul. 87(9), 1767–1786 (2017)
  43. Sun, D., Tsutakawa, R.K., Speckman, P.L.: Posterior distribution of hierarchical models using CAR(1) distributions. Biometrika 86(2), 341–350 (1999)
  44. Wang, S., Luo, L., Zhang, Z.: SPSD matrix approximation vis column selection: theories, algorithms, and extensions. J. Mach. Learn. Res. 17(49), 1–49 (2016)
  45. Wang, W., Lee, L.F.: Estimation of spatial autoregressive models with randomly missing data in the dependent variable. Econ. J. 16(1), 73–102 (2013)
  46. Whittle, P.: On stationary processes in the plane. Biometrika 41, 434–449 (1954)
  47. Woodruff, D.P., et al.: Sketching as a tool for numerical linear algebra. Found. Trends® Theor. Comput. Sci. 10(1–2), 1–157 (2014)
  48. Zhou, J., Tu, Y., Chen, Y., Wang, H.: Estimating spatial autocorrelation with sampled network data. J. Bus. Econ. Stat. 35(1), 130–138 (2017)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Mathematical Sciences, University of Cincinnati, Cincinnati, USA
