ANEMI: An Adaptive Neighborhood Expectation-Maximization Algorithm with Spatial Augmented Initialization

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Abstract

The Neighborhood Expectation-Maximization (NEM) algorithm is an iterative EM-style method for clustering spatial data. Unlike the traditional EM algorithm, NEM incorporates a spatial penalty term in its objective function. The clustering performance of NEM depends mainly on two factors: the choice of the spatial coefficient, which weights the penalty term, and the initial state of cluster separation, to which the resulting clustering is sensitive. Existing NEM algorithms usually assign an equal spatial coefficient to every site, regardless of whether the site lies in the class interior or on the class border. However, when posterior probabilities are estimated, sites in the class interior should receive stronger influence from their neighbors than sites on the border. In addition, initialization methods used for EM-based clustering algorithms generally do not account for the unique properties of spatial data, such as spatial autocorrelation. As a result, they often fail to provide an initialization from which NEM can find a good solution in practice. To address both issues, this paper presents a variant of NEM, called ANEMI, which uses an adaptive spatial coefficient determined by the correlation of explanatory attributes inside the neighborhood. ANEMI also starts from the initial state returned by a spatial augmented initialization method. Experimental results on both synthetic and real-world datasets validate the effectiveness of ANEMI.
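
To make the role of the spatial coefficient concrete, the following is a minimal sketch of a NEM-style E-step for a one-dimensional Gaussian mixture. It assumes the commonly used fixed-point form in which the membership of site i in cluster k is proportional to π_k f_k(x_i) exp(β_i Σ_j c_jk), with the sum taken over the neighbors j of site i. The function names, the unweighted neighbor lists, the synchronous update order, and the hand-picked per-site values of β are all assumptions made for illustration. In particular, the sketch does not reproduce how ANEMI derives its adaptive coefficient from attribute correlation, nor its initialization method.

```python
import math


def log_gaussian(x, mean, var):
    """Log-density of a 1-D Gaussian, computed in log space to avoid underflow."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def nem_e_step(x, neighbors, weights, means, variances, beta, c, n_sweeps=10):
    """NEM-style E-step: repeated synchronous fixed-point updates of memberships.

    x         : list of 1-D observations, one per site
    neighbors : neighbors[i] lists the sites adjacent to site i (hypothetical layout)
    weights, means, variances : mixture parameters, one entry per cluster
    beta      : per-site spatial coefficients (all equal in plain NEM)
    c         : starting memberships, c[i][k] = P(site i belongs to cluster k)
    """
    n_clusters = len(means)
    for _ in range(n_sweeps):
        updated = []
        for i, xi in enumerate(x):
            scores = []
            for k in range(n_clusters):
                # Spatial term: total membership of the neighbors in cluster k.
                spatial = sum(c[j][k] for j in neighbors[i])
                scores.append(
                    math.log(weights[k])
                    + log_gaussian(xi, means[k], variances[k])
                    + beta[i] * spatial
                )
            # Normalize with the max-shift trick for numerical stability.
            top = max(scores)
            unnorm = [math.exp(s - top) for s in scores]
            total = sum(unnorm)
            updated.append([u / total for u in unnorm])
        c = updated
    return c


# Five sites on a chain; site 2 is equidistant from both cluster means.
x = [0.0, 0.0, 2.5, 0.0, 0.0]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
start = [[0.5, 0.5] for _ in x]
params = ([0.5, 0.5], [0.0, 5.0], [1.0, 1.0])  # weights, means, variances

uniform = nem_e_step(x, neighbors, *params, beta=[1.0] * 5, c=start)
# Hand-picked illustration only: a weaker coefficient at site 2.
varied = nem_e_step(x, neighbors, *params, beta=[1.0, 1.0, 0.2, 1.0, 1.0], c=start)
print(uniform[2], varied[2])
```

With all coefficients set to zero the update reduces to the ordinary EM posterior, so the ambiguous site stays evenly split between the two clusters. A positive coefficient pulls it toward the cluster of its neighbors, and a smaller per-site value weakens that pull, which is the behavior a site-dependent coefficient is meant to control.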

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hu, T., Xiong, H., Gong, X., Sung, S.Y. (2008). ANEMI: An Adaptive Neighborhood Expectation-Maximization Algorithm with Spatial Augmented Initialization. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_16

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science (R0)
