ANEMI: An Adaptive Neighborhood Expectation-Maximization Algorithm with Spatial Augmented Initialization

Hu, Tianming; Xiong, Hui; Gong, Xueqing; Sung, Sam Yuan

doi:10.1007/978-3-540-68125-0_16

Tianming Hu¹,²,
Hui Xiong³,
Xueqing Gong¹ &
…
Sam Yuan Sung⁴

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)

Abstract

The Neighborhood Expectation-Maximization (NEM) algorithm is an iterative EM-style method for clustering spatial data. Unlike the traditional EM algorithm, NEM incorporates a spatial penalty term in its objective function. The clustering performance of NEM depends mainly on two factors: the choice of the spatial coefficient, which weights the penalty term, and the initial state of cluster separation, to which the resulting clustering is sensitive. Existing NEM algorithms usually assign an equal spatial coefficient to every site, regardless of whether the site lies in the class interior or on the class border. However, when posterior probabilities are estimated, sites in the class interior should receive stronger influence from their neighbors than sites on the border. In addition, initialization methods used for EM-based clustering algorithms generally do not account for the unique properties of spatial data, such as spatial autocorrelation. As a result, they often fail to provide a proper initialization for NEM to find a good solution in practice. To address these issues, this paper presents a variant of NEM, called ANEMI, which uses an adaptive spatial coefficient determined by the correlation of explanatory attributes inside the neighborhood. ANEMI also starts from the initial state returned by the spatial augmented initialization method. Experimental results on both synthetic and real-world datasets validate the effectiveness of ANEMI.
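To make the role of the spatial coefficient concrete, the following is a minimal sketch of a NEM-style E-step, assuming the commonly used penalized form in which each site's posterior is proportional to its mixture likelihood times exp(beta times the summed posteriors of its neighbors). The function name `nem_e_step`, its parameters, and the synchronous fixed-point update are illustrative assumptions. The `beta` argument may be a scalar (equal coefficient for every site) or one value per site; the abstract does not give the rule ANEMI uses to derive per-site values from attribute correlation, so that rule is not reproduced here.

```python
import numpy as np

def nem_e_step(log_lik, V, beta, c_init, n_iter=10):
    """Hypothetical NEM-style E-step (a sketch, not the paper's code).

    log_lik : (n, K) array, log(pi_k * f_k(x_i)) for site i and class k
    V       : (n, n) neighborhood weight matrix with zero diagonal
    beta    : scalar, or (n,) array of per-site spatial coefficients
              (per-site values are an assumption standing in for an
              adaptive coefficient; how to compute them is not shown)
    c_init  : (n, K) initial posterior probabilities, rows sum to 1
    """
    n = log_lik.shape[0]
    beta = np.broadcast_to(np.asarray(beta, dtype=float), (n,))
    c = np.array(c_init, dtype=float)
    for _ in range(n_iter):
        # Spatial term: for each site, sum the neighbors' posteriors per class.
        logits = log_lik + beta[:, None] * (V @ c)
        # Subtract the row maximum before exponentiating, for stability.
        logits = logits - logits.max(axis=1, keepdims=True)
        c = np.exp(logits)
        c /= c.sum(axis=1, keepdims=True)
    return c

# Toy example: three sites on a chain; the middle site is ambiguous.
probs = np.array([[0.9, 0.1], [0.5, 0.5], [0.9, 0.1]])
V = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
plain_em = nem_e_step(np.log(probs), V, 0.0, probs)
with_penalty = nem_e_step(np.log(probs), V, 1.0, probs)
```

With `beta = 0` the update reduces to the ordinary EM posterior, while a positive coefficient pulls the ambiguous middle site toward the class favored by its neighbors. Passing a larger `beta` for interior sites than for border sites is one way to express the idea in the abstract that interior sites should be influenced more strongly by their neighborhood.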

Author information

Authors and Affiliations

East China Normal University,
Tianming Hu & Xueqing Gong
Dongguan University of Technology,
Tianming Hu
Rutgers, the State University of New Jersey,
Hui Xiong
South Texas College,
Sam Yuan Sung

Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi

About this paper

Cite this paper

Hu, T., Xiong, H., Gong, X., Sung, S.Y. (2008). ANEMI: An Adaptive Neighborhood Expectation-Maximization Algorithm with Spatial Augmented Initialization. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_16

DOI: https://doi.org/10.1007/978-3-540-68125-0_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68124-3
Online ISBN: 978-3-540-68125-0
eBook Packages: Computer Science, Computer Science (R0)
