Skip to main content

A Geospatial Implementation of a Novel Delineation Clustering Algorithm Employing the K-means

  • Chapter

Part of the book series: Lecture Notes in Geoinformation and Cartography ((LNGC))

The overarching objective of this paper is to introduce a novel Fast, Efficient, and Scalable k-means k-means (FES-k-means*) algorithm. This algorithm is designed to increase the overall performance of the standard k-means clustering technique. The FES-k-means* algorithm uses a hybrid approach that comprises the k-d tree data structure, the nearest neighbor query, the standard k-means algorithm, and Mashor’s adaptation rate. The algorithm is tested using two real datasets and two synthetic datasets and is employed twice on all four datasets. The first trial consisted of previously MIL-SOM* trained data, and the second was on raw, untrained data. The approach presented with this method enables unfounded knowledge discovery, otherwise unclaimed by conventional clustering methods. When used in conjunction with the MIL-SOM* training technique, theFES-k-means* algorithm reduces the computation time and produces quality clusters. In particular, the robust FES-k-means* method opens doors to (1) faster cluster production than conventional clustering methods, (2) scalability allowing application in other platforms, and its ability to handle small and large datasets, compact or scattered, and (3) efficient geospatial data analysis of large datasets. All of the above makes FES-k-means* live up to defending its well-deserved name—Fast, Efficient, and Scalable k-means (FES-k-means*). The findings of this study are vital to the relatively new and expanding subfield of geospatial data management.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   129.00
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD   169.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  • Alsabti, K., S. Ranka, V. Singh. (1998). An efficient k-means clustering algorithm. Proceedings in IPPS: 11th International Parallel Processing Symposium Workshop on High Performance Data Mining, IEEE, Computer Society Press.

    Google Scholar 

  • Bacć ´o, F., Lobo, V., and M. Painho. (2004) Geo-Self-Organizing Map (Geo-SOM) for Building and Exploring Homogeneous Regions. In Egenhofer, M.J., Freskes, C. and Miller, H.J., (eds.) Geographical Information Science. Proceedings of Third International Conference, GIScience. Adelphi, MD, USA. October 20–23, 2004. Springer-Verlay Berlin Heidelberg.

    Google Scholar 

  • Ball, G. H., and D. J. Hall. (1967) A clustering technique for summarizing multivariate data. Behavioral Sciences. 12(2):153–55.

    Article  Google Scholar 

  • Birant, D. and A. Kut. (2007) ST-DBSCAN: An algorithm for clustering spatial-temporal data. Data & Knowledge Engineering. Elsevier. 601(1): 208–221.

    Google Scholar 

  • Chinrungrueng C., and C. H. Sequin. (1995) Optimal adaptive k-means algorithm with dynamic adjustment of learning rate. IEEE Transactions on neural networks. 6(1): 157–69.

    Article  Google Scholar 

  • Ester, M., H.P. Kriegel, J. Sander, and X. Xu. (1996). A density based algorithm for discovering clusters in large geospatial databases with noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, Oregon.

    Google Scholar 

  • Guha, S., R. Rastogi, and K. Shim. (1998) Cure: An efficient clustering algorithm for large databases. In Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD, Seattle, WA, June). ACM Press, New York, NY.

    Google Scholar 

  • Jain, A.K., M.N. Murty, and P.J. Flynn. (1999). Data clustering: a review. ACM Computing Surveys. 31(3): 264–323.

    Article  Google Scholar 

  • Kanungo, T., D.M. Mount, N. S. Netanyahu, et al. (2002). An efficient k-means clustering algorithm: analysis and implementation. IEEE Transactions on pattern analysis and machine intelligence. 24(7): 881–92.

    Article  Google Scholar 

  • Karypis, G., E.H Han, V. Kumar. (1999) Chameleon: Hierarchical clustering using dynamic modeling. IEEE Computer Special Issue on Data Analysis and Mining. 32(8): 68–75.

    Google Scholar 

  • Kim, K. and E. Yamashita (2007)Using a k-means clustering algorithm to examine patterns of pedestrian involved crashes in Honolulu, Hawaii. Journal of Advanced Transportation. 41(1): 60–89.

    Article  Google Scholar 

  • Kohonen, T. (1990) The self-organizing map. Proceedings of the IEEE. 78(9): 1464–80.p

    Google Scholar 

  • Krishna, K. and M. Narasimha. (1999) Genetic k-means algorithm. IEEE Transactions on Systems, Man and Cybernetics, Part B. 29(3): 433–439.

    Article  Google Scholar 

  • Kruskal, J.B. (1964) Non-metric multidimensional scaling: a numerical method. Psychometrika 29: 115–29.

    Article  Google Scholar 

  • Likas A., N. Vlassis, and J.J. Verbeek. (2003) The global k-means clustering algorithm. Pattern Recognition. 36: 451–61.

    Article  Google Scholar 

  • MacQueen, J.B. (1967) Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Statistical Laboratory, University of California. 282–97.

    Google Scholar 

  • Mashor, M.Y. (1998) Improving the performance of k-means clustering algorithm to position the centers of RBF network. International Journal of the Computer, The Internet and Management. 6(2).

    Google Scholar 

  • Ng, R. and J. Han. (1994) Efficient and Effective Clustering methods for spatial data mining. In Proceedings of the 20th International Conference of ery Large Data Bases (VLDB), September 1994, Santiago, Chile. 144–55.

    Google Scholar 

  • Oyana TJ, Achenie LEK, Cuadros-Vargas E, Rivers PA, and Scott KE. (2006) A Mathematical Improvement of the Self-Organizing Map Algorithm. Chapter 8: ICT and Mathematical Modelling (pp 522–531). In Mwakali J.A. and Taban-Wani G. (eds.): Advance in Engineering and Technology, London: Elsevier Ltd. 847.

    Google Scholar 

  • Pham D.T., S.S. Dimov, C.D. Nguyen. (2004) A two-phase k-means algorithm for large datasets. Proc. Instn Mech. Engrs, Part C: Journal Mechanical Engineering Science. 218(C). 1269–73.

    Google Scholar 

  • Roussopoulos N., S. Kelley, F. Vincent. (1995) Nearest Neighbor Queries. In Proceedings of Special Interest Group on Management of data (SIGMOD), San Jose, CA, USA. 71–79.

    Google Scholar 

  • Sammon, J. W. (1969) A nonlinear mapping for data structure analysis. IEEE Transactions on Computers. 18(5): 401–09.

    Article  Google Scholar 

  • Song, M. and S. Rajasekaran. (2005) Finding frequent item sets by transaction mapping. Symposium on Applied Computing. 488–492.

    Google Scholar 

  • Steinley, D. (2006) K-means clustering: A half-century synthesis. British Journal of Mathematical and Statistical Psychology. 59: 1–34.

    Article  Google Scholar 

  • Su, M. and H. T. Chang. (2000) Fast self-organizing feature map algorithm. IEEE transactions on neural network. 11(3): 721–33.

    Article  Google Scholar 

  • Theodoridis, S. and K. Koutroumbas. (2003) Pattern Recognition (2ed). Elsevier Science (USA). China machine press. 431–535.

    Google Scholar 

  • Tsai, C.F., C.W. Tsai, H.C. Wu, and T. Yang.(2004) ACODF: a novel data clustering approach for data mining in large databases. Journal of Systems and Software. 73(1): 133–145.

    Article  Google Scholar 

  • Vesanto, J. and E. Alhoniemi. (2000) Clustering of the Self-Organizing Map. IEEE Transactions on Neural Networks. 11(3): 586–600.

    Article  Google Scholar 

  • Vrahatis, M. N., B. Boutsinas, P. Alevizos, and G. Pavlides. (2002) The new k-windows algorithm for improving the k-means clustering algorithm. Journal of Complexity. 18. 375–91.

    Article  Google Scholar 

  • Wang, L., L. Bo, L. Jiao. (2006) A modified k-means clustering with a density-sensitive distance metric. Lecture notes in Computer Science. Springer Berlin , Heidelber. 4062: 544–551.

    Google Scholar 

  • Xu, R and D. Wunsch II. (2005) Survey of Clustering Algorithms. IEEE Transactions on Neural Networks. 16(3): 645–78.

    Article  Google Scholar 

  • Zhang T., R. Ramakrishnan, M. Livny. (1996) BIRCH: An efficient data clustering method for very large databases. Proc. ACM SIGMOD Conf. Management of Data. 103–14.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Oyana, T.J., Scott, K.E. (2008). A Geospatial Implementation of a Novel Delineation Clustering Algorithm Employing the K-means . In: Bernard, L., Friis-Christensen, A., Pundt, H. (eds) The European Information Society. Lecture Notes in Geoinformation and Cartography. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78946-8_8

Download citation

Publish with us

Policies and ethics