Varying Density Spatial Clustering Based on a Hierarchical Tree

Hu, Xuegang; Wang, Dongbo; Wu, Xindong

doi:10.1007/978-3-540-73499-4_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4571)

Included in the following conference series:

International Workshop on Machine Learning and Data Mining in Pattern Recognition

Abstract

As data volumes grow, efficient and high-quality clustering of high-dimensional data is increasingly needed. Density-based clustering is an effective approach, and its representative algorithm, DBSCAN, can find clusters of arbitrary shape and handle noise. However, DBSCAN also has drawbacks: high time cost, difficult parameter tuning, and an inability to handle varying densities. This paper presents a new clustering algorithm, VDSCHT (Varying Density Spatial Clustering Based on a Hierarchical Tree), which constructs a hierarchical tree to describe subclusters and to tune the local parameter dynamically. Density-based clustering is then applied by detecting adjacent spaces of the tree. Both theoretical analysis and experimental results indicate that VDSCHT retains the advantages of density-based clustering while also tuning the local parameter dynamically to deal with varying densities. In addition, it requires only one scan of the database, which makes it suitable for mining large-scale databases.
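
The varying-density limitation mentioned in the abstract can be illustrated with a minimal sketch of plain DBSCAN. This is not VDSCHT, whose tree construction is not described here. The function names, the toy one-dimensional layout, and the parameter values below are assumptions chosen for illustration only. The sketch uses a single global radius `eps`: a small radius separates two nearby dense groups but marks the sparse group as noise, while a large radius recovers the sparse group but merges the dense ones.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN with one global (eps, min_pts); returns one label per point."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels


# Hypothetical toy data: two dense groups (spacing 1, separated by a gap of 5)
# and one sparse group (spacing 10), all on a line.
dense_a = [(x, 0) for x in range(0, 5)]
dense_b = [(x, 0) for x in range(9, 14)]
sparse_c = [(x, 0) for x in range(100, 150, 10)]
points = dense_a + dense_b + sparse_c

small_eps_labels = dbscan(points, eps=1.5, min_pts=3)
large_eps_labels = dbscan(points, eps=15, min_pts=3)
print(small_eps_labels)
print(large_eps_labels)
```

With this layout, no single `eps` recovers all three groups: the sparse group needs a radius of at least 10 to form core points, while separating the two dense groups needs a radius below 5. This is the kind of situation that motivates tuning the parameter locally rather than globally.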

Author information

Authors and Affiliations

School of Computer Science and Information Engineering, Hefei University of Technology, Anhui 230009, China
Xuegang Hu, Dongbo Wang & Xindong Wu
Department of Computer Science, University of Vermont, Burlington, VT 05405, USA
Xindong Wu

Editor information

Petra Perner

Cite this paper

Hu, X., Wang, D., Wu, X. (2007). Varying Density Spatial Clustering Based on a Hierarchical Tree. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2007. Lecture Notes in Computer Science, vol 4571. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73499-4_15

DOI: https://doi.org/10.1007/978-3-540-73499-4_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73498-7
Online ISBN: 978-3-540-73499-4
eBook Packages: Computer Science, Computer Science (R0)
