A Neighborhood-Based Clustering by Means of the Triangle Inequality

Kryszkiewicz, Marzena; Lasek, Piotr

doi:10.1007/978-3-642-15381-5_35

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6283)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

Grouping data into meaningful clusters is an important task in both artificial intelligence and data mining. Density-based algorithms, which require calculating the neighborhood of a given data point, form an important group of clustering algorithms. High-dimensional data is the bottleneck for such algorithms. In this paper, we propose a new TI-k-Neighborhood-Index algorithm that calculates k-neighborhoods for all points in a given data set by means of the triangle inequality. We show experimentally that the NBC (Neighborhood-Based Clustering) algorithm supported by our index outperforms NBC supported by known spatial indices such as the VA-file and the R-tree, for both low-dimensional and high-dimensional data.
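
The abstract does not give the details of TI-k-Neighborhood-Index, so the following Python sketch only illustrates the general idea of pruning a k-nearest-neighbor search with the triangle inequality; it should not be read as the authors' algorithm. It assumes Euclidean distance and a single reference point r (by default the per-dimension minimum of the data, which is an assumption made here). The function name `ti_k_neighborhoods` and its parameters are hypothetical. Since |d(p, r) - d(q, r)| <= d(p, q), points sorted by their distance to r can be scanned outward from q, and the scan can stop once that lower bound exceeds the current k-th nearest distance.

```python
import heapq
import math


def ti_k_neighborhoods(points, k, ref=None):
    """Map each point index to the sorted indices of its k nearest neighbors.

    Illustrative sketch only: returns exactly k neighbors per point
    (fewer if the data set is too small); ties are broken arbitrarily.
    """
    n = len(points)
    if ref is None:
        # Assumed reference point: the minimum coordinate in every dimension.
        ref = tuple(min(p[d] for p in points) for d in range(len(points[0])))
    ref_dist = [math.dist(p, ref) for p in points]
    # Indices sorted by distance to the reference point.
    order = sorted(range(n), key=ref_dist.__getitem__)

    result = {}
    for pos, q in enumerate(order):
        heap = []  # max-heap of the best k so far, stored as (-distance, index)
        lo, hi = pos - 1, pos + 1
        while lo >= 0 or hi < n:
            # Lower bounds on d(q, candidate) from the triangle inequality.
            gap_lo = ref_dist[q] - ref_dist[order[lo]] if lo >= 0 else math.inf
            gap_hi = ref_dist[order[hi]] - ref_dist[q] if hi < n else math.inf
            if gap_lo <= gap_hi:
                cand, gap = order[lo], gap_lo
                lo -= 1
            else:
                cand, gap = order[hi], gap_hi
                hi += 1
            # The smaller bound already exceeds the k-th distance, and bounds
            # only grow further out, so no remaining point can be closer.
            if len(heap) == k and gap > -heap[0][0]:
                break
            d = math.dist(points[q], points[cand])
            if len(heap) < k:
                heapq.heappush(heap, (-d, cand))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, cand))
        result[q] = sorted(i for _, i in heap)
    return result
```

For example, `ti_k_neighborhoods([(0, 0), (1, 0), (5, 0)], 1)` pairs the first two points with each other and assigns the second point as the nearest neighbor of the third. The real distance is computed only for candidates whose lower bound does not rule them out, which is where the saving over a full pairwise scan comes from.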

Author information

Authors and Affiliations

Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665, Warsaw, Poland
Marzena Kryszkiewicz & Piotr Lasek

Editor information

Editors and Affiliations

School of Computing, University of the West of Scotland, PA1 2BE, Paisley, UK
Colin Fyfe
University of Birmingham, B15 2TT, Birmingham, UK
Peter Tino
University of Ulster, Coleraine, UK
Darryl Charles
Universidad de Burgos, Burgos, Spain
Cesar Garcia-Osorio
School of Electrical and Electronic Engineering, University of Manchester, Sackville Street Building, Sackville Street, M60 1QD, Manchester, UK
Hujun Yin

About this paper

Cite this paper

Kryszkiewicz, M., Lasek, P. (2010). A Neighborhood-Based Clustering by Means of the Triangle Inequality. In: Fyfe, C., Tino, P., Charles, D., Garcia-Osorio, C., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2010. IDEAL 2010. Lecture Notes in Computer Science, vol 6283. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15381-5_35

DOI: https://doi.org/10.1007/978-3-642-15381-5_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15380-8
Online ISBN: 978-3-642-15381-5
eBook Packages: Computer Science (R0)
