Abstract
In this study, we propose a modified version of the relationship-based clustering framework that supports density-based clustering and outlier detection in high-dimensional datasets. The original relationship-based clustering framework is built on METIS, which gives it two drawbacks: it cannot detect outliers, and the number of clusters is difficult to determine. We propose two improvements to the framework. First, we introduce a new space consisting of tiny partitions created by METIS, which we call the micro-partition space. Second, we use DBSCAN to cluster the micro-partition space. The results are visualized with CLUSION. Our experiments show that the proposed framework produces promising results on high-dimensional datasets.
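The two-stage idea in the abstract (over-partition first, then run a density-based clustering over the partitions) can be illustrated with a small sketch. This is not the authors' implementation: METIS partitions a similarity graph, whereas the hypothetical `micro_partitions` helper below uses simple grid bucketing as a stand-in so that the example needs only the Python standard library. The function names, the `cell`, `eps`, and `min_pts` parameters, and the toy 2-D data are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def micro_partitions(points, cell):
    """Stand-in for the METIS step (assumption): bucket points into grid cells.

    Returns one centroid per non-empty cell and the point indices in that cell.
    """
    buckets = defaultdict(list)
    for i, p in enumerate(points):
        buckets[tuple(int(c // cell) for c in p)].append(i)
    dim = len(points[0])
    centroids, members = [], []
    for key in sorted(buckets):
        idx = buckets[key]
        centroids.append(tuple(sum(points[i][d] for i in idx) / len(idx)
                               for d in range(dim)))
        members.append(idx)
    return centroids, members


def dbscan(items, eps, min_pts):
    """Plain DBSCAN with brute-force neighbour search; label -1 marks noise."""
    labels = [None] * len(items)  # None means "not visited yet"

    def neighbours(i):
        # Includes i itself, as in the usual DBSCAN definition.
        return [j for j in range(len(items)) if dist(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisional noise; may become a border item later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border item
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:  # j is a core item: keep expanding
                queue.extend(j_nbrs)
    return labels


def cluster_via_micro_partitions(points, cell, eps, min_pts):
    """Cluster the micro-partition centroids, then map labels back to points."""
    centroids, members = micro_partitions(points, cell)
    part_labels = dbscan(centroids, eps, min_pts)
    point_labels = [None] * len(points)
    for label, idx in zip(part_labels, members):
        for i in idx:
            point_labels[i] = label
    return point_labels


# Toy data: two dense groups and one isolated point.
points = [
    (0.2, 0.2), (0.4, 0.6), (1.2, 0.3), (1.5, 0.5), (0.3, 1.4), (1.6, 1.5),
    (10.2, 10.2), (10.5, 10.5), (11.3, 10.4), (10.4, 11.2),
    (5.5, 20.5),
]
labels = cluster_via_micro_partitions(points, cell=1.0, eps=1.5, min_pts=2)
print(labels)  # [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, -1]
```

In this sketch the number of clusters is not supplied in advance, and the isolated point ends up labelled as noise, which mirrors the two drawbacks the abstract attributes to the METIS-based framework. Grid bucketing would not scale to high-dimensional data; it only keeps the example short.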
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Bilgin, T.T., Camurcu, A.Y. (2007). A Modified Relationship Based Clustering Framework for Density Based Clustering and Outlier Filtering on High Dimensional Datasets. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_40
DOI: https://doi.org/10.1007/978-3-540-71701-0_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer Science, Computer Science (R0)