A Modified Relationship Based Clustering Framework for Density Based Clustering and Outlier Filtering on High Dimensional Datasets

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4426)

Abstract

In this study, we propose a modified version of the relationship-based clustering framework that supports density-based clustering and outlier detection in high-dimensional datasets. The original relationship-based clustering framework is built on METIS and therefore has some drawbacks: it does not detect outliers, and the number of clusters is difficult to determine. We propose two improvements to the framework. First, we introduce a new space consisting of tiny partitions created by METIS, which we call the micro-partition space. Second, we use DBSCAN to cluster the micro-partition space. The results are visualized with CLUSION. Our experiments show that the proposed framework produces promising results on high-dimensional datasets.
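The two-stage idea in the abstract (build many tiny partitions, then run a density-based pass over them so that isolated partitions fall out as noise) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the micro-partition space is represented, so this sketch assumes each micro-partition is summarized by its centroid. METIS is replaced by a simple greedy nearest-neighbour grouping, and DBSCAN is a small textbook version written with the standard library only. All function names and parameters (`micro_partition`, `part_size`, `eps`, `min_pts`) are hypothetical.

```python
import math

NOISE = -1  # label given to outliers

def micro_partition(points, part_size):
    """Stand-in for METIS (assumption): greedily group each seed point
    with its nearest still-unassigned neighbours into tiny partitions."""
    unassigned = list(range(len(points)))
    parts = []
    while unassigned:
        seed = unassigned[0]
        unassigned.sort(key=lambda i: math.dist(points[seed], points[i]))
        parts.append(unassigned[:part_size])
        unassigned = unassigned[part_size:]
    return parts

def centroid(points, members):
    """Represent a micro-partition by the mean of its members (assumption)."""
    dim = len(points[0])
    return tuple(sum(points[i][d] for i in members) / len(members)
                 for d in range(dim))

def dbscan(items, eps, min_pts):
    """Textbook DBSCAN; a point's neighbourhood includes the point itself."""
    labels = [None] * len(items)

    def neighbours(i):
        return [j for j in range(len(items))
                if math.dist(items[i], items[j]) <= eps]

    next_id = 0
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster, next_id = next_id, next_id + 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: reachable, not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels

def cluster_with_outliers(points, part_size, eps, min_pts):
    """Cluster micro-partition centroids, then copy labels back to points."""
    parts = micro_partition(points, part_size)
    centres = [centroid(points, members) for members in parts]
    part_labels = dbscan(centres, eps, min_pts)
    labels = [NOISE] * len(points)
    for members, label in zip(parts, part_labels):
        for i in members:
            labels[i] = label
    return labels

# Toy data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.5, 1.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5), (10.5, 11.5),
          (50, 50)]
labels = cluster_with_outliers(points, part_size=2, eps=2.0, min_pts=2)
print(labels)
```

On this toy input the two dense groups receive two different cluster labels and the isolated point at (50, 50) is labelled as noise. The number of clusters is never passed in; it follows from `eps` and `min_pts`, which is the property the abstract contrasts with the METIS-only framework.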

Editor information

Zhi-Hua Zhou, Hang Li, Qiang Yang

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Bilgin, T.T., Camurcu, A.Y. (2007). A Modified Relationship Based Clustering Framework for Density Based Clustering and Outlier Filtering on High Dimensional Datasets. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_40

  • DOI: https://doi.org/10.1007/978-3-540-71701-0_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71700-3

  • Online ISBN: 978-3-540-71701-0

  • eBook Packages: Computer Science, Computer Science (R0)
