A Three-Way Decision Clustering Approach for High Dimensional Data

  • Conference paper
Rough Sets (IJCRS 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9920)

Abstract

In this paper, we propose a three-way decision clustering approach for high-dimensional data. First, we propose a three-way K-medoids clustering algorithm, which represents each cluster by three regions. Objects in the positive region of a cluster certainly belong to the cluster, objects in the negative region definitely do not belong to it, and objects in the boundary region may belong to multiple clusters. Then, we propose a novel three-way decision clustering approach based on random projection. The basic idea is to apply three-way K-medoids several times, increasing the dimensionality of the data after each run. Because the centers obtained from one projection are used as the initial centers for the next projection, the computing time is greatly reduced. Experimental results show that the proposed clustering algorithm is suitable for high-dimensional data and achieves higher accuracy without sacrificing computing time.
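The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the rule for separating positive and boundary regions, so the distance-ratio test (`ratio`), the Gaussian projection matrix, the dimension schedule `dims`, and all function names below are assumptions made only for illustration. Medoids are tracked as object indices, which is one simple way to carry the centers from one projection into the next.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def three_way_assign(points, medoids, ratio=1.5):
    """Split objects into positive and boundary regions per cluster.

    Assumed rule (not taken from the paper): an object is in the positive
    region of its nearest cluster if no other medoid lies within
    ratio * (nearest distance); otherwise it is placed in the boundary
    region of every cluster whose medoid is that close. The negative
    region of a cluster is implicit: every object in neither set.
    """
    k = len(medoids)
    positive = [set() for _ in range(k)]
    boundary = [set() for _ in range(k)]
    for i, p in enumerate(points):
        dists = [euclidean(p, m) for m in medoids]
        nearest = min(range(k), key=dists.__getitem__)
        close = [j for j in range(k) if dists[j] <= ratio * dists[nearest]]
        if len(close) == 1:
            positive[nearest].add(i)
        else:
            for j in close:
                boundary[j].add(i)
    return positive, boundary


def random_projection(points, target_dim, rng):
    """Project points with a Gaussian matrix scaled by 1/sqrt(target_dim)."""
    d = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [[rng.gauss(0, 1) * scale for _ in range(target_dim)]
              for _ in range(d)]
    return [[sum(p[r] * matrix[r][c] for r in range(d))
             for c in range(target_dim)] for p in points]


def k_medoids(points, medoid_idx, max_iter=20):
    """Plain K-medoids refinement starting from the given medoid indices."""
    medoid_idx = list(medoid_idx)
    for _ in range(max_iter):
        clusters = [[] for _ in medoid_idx]
        for i, p in enumerate(points):
            j = min(range(len(medoid_idx)),
                    key=lambda c: euclidean(p, points[medoid_idx[c]]))
            clusters[j].append(i)
        new_idx = []
        for c, members in enumerate(clusters):
            if not members:  # keep the old medoid for an empty cluster
                new_idx.append(medoid_idx[c])
                continue
            # The new medoid minimizes total distance to its cluster members.
            new_idx.append(min(members, key=lambda m: sum(
                euclidean(points[m], points[o]) for o in members)))
        if new_idx == medoid_idx:
            break
        medoid_idx = new_idx
    return medoid_idx


def iterative_three_way(points, k, dims, ratio=1.5, seed=0):
    """Run K-medoids over projections of increasing dimension, reusing medoids."""
    rng = random.Random(seed)
    medoid_idx = rng.sample(range(len(points)), k)
    projected = points
    for dim in dims:  # dims is assumed to be an increasing schedule
        projected = random_projection(points, dim, rng)
        medoid_idx = k_medoids(projected, medoid_idx)
    medoids = [projected[i] for i in medoid_idx]
    positive, boundary = three_way_assign(projected, medoids, ratio)
    return medoid_idx, positive, boundary
```

A call such as `iterative_three_way(data, k=2, dims=[2, 4, 8])` returns the medoid indices together with one positive set and one boundary set per cluster. Because each round starts from the previous round's medoids, later and more expensive rounds in higher dimensions typically need fewer refinement passes, which is the intuition behind the time saving described above.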

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61379114 & 61533020.

Author information

Corresponding author

Correspondence to Hong Yu.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Yu, H., Zhang, H. (2016). A Three-Way Decision Clustering Approach for High Dimensional Data. In: Flores, V., et al. Rough Sets. IJCRS 2016. Lecture Notes in Computer Science, vol 9920. Springer, Cham. https://doi.org/10.1007/978-3-319-47160-0_21

  • DOI: https://doi.org/10.1007/978-3-319-47160-0_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47159-4

  • Online ISBN: 978-3-319-47160-0

  • eBook Packages: Computer Science, Computer Science (R0)
