A Survey on Dimension Reduction Algorithms in Big Data Visualization

  • Conference paper
Cloud Computing, Smart Grid and Innovative Frontiers in Telecommunications (CloudComp 2019, SmartGift 2019)

Abstract

In practical applications, the data sets we deal with are typically high dimensional, which not only slows model training but also makes the data difficult for people to analyze and understand. This problem is known as “the curse of dimensionality”. Dimensionality reduction therefore plays a key role in multidimensional data analysis: it can improve model performance and help people understand the structure of the data. These methods are widely used in fields such as finance and medicine (e.g., the study of adverse drug reactions). In this paper, we present a number of dimension reduction algorithms and compare their strengths and shortcomings. For more details about these algorithms, please visit our Dagoo platform at www.dagoovis.com.
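
As a rough illustration of what a dimension reduction algorithm does, the sketch below projects three-dimensional points onto a single axis using principal component analysis (PCA), a standard linear method. The choice of PCA, the power-iteration solver, the function names `top_principal_component` and `project`, and the synthetic data are all assumptions made for this example; they are not taken from the paper.

```python
import math
import random


def top_principal_component(rows, iterations=200, seed=0):
    """Return (mean, unit direction) of the first principal component."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to the eigenvector with the largest eigenvalue.
    rng = random.Random(seed)
    v = [rng.uniform(-1, 1) for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(rows, mean, direction):
    """Map each row to its 1-D coordinate along the given direction."""
    d = len(mean)
    return [sum((r[j] - mean[j]) * direction[j] for j in range(d)) for r in rows]


# Synthetic (hypothetical) data: points near the line spanned by (1, 2, -1).
noise = random.Random(1)
rows = [[t + noise.gauss(0, 0.05),
         2 * t + noise.gauss(0, 0.05),
         -t + noise.gauss(0, 0.05)] for t in range(20)]

mean, direction = top_principal_component(rows)
coords = project(rows, mean, direction)  # one number per 3-D point
```

Here each three-dimensional point is reduced to one coordinate along the direction of greatest variance, which for this data lies close to the line the points were generated from.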

Z. Sun and W. Xing—Co-first author.

Funding Acknowledgement

This paper is supported by the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (Grant No.: 2017ZT07X183), the Pearl River Talent Recruitment Program Innovative and Entrepreneurial Teams in 2017 (Grant No.: 2017ZT07X152), the Shenzhen Fundamental Research Fund (Grant Nos.: JCYJ20170306141038939, KQJSCX20170728162302784, KQTD2015033114415450 and ZDSYS201707251409055), the Department of Science and Technology of Guangdong Province Fund (2018B030338001), and the Shenzhen Science and Technology Innovation Committee (Basic Research (Free Exploration), No.: CYJ20170818104824165).

Corresponding author

Correspondence to Shenghui Cheng.

Copyright information

© 2020 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Sun, Z. et al. (2020). A Survey on Dimension Reduction Algorithms in Big Data Visualization. In: Zhang, X., Liu, G., Qiu, M., Xiang, W., Huang, T. (eds) Cloud Computing, Smart Grid and Innovative Frontiers in Telecommunications. CloudComp 2019, SmartGift 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 322. Springer, Cham. https://doi.org/10.1007/978-3-030-48513-9_31

  • DOI: https://doi.org/10.1007/978-3-030-48513-9_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-48512-2

  • Online ISBN: 978-3-030-48513-9

  • eBook Packages: Computer Science, Computer Science (R0)
