Abstract
In practical applications, data sets are typically high dimensional, which not only slows model training but also makes the data difficult for people to analyze and understand. This problem is known as the “curse of dimensionality”. Dimensionality reduction therefore plays a key role in multidimensional data analysis: it can improve model performance and help people understand the structure of the data. These methods are widely used in fields such as finance and medicine (e.g., the analysis of adverse drug reactions). In this paper, we present a number of dimension reduction algorithms and compare their strengths and shortcomings. For more details about these algorithms, please visit our Dagoo platform at www.dagoovis.com.
Z. Sun and W. Xing are co-first authors.
Funding Acknowledgement
This paper is supported by the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (Grant No.: 2017ZT07X183), the Pearl River Talent Recruitment Program Innovative and Entrepreneurial Teams in 2017 (Grant No.: 2017ZT07X152), the Shenzhen Fundamental Research Fund (Grant Nos.: JCYJ20170306141038939, KQJSCX20170728162302784, KQTD2015033114415450 and ZDSYS201707251409055), the Department of Science and Technology of Guangdong Province Fund (2018B030338001), and the Shenzhen Science and Technology Innovation Committee (Basic Research, Free Exploration, No.: CYJ20170818104824165).
Copyright information
© 2020 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Sun, Z. et al. (2020). A Survey on Dimension Reduction Algorithms in Big Data Visualization. In: Zhang, X., Liu, G., Qiu, M., Xiang, W., Huang, T. (eds) Cloud Computing, Smart Grid and Innovative Frontiers in Telecommunications. CloudComp SmartGift 2019 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 322. Springer, Cham. https://doi.org/10.1007/978-3-030-48513-9_31
DOI: https://doi.org/10.1007/978-3-030-48513-9_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-48512-2
Online ISBN: 978-3-030-48513-9
eBook Packages: Computer Science, Computer Science (R0)