Clustering Methods for Spherical Data: An Overview and a New Generalization
Recent advances in data acquisition technologies have led to massive amounts of data being collected routinely in the information sciences and technology, as well as in the engineering sciences. In this big data era, cluster analysis is a fundamental and crucial step in exploring structures and patterns in massive data sets, where the objects to be clustered (the data) are represented as vectors. Often such high-dimensional vectors are \(L_2\) normalized so that they lie on the surface of the unit hypersphere, transforming them into spherical data. Clustering such data is therefore equivalent to grouping spherical data, where cosine similarity or correlation, rather than a Euclidean metric, is the desired measure for identifying similar observations. In this chapter, an overview of the clustering methods for spherical data available in the literature is presented. A model-based generalization for asymmetric spherical data is also introduced.
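To make the normalization step concrete, the following is a minimal sketch in plain Python, not code from the chapter. The function names (`l2_normalize`, `cosine_similarity`, `pearson_correlation`) and the toy vectors are illustrative assumptions. It shows that \(L_2\) normalization places a vector on the unit hypersphere, that cosine similarity is the inner product of the normalized vectors, and that Pearson correlation can be computed as the cosine similarity of mean-centered vectors.

```python
import math

def l2_normalize(v):
    """Scale v to unit Euclidean length so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]

def cosine_similarity(u, v):
    """Inner product of the L2-normalized versions of u and v."""
    u_hat, v_hat = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u_hat, v_hat))

def pearson_correlation(u, v):
    """Cosine similarity after subtracting each vector's own mean."""
    mu_u = sum(u) / len(u)
    mu_v = sum(v) / len(v)
    return cosine_similarity([x - mu_u for x in u], [y - mu_v for y in v])

def squared_euclidean(u, v):
    """Squared Euclidean distance between u and v."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Toy vectors (illustrative only): a and b share a direction, c is orthogonal to both.
a = [3.0, 4.0, 0.0]
b = [6.0, 8.0, 0.0]
c = [0.0, 0.0, 2.0]

a_hat = l2_normalize(a)
unit_length = math.sqrt(sum(x * x for x in a_hat))  # length after normalization
sim_ab = cosine_similarity(a, b)                    # same direction
sim_ac = cosine_similarity(a, c)                    # orthogonal directions
```

In this sketch `a` and `b` differ only in length, so they map to the same point on the sphere, which is the sense in which normalization discards magnitude and keeps direction. For unit vectors the squared Euclidean distance equals \(2 - 2\cos\theta\), so `squared_euclidean(l2_normalize(u), l2_normalize(v))` agrees with `2 - 2 * cosine_similarity(u, v)` up to rounding.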