Emergent Complexity from Nonlinearity, in Physics, Engineering and the Life Sciences, pp. 127-141

# Hebbian Learning Clustering with Rulkov Neurons

## Abstract

The recent explosion of high-dimensional, high-resolution ‘big data’ from automated bioinformatics measurement techniques demands new methods for unsupervised data processing. An essential analysis step is the identification of groups of similar data, or ‘clusters’, in noisy high-dimensional data spaces, since this allows some analysis steps to be performed at the group level. Popular clustering algorithms introduce an undesired cluster shape bias, require prior knowledge of the number of clusters, and are unable to deal properly with noise. Manual data gating, often used to assist these methods, is based on low-dimensional projection techniques, which are prone to obscuring the underlying data structure. While Hebbian Learning Clustering overcomes all of these limitations (by using only local similarities to infer global structure), previous implementations were unsuited to big data sets. Here, we present a novel implementation based on realistic neuronal dynamics that also removes this obstacle. Because its performance scales favourably compared with all standard clustering algorithms, unbiased analysis of large data sets becomes feasible on standard desktop hardware.
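The abstract does not spell out the "realistic neuronal dynamics", but the title names Rulkov neurons. As a rough illustration only, the following Python sketch iterates one commonly quoted piecewise form of Rulkov's two-dimensional map for a single, uncoupled neuron. The function names, parameter values (`alpha`, `mu`, `sigma`) and initial conditions are assumptions chosen for illustration; the abstract does not state which variant or parameters the chapter uses, and no Hebbian coupling or clustering step is shown here.

```python
def rulkov_step(x, y, alpha, mu, sigma):
    """Advance the fast variable x and the slow variable y by one iteration.

    Assumed piecewise form (illustrative, not taken from the abstract):
      x <= 0            -> alpha / (1 - x) + y
      0 < x < alpha + y -> alpha + y
      otherwise         -> -1
    """
    if x <= 0.0:
        x_next = alpha / (1.0 - x) + y   # smooth branch for non-positive x
    elif x < alpha + y:
        x_next = alpha + y               # x jumps to the plateau value
    else:
        x_next = -1.0                    # x is reset to -1
    # Slow variable drifts at rate mu; sigma acts as a constant drive.
    y_next = y - mu * (x + 1.0) + mu * sigma
    return x_next, y_next


def simulate(n_steps, alpha=4.0, mu=0.001, sigma=0.1, x0=-1.0, y0=-3.0):
    """Return the sequence of x values over n_steps iterations (hypothetical defaults)."""
    xs = []
    x, y = x0, y0
    for _ in range(n_steps):
        x, y = rulkov_step(x, y, alpha, mu, sigma)
        xs.append(x)
    return xs


if __name__ == "__main__":
    trace = simulate(1000)
    print(len(trace), min(trace), max(trace))
```

Because the model is a map rather than a differential equation, each neuron update costs only a few arithmetic operations, which is one plausible reason such dynamics suit large data sets; the abstract itself only reports the favourable scaling.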

## Keywords

Coupling Strength, Data Item, Hebbian Learning, Human Bone Marrow Cell, Average Synchrony