Abstract
Stochastic proximity embedding (SPE) is a simple, fast, and scalable algorithm for generating low-dimensional Euclidean coordinates for a set of data points so that they satisfy a prescribed set of geometric constraints. Like other related methods, SPE starts with a random initial configuration and iteratively refines it by updating the positions of the data points so as to minimize the violation of the input constraints. However, instead of minimizing all violations at once using a standard gradient minimization technique, SPE stochastically optimizes one constraint at a time, in a manner reminiscent of back-propagation in artificial neural networks. Here, we review the underlying theory that gives rise to the SPE formulation and show how it can be successfully applied to a wide range of problems in data analysis, with particular emphasis on computational chemistry and biology.
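The pairwise refinement described in the abstract can be illustrated with a minimal sketch. It assumes the commonly published form of the SPE update, in which a randomly selected pair of points is moved along the line connecting them in proportion to the difference between their current and target distances, while a learning rate decays over time. The function names (`spe_embed`, `stress`), the parameter defaults, the linear decay schedule, and the stress measure are illustrative assumptions and are not taken from the chapter.

```python
import math
import random


def spe_embed(target, n_points, dim=2, n_cycles=100, n_steps=2000,
              lam_start=1.0, lam_end=0.01, eps=1e-10, seed=0):
    """Sketch of an SPE-style pairwise refinement (illustrative only).

    target: dict mapping (i, j) index pairs to the desired distance.
    Returns a list of n_points coordinate lists, each of length dim.
    """
    rng = random.Random(seed)
    pairs = list(target.items())
    # Random initial configuration inside the unit hypercube.
    x = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    lam = lam_start
    # Assumed schedule: linear decay of the learning rate per cycle.
    decay = (lam_start - lam_end) / max(n_cycles - 1, 1)
    for _ in range(n_cycles):
        for _ in range(n_steps):
            # Pick one constraint at random and correct only that pair.
            (i, j), r = rng.choice(pairs)
            d = math.dist(x[i], x[j])
            # Each point absorbs half of the correction; eps avoids
            # division by zero when the two points coincide.
            scale = lam * 0.5 * (r - d) / (d + eps)
            for k in range(dim):
                delta = scale * (x[i][k] - x[j][k])
                x[i][k] += delta
                x[j][k] -= delta
        lam -= decay
    return x


def stress(x, target):
    """Sum of squared constraint violations (assumed quality measure)."""
    return sum((math.dist(x[i], x[j]) - r) ** 2
               for (i, j), r in target.items())


if __name__ == "__main__":
    # Hypothetical input: the three side lengths of a 3-4-5 triangle.
    triangle = {(0, 1): 3.0, (0, 2): 4.0, (1, 2): 5.0}
    coords = spe_embed(triangle, n_points=3)
    print(coords, stress(coords, triangle))
```

Because each step touches a single pair, the cost per update is independent of the number of points, which is the property the abstract contrasts with minimizing all violations at once.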
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Agrafiotis, D.K., Bandyopadhyay, D., Yang, E. (2013). Stochastic Proximity Embedding: A Simple, Fast and Scalable Algorithm for Solving the Distance Geometry Problem. In: Mucherino, A., Lavor, C., Liberti, L., Maculan, N. (eds) Distance Geometry. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5128-0_14
DOI: https://doi.org/10.1007/978-1-4614-5128-0_14
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-5127-3
Online ISBN: 978-1-4614-5128-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)