Abstract
Many scientific computing applications still rely on algorithms provided by large legacy codes or libraries, which rarely take advantage of multi-core architectures and are hardly ever distributed. In this paper we propose a flexible strategy for executing such legacy codes and identify the main modules involved in the process. We describe the key technologies involved and provide a tentative implementation, which helps to clarify the challenges and limitations surrounding this problem. Finally, we present a case study of a large-scale, single-threaded, stochastic geostatistical simulation in the context of mining and geological modeling applications. We report successful execution, running time, and speedup results on a workstation cluster of up to eleven nodes.
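As a rough illustration of the general idea only, and not the implementation described in the paper, the sketch below farms out independent runs of an unmodified single-threaded program to several concurrent processes and defines the usual speedup ratio. Everything named here is hypothetical: `LEGACY_CMD` is a stand-in for a real legacy executable, and `run_realization`, `run_all`, and `speedup` are illustrative helpers. It runs on a single machine; a cluster deployment would additionally need a messaging or job-distribution layer, which is not shown.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a single-threaded legacy executable: it takes a
# seed on the command line and prints one number. A real legacy code would
# be invoked the same way, with its own arguments and parameter files.
LEGACY_CMD = [
    sys.executable,
    "-c",
    "import sys, random; random.seed(int(sys.argv[1])); print(random.random())",
]


def run_realization(seed: int) -> float:
    """Run one independent realization of the legacy program as a subprocess."""
    result = subprocess.run(
        LEGACY_CMD + [str(seed)],
        capture_output=True,
        text=True,
        check=True,  # raise if the legacy program exits with an error
    )
    return float(result.stdout.strip())


def run_all(seeds, workers: int):
    """Farm the realizations out to `workers` concurrent subprocesses."""
    # The threads only wait on child processes, so the work itself runs in
    # separate OS processes and the legacy code stays untouched.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_realization, seeds))  # order follows `seeds`


def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup: serial running time divided by parallel running time."""
    return t_serial / t_parallel
```

For example, `run_all(range(8), workers=4)` launches eight seeded runs, at most four at a time, and returns their outputs in seed order. Timing the same call with `workers=1` and with `workers=4` gives the two running times that `speedup` compares.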
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Navarro, F., González, C., Peredo, Ó., Morales, G., Egaña, Á., Ortiz, J.M. (2014). A Flexible Strategy for Distributed and Parallel Execution of a Monolithic Large-Scale Sequential Application. In: Hernández, G., et al. High Performance Computing. CARLA 2014. Communications in Computer and Information Science, vol 485. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45483-1_5
DOI: https://doi.org/10.1007/978-3-662-45483-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45482-4
Online ISBN: 978-3-662-45483-1
eBook Packages: Computer Science, Computer Science (R0)