Abstract
In recent years, high-performance computing (HPC) simulation programs have been widely used to solve complex problems in a variety of computational science and engineering disciplines. When those programs are shared on an online platform, many users can easily run their simulations on the platform as long as they are connected to the web. However, repetitive simulations from users have placed a significant burden on the platform’s limited computing and storage resources. To address this inefficiency in simulation execution, we propose a big data service framework based on past simulation records. Such records are called provenances, which capture various properties of a simulation. By utilizing the provenances, the platform can perform simulations more efficiently via duplicate elimination and assist users with enhanced simulation services such as result prediction, execution-time estimation, and input-parameter clustering.
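The duplicate-elimination idea can be illustrated with a minimal sketch. It is not the framework's actual implementation: the `ProvenanceStore` class, the `provenance_key` helper, and the toy simulation are hypothetical names introduced here. It assumes a provenance record can be identified by the program name together with its input parameters.

```python
import hashlib
import json


def provenance_key(program, params):
    """Build a stable key from the program name and its input parameters.

    Assumption: two runs are duplicates when program and parameters match.
    """
    # sort_keys makes the key independent of parameter ordering
    canonical = json.dumps({"program": program, "params": params}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ProvenanceStore:
    """Hypothetical in-memory store of past simulation records."""

    def __init__(self):
        self._records = {}
        self.executions = 0  # number of simulations actually executed

    def run(self, program, params, simulate):
        key = provenance_key(program, params)
        if key in self._records:
            # Duplicate request: reuse the stored result instead of re-running
            return self._records[key]["result"]
        result = simulate(**params)
        self.executions += 1
        self._records[key] = {"program": program, "params": params, "result": result}
        return result


def toy_simulation(length, voltage):
    # Stand-in for an expensive HPC job
    return length * voltage


store = ProvenanceStore()
first = store.run("toy", {"length": 2, "voltage": 3}, toy_simulation)
# Same parameters in a different order: served from the provenance record
second = store.run("toy", {"voltage": 3, "length": 2}, toy_simulation)
```

In this sketch the second request never reaches the simulator, which is the saving that duplicate elimination aims for. A real platform would presumably persist the records in a database instead of a dictionary.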
Acknowledgement
This research was supported by Kyungpook National University Research Fund, 2017.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Suh, YK., Lee, K.Y., Baek, N. (2019). PEGASEF: A Provenance-Based Big Data Service Framework for Efficient Simulation Execution on Shared Computing Clusters. In: Lee, W., Leung, C. (eds) Big Data Applications and Services 2017. BIGDAS 2017. Advances in Intelligent Systems and Computing, vol 770. Springer, Singapore. https://doi.org/10.1007/978-981-13-0695-2_17
DOI: https://doi.org/10.1007/978-981-13-0695-2_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0694-5
Online ISBN: 978-981-13-0695-2
eBook Packages: Intelligent Technologies and Robotics (R0)