UMDISW: A Universal Multi-Domain Intelligent Scientific Workflow Framework for the Whole Life Cycle of Scientific Data

  • Qi Sun
  • Yue Liu (email author)
  • Wenjie Tian
  • Yike Guo
  • Bocheng Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11459)


Existing scientific data management systems rarely manage scientific data from a whole-life-cycle perspective, even though the value-creating steps defined throughout that cycle essentially constitute a scientific workflow. The scientific workflow systems developed by many organizations meet their own domain-specific needs well, but no common framework covers scientific data across multiple domains. In addition, some systems require scientists to understand the system's underlying implementation, which increases their workload and research costs. In this context, this paper proposes a universal multi-domain intelligent scientific data processing workflow framework (UMDISW). The framework builds a general model that can be used in multiple domains by defining directed graphs and descriptors, and it makes the underlying layer transparent to scientists so that they can focus on high-level experimental design. On this basis, the paper also treats scientific data as a driving force, incorporating a mechanism that intelligently recommends algorithms into the workflow to reduce the workload of scientific experiments and to provide decision support for exploring new scientific discoveries.
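The idea of modelling a workflow as a directed graph whose nodes carry descriptors can be illustrated with a minimal Python sketch. It is not the UMDISW implementation: the names `StepDescriptor` and `run_workflow`, the descriptor fields, and the three toy steps are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict, List


@dataclass
class StepDescriptor:
    """Hypothetical descriptor: domain-neutral metadata for one workflow step."""
    name: str
    func: Callable[..., Any]
    inputs: List[str] = field(default_factory=list)  # names of upstream steps


def run_workflow(steps: List[StepDescriptor]) -> Dict[str, Any]:
    """Execute the steps in dependency order and return every step's result."""
    by_name = {s.name: s for s in steps}
    # Map each node to its predecessors; the sorter yields predecessors first
    # and raises graphlib.CycleError if the graph is not acyclic.
    graph = {s.name: set(s.inputs) for s in steps}
    results: Dict[str, Any] = {}
    for name in TopologicalSorter(graph).static_order():
        step = by_name[name]
        results[name] = step.func(*(results[i] for i in step.inputs))
    return results


# Toy workflow (assumed example): load -> clean -> mean
workflow = [
    StepDescriptor("load", lambda: [3.0, 1.0, 2.0]),
    StepDescriptor("clean", lambda xs: sorted(xs), ["load"]),
    StepDescriptor("mean", lambda xs: sum(xs) / len(xs), ["clean"]),
]
results = run_workflow(workflow)
```

In this sketch the scientist only declares steps and their dependencies; ordering and execution are handled by the runner, which is one simple way to keep the underlying layer out of view.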


Scientific workflow, Intelligent, Scientific data, Universal framework



This work is supported by the National Key Research and Development Plan of China (Grant No. 2016YFB1000600 and 2016YFB1000601).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Qi Sun (1)
  • Yue Liu (1), email author
  • Wenjie Tian (1)
  • Yike Guo (1)
  • Bocheng Li (1)
  1. School of Computer Engineering and Science, Shanghai University, Shanghai, China
