Efficient Data Management for Putting Forward Data Centric Sciences

  • Genoveva Vargas-Solar
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 767)


The novel, multidisciplinary data centric scientific movement promises new and not yet imagined applications that rely on massive amounts of evolving data, which must be cleaned, integrated, and analysed for modelling, prediction, and critical decision making. This paper explores the key challenges and opportunities for data management in this new scientific context, and discusses how data management can best contribute to data centric science applications through clever data science strategies.



This work has been partially funded by the project MULTIPOINT, the cooperation contract Clouding Things and the COST EU Actions KEYSTONE.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG-LAFMIA, Grenoble, France
