Abstract
In this paper, we explore the challenges of automating experiments in data science. We propose an extensible experiment model as a foundation for integrating different open source tools for running research experiments. We implement our approach in a prototype open source MLDev software package and evaluate it in a series of experiments, with promising results. A comparison with other state-of-the-art tools demonstrates the novelty of our approach.
A Quality Requirements for Experiment Automation Software
This is a preliminary list of quality requirements for experiment automation and reproducibility software. The requirements are based on a series of in-depth interviews with data science researchers, heads of data science laboratories, academics, students, and software developers at MIPT, Innopolis University, and HSE.
Quality categories are given in accordance with the ISO/IEC 25010 quality model standard.
Functionality
- Ability to describe pipelines and the configuration of ML experiments.
- Run and reproduce experiments on demand and as part of a larger pipeline.
- Prepare reports on the experiments, including figures and papers.
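The first requirement can be illustrated with a small sketch: an experiment pipeline described as data, with stages executed in order. This is a hypothetical illustration in plain Python, not MLDev's actual configuration format; the stage names and structure are invented for the example.

```python
# Hypothetical sketch: an experiment pipeline described as data.
# Stage names and structure are illustrative, not MLDev's real format.
pipeline = [
    {"name": "prepare", "run": lambda data: sorted(data)},
    {"name": "train",   "run": lambda data: sum(data) / len(data)},
    {"name": "report",  "run": lambda result: f"mean={result:.1f}"},
]

def run_pipeline(stages, inputs):
    """Execute stages in order, feeding each stage the previous stage's output."""
    value = inputs
    executed = []
    for stage in stages:
        value = stage["run"](value)
        executed.append(stage["name"])
    return value, executed

result, executed = run_pipeline(pipeline, [3, 1, 2])
print(result)    # -> mean=2.0
print(executed)  # -> ['prepare', 'train', 'report']
```

Describing the pipeline declaratively, rather than hard-coding the call order, is what lets a tool rerun, reorder, or report on stages without changes to the user code.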
Usability
- Low entry barrier for data scientists who are Linux users.
- Gradual learning curve; running a first experiment should be easy.
- The technical and programming skills needed to use experiment automation tools should be lower than those needed to run experiments without them.
- Users should be able to quickly determine the source of errors.
Portability and Compatibility
- Support common ML platforms (including Google Colab in the cloud), operating systems (Ubuntu 16, 18, 20; macOS), and ML libraries (scikit-learn, pandas, PyTorch, TensorFlow, and others).
- Support experiments in Python and MATLAB.
- Run third-party ML tools via a command-line interface.
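The last requirement above, driving third-party tools through their command-line interface, can be sketched with Python's standard `subprocess` module. The invoked command here is a stand-in for any external ML tool (we simply call the Python interpreter to print a line):

```python
import subprocess
import sys

# Invoke an external tool through its command-line interface and capture
# its output; the "tool" here is the Python interpreter printing a line.
proc = subprocess.run(
    [sys.executable, "-c", "print('training finished')"],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError on a non-zero exit code
)
print(proc.stdout.strip())  # -> training finished
```

Capturing stdout/stderr and checking exit codes is the minimum needed to integrate a CLI-only tool into a larger automated pipeline.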
Maintainability
- Open project: everyone should be able to participate and contribute.
- Contributing to the project should not require understanding all of its internal workings.
- Provide backward compatibility for experiment definitions.
Security/Reliability
- Keep experiment data confidential unless the user requests otherwise (e.g., to publish results).
- Keep experiment data secure and safe over the long term.
Efficiency
- Overhead should be negligible compared with the user code, for both small and large experiments.
Satisfaction and Ease of Use
- Must be at least as rewarding and easy to use as Jupyter Notebook.
- The interface should be similar to other tools familiar to data scientists.
Freedom from Risk
- Using experiment automation software should not put the completion of a project or the publication of its results at risk.
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Khritankov, A., Pershin, N., Ukhov, N., Ukhov, A. (2022). MLDev: Data Science Experiment Automation and Reproducibility Software. In: Pozanenko, A., Stupnikov, S., Thalheim, B., Mendez, E., Kiselyova, N. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2021. Communications in Computer and Information Science, vol 1620. Springer, Cham. https://doi.org/10.1007/978-3-031-12285-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-12284-2
Online ISBN: 978-3-031-12285-9