Abstract
This chapter presents the incremental approach and agile principles as an alternative methodology for developing an analytical system in a light data lake (LDL) environment. The proposed LDL system-building procedure was evaluated in a case study of a European restaurant business operating in several European countries. The data analysis models produced in sprints proved reliable, and the approach as a whole proved effective. In addition, the approach gives companies considerable flexibility: they can develop each dimension of the system independently and with varying degrees of intensity, according to their needs and financial resources.
Notes
- 1. A good example of presenting numerous JN-based projects is the annual JupyterCon conference (see: https://conferences.oreilly.com/jupyter/jup-ny).
Acknowledgements
The project is financed by the Ministry of Science and Higher Education in Poland under the programme “Regional Initiative of Excellence” 2019–2022, project number 015/RID/2018/19, total funding amount 10,721,040.00 PLN.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Gryncewicz, W., Sitarska-Buba, M., Zygała, R. (2020). Agile Approach to Develop Data Lake Based Systems. In: Hernes, M., Rot, A., Jelonek, D. (eds) Towards Industry 4.0 — Current Challenges in Information Systems. Studies in Computational Intelligence, vol 887. Springer, Cham. https://doi.org/10.1007/978-3-030-40417-8_12
DOI: https://doi.org/10.1007/978-3-030-40417-8_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-40416-1
Online ISBN: 978-3-030-40417-8
eBook Packages: Intelligent Technologies and Robotics (R0)