Medical Big Data Warehouse: Architecture and System Design, a Case Study: Improving Healthcare Resources Distribution
- 571 Downloads
The huge increases in medical devices and clinical applications which generate enormous data have raised a big issue in managing, processing, and mining this massive amount of data. Indeed, traditional data warehousing frameworks can not be effective when managing the volume, variety, and velocity of current medical applications. As a result, several data warehouses face many issues over medical data and many challenges need to be addressed. New solutions have emerged and Hadoop is one of the best examples, it can be used to process these streams of medical data. However, without an efficient system design and architecture, these performances will not be significant and valuable for medical managers. In this paper, we provide a short review of the literature about research issues of traditional data warehouses and we present some important Hadoop-based data warehouses. In addition, a Hadoop-based architecture and a conceptual data model for designing medical Big Data warehouse are given. In our case study, we provide implementation detail of big data warehouse based on the proposed architecture and data model in the Apache Hadoop platform to ensure an optimal allocation of health resources.
KeywordsData warehouse Hadoop Big data Decision support Medical resources allocation
This work was partially supported by the Ministry of Higher Education and Scientific Research of Algeria and the University of Bejaia, under the project CNEPRU (Ref. B*00620140066/2015-2018).
Compliance with Ethical Standards
Conflict of Interest
Authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.
- 2.Cuzzocrea, A., Warehousing and Protecting Big Data: State-Of-The-Art-Analysis, Methodologies, Future Challenges. In Proceedings of the International Conference on Internet of things and Cloud Computing (p. 14). ACM, 2016. https://doi.org/10.1145/2896387.2900335
- 3.White, T., Hadoop: The definitive guide (third edition). O’Reilly, 2012. ISBN: 978-1-449-322252-0.Google Scholar
- 4.Sumathi, S., and Esakkirajan, S., Fundamentals of relational database management systems (Vol. 47). Springer, 2007. ISBN: 978 3 540 48397 7.Google Scholar
- 5.Ewen, E.F., Medsker, C.E., and Dusterhoft, L.E., Data warehousing in an integrated health system: building the business case. In Proceedings of the 1st ACM international workshop on Data warehousing and OLAP (pp. 47–53). ACM, 1998. https://doi.org/10.1145/294260.294271
- 6.Pedersen, T.B., and Jensen, C.S., Research issues in clinical data warehousing. In Scientific and Statistical Database Management. Proceedings. Tenth international conference on (pp. 43–52). IEEE, 1998. https://doi.org/10.1109/SSDM.1998.688110
- 7.Guérin, E., Moussouni, F., Courselaud, B., and Loréal, O., UML modeling of Gedaw: A gene expression data warehouse specialised in the liver. In The 3rd French bioinformatics conference proceeding: JOBIM 2002 (pp. 319–334), Saint-Malo, France, 2002.Google Scholar
- 8.Banek, M., Tjoa, A.M., and Stolba, N., Integrating different grain levels in a medical data warehouse federation. In International Conference on Data Warehousing and Knowledge Discovery (pp. 185–194). Springer Berlin Heidelberg, 2006. https://doi.org/10.1007/11823728_18
- 10.Pavalam, S.M., Jawahar, M., and Akorli, F.K., Data warehouse based Architecture for Electronic Health Records for Rwanda. In Education and Management Technology (ICEMT) International Conference on (pp. 253–255). IEEE, 2010. https://doi.org/10.1109/ICEMT.2010.5657660
- 12.Sebaa, A., Nouicer, A., Tari, A., Ramtani, T., and Ouhab, A., Decision support system for Health Care Resources allocation. Abstracts Book of ICHSMT’16- International Conference on Health Sciences and Medical Technologies; 2016 Sep 27-29; Tlemcen, Algeria. Mehr publishing. p. 8, 2016. ISBN: 978-600-96661-0-2.Google Scholar
- 13.Sebaa, A., Tari, A., Ramtani, T., and Ouhab, A., DW RHSB: A framework for optimal allocation of health resources. Int. J. Comput. Sci. Commun Inf. Technol. 2(1):12–17, 2015.Google Scholar
- 15.Cuzzocrea, A., Song, I.Y., and Davis, K.C., Analytics over large-scale multidimensional data: the big data revolution. In Proceedings of the ACM 14th international workshop on Data Warehousing and OLAP. pp. 101–104. ACM, 2011. https://doi.org/10.1145/2064676.2064695
- 16.Sebaa, A., Nouicer, N., Chikh, F., and Tari, A., Big Data Technologies to Improve Medical Data Warehousing. In Proceedings of 2nd international conference on Big Data, Cloud and Applications. ACM, 2017. https://doi.org/10.1145/3090354.3090376
- 21.Rodger, J.A., Discovery of medical big data analytics: Improving the prediction of traumatic brain injury survival rates by data mining patient informatics processing software hybrid Hadoop hive. Informatics in Medicine Unlocked. 1:17–26, 2015. https://doi.org/10.1016/j.imu.2016.01.002.CrossRefGoogle Scholar
- 23.Raja, P.V., and Sivasankar, E., Modern Framework for Distributed Healthcare Data Analytics Based on Hadoop. In Information and Communication Technology-EurAsia Conference (pp. 348–355). Springer Berlin Heidelberg, 2014. https://doi.org/10.1007/978-3-642-55032-4_34
- 25.Sebaa, A., Chick, F., Nouicer, A., and Tari, A., Research in big data warehousing using Hadoop. J. Inform. Syst. Eng. Manag. 2(2), 2017. https://doi.org/10.20897/jisem.201710.
- 27.Wu, S., Li, F., Mehrotra, S., and Ooi, B.C., Query optimization for massively parallel data processing. In Proceedings of the 2nd ACM Symposium on Cloud Computing (p. 12). ACM, 2011. https://doi.org/10.1145/2038916.2038928
- 28.Apache Hadoop: http://hadoop.apache.org/, Viewed in 02/2015.
- 30.Apache Hive: https://hive.apache.org/, Viewed in 02/2015.
- 31.Liu, X., Thomsen, C., and Pedersen, T.B., ETLMR: a highly scalable dimensional ETL framework based on mapreduce. In Transactions on Large-Scale Data-and Knowledge-Centered Systems VIII (pp. 1–31). Springer Berlin Heidelberg, 2013. https://doi.org/10.1007/978-3-642-37574-3_1
- 32.Gao, S., Li, L., Li, W., Janowicz, K., and Zhang, Y., Constructing gazetteers from volunteered big geo-data based on Hadoop. Comput. Environ. Urban. Syst. 61:172–186, 2017. https://doi.org/10.1016/j.compenvurbsys.2014.02.004.Google Scholar
- 35.ANDI: National Agency for Investment Development of Algeria, http://www.andi.dz/index.php/en/secteur-de-sante, Viewed in 02/2015.