A Novel Approach to Handle Huge Data for Refreshment Anomalies in Near Real-Time ETL Applications

  • N. Mohammed Muddasir
  • K. Raghuveer
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1154)

Abstract

Real-time analysis of data is the new trend for obtaining useful insights with very little time spent on data preprocessing. Analyzing data requires moving it from various heterogeneous or homogeneous sources to a common place known as the data warehouse. The data sources for a data warehouse are transaction processing systems. Data is moved from the transactional database to the data warehouse by the extract, transform, and load (ETL) process. Previously, ETL was run during off-peak hours, such as a nightly load or on weekends. Real-time analysis requires ETL to be fast and not wait for off-peak hours. This leads to the concept of near real-time ETL, in which techniques are employed to identify potentially changed data in the transactional database and move it to the analysis database with minimal delay. Moving data in real time from multiple sources in incremental form can lead to anomalies in the data warehouse. This work discusses the various causes of these anomalies and solutions to overcome them. Our main contribution is loading data into temporary tables to reduce query execution time when overcoming refreshment anomalies.
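
As a rough illustration of the general idea of staging changed rows in a temporary table before applying them to the warehouse, the following minimal sketch uses Python's built-in sqlite3 module. It is not the authors' implementation and does not handle the multi-source anomalies themselves. The table names (src_sales, dw_sales, delta_sales), the updated_at column, and the watermark-based change detection are all assumptions made for this example.

```python
import sqlite3


def refresh_incrementally(conn, watermark):
    """Stage rows changed after `watermark` in a temp table, then apply them.

    Hypothetical schema: src_sales and dw_sales both have
    (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER).
    Returns the new watermark.
    """
    cur = conn.cursor()
    # Temporary staging table (assumed name); emptied on every refresh.
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS delta_sales "
        "(id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
    )
    cur.execute("DELETE FROM delta_sales")
    # Extract only the rows changed since the last refresh.
    cur.execute(
        "INSERT INTO delta_sales "
        "SELECT id, amount, updated_at FROM src_sales WHERE updated_at > ?",
        (watermark,),
    )
    # Apply the staged delta; later steps read the small temp table
    # instead of rescanning the source.
    cur.execute(
        "INSERT OR REPLACE INTO dw_sales (id, amount, updated_at) "
        "SELECT id, amount, updated_at FROM delta_sales"
    )
    # Advance the watermark; keep the old one if nothing changed.
    new_watermark = cur.execute(
        "SELECT COALESCE(MAX(updated_at), ?) FROM delta_sales", (watermark,)
    ).fetchone()[0]
    conn.commit()
    return new_watermark


def make_demo_connection():
    """Build an in-memory database with the assumed source and warehouse tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE src_sales "
        "(id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
    )
    conn.execute(
        "CREATE TABLE dw_sales "
        "(id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
    )
    conn.execute("INSERT INTO src_sales VALUES (1, 10.0, 100)")
    conn.execute("INSERT INTO src_sales VALUES (2, 20.0, 105)")
    conn.commit()
    return conn
```

Calling refresh_incrementally(conn, 0) on the demo connection copies both rows; a later call with the returned watermark moves only the rows updated since then. A production pipeline would typically rely on log-based or trigger-based change capture instead of a timestamp column.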

Keywords

Near real-time ETL, Refreshment anomalies, TPC-DS

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Department of IS&E, VVCE, NIE, Mysuru, India
  2. Department of IS&E, NIE, Mysuru, India
