Data Warehousing in Cloud Environments

  • Living reference work entry
Encyclopedia of Database Systems

Recommended Reading

  1. Abadi DJ. Data management in the cloud: limitations and opportunities. IEEE Data Eng Bull. 2009;32(1):3–12.

  2. Abouzeid A, Bajda-Pawlikowski K, Abadi D, Silberschatz A, Rasin A. HadoopDB: an architectural hybrid of MapReduce and DBMS technologies for analytical workloads. PVLDB 2009;2(1):922–933. doi:10.14778/1687627.1687731.

  3. Agarwal S, Mozafari B, Panda A, Milner H, Madden S, Stoica I. BlinkDB: queries with bounded errors and bounded response times on very large data. EuroSys 2013. doi:10.1145/2465351.2465355.

  4. Armbrust M, Xin RS, Lian C, et al. Spark SQL: relational data processing in Spark. SIGMOD 2015. doi:10.1145/2723372.2742797.

  5. Chan L. Presto: interacting with petabytes of data at Facebook. 2016. https://www.facebook.com/notes/facebook-engineering/presto-interacting-with-petabytes-of-data-at-facebook/10151786197628920. Accessed 28 Jun 2016.

  6. Dean J, Ghemawat S. MapReduce: a flexible data processing tool. CACM 2010;53(1):72–77. doi:10.1145/1629175.1629198.

  7. Gupta A, Agarwal D, Tan D, et al. Amazon Redshift and the case for simpler data warehouses. SIGMOD 2015. doi:10.1145/2723372.2742795.

  8. Liu X, Thomsen C, Pedersen TB. ETLMR: a highly scalable dimensional ETL framework based on MapReduce. DaWaK 2011. doi:10.1007/978-3-642-23544-3_8.

  9. Liu X, Thomsen C, Pedersen TB. CloudETL: scalable dimensional ETL for Hive. IDEAS 2014. doi:10.1145/2628194.2628249.

  10. Olston C, Reed B, Srivastava U, Kumar R, Tomkins A. Pig Latin: a not-so-foreign language for data processing. SIGMOD 2008. doi:10.1145/1376616.1376726.

  11. Özcan F, Hoa D, Beyer KS, Balmin A, Liu CJ, Li Y. Emerging trends in the enterprise analytics: connecting Hadoop and DB2 warehouse. SIGMOD 2011. doi:10.1145/1989323.1989446.

  12. Pavlo A, Paulson E, Rasin A, Abadi DJ, DeWitt DJ, Madden S, Stonebraker M. A comparison of approaches to large-scale data processing. SIGMOD 2009. doi:10.1145/1559845.1559865.

  13. Pike R, Dorward S, Griesemer R, Quinlan S. Interpreting the data: parallel analysis with Sawzall. Sci Program. 2005;13(4):277–298.

  14. Stonebraker M, Abadi D, DeWitt DJ, Madden S, Paulson E, Pavlo A, Rasin A. MapReduce and parallel DBMSs: friends or foes? CACM 2010;53(1):64–71. doi:10.1145/1629175.1629197.

  15. Thusoo A, Sarma JS, Jain N, Shao Z, Chakka P, Anthony S, et al. Hive – a warehousing solution over a Map-Reduce framework. VLDB 2009. doi:10.14778/1687553.1687609.

  16. Xin R, Rosen J, Zaharia M, Franklin MJ, Shenker S, Stoica I. Shark: SQL and rich analytics at scale. SIGMOD 2013. doi:10.1145/2463676.2465288.

  17. Zaharia M, Chowdhury M, Das T, et al. Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. NSDI 2012.

Author information

Correspondence to Christian Thomsen.

Copyright information

© 2017 Springer Science+Business Media LLC

About this entry

Cite this entry

Thomsen, C., Pedersen, T.B. (2017). Data Warehousing in Cloud Environments. In: Liu, L., Özsu, M. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7993-3_80623-1

  • DOI: https://doi.org/10.1007/978-1-4899-7993-3_80623-1

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4899-7993-3

  • Online ISBN: 978-1-4899-7993-3

  • eBook Packages: Springer Reference Computer Sciences, Reference Module Computer Science and Engineering
