Big Data Warehouses for Smart Industries
Definition of Terms
Big Data Warehouse (BDW). A BDW can be defined as a scalable, highly performant, and flexible storage and processing system, capable of dealing with the ever-increasing volume, variety, and velocity of data, i.e., Big Data, while lowering the costs of traditional Data Warehousing architectures through the use of commodity hardware. Big Data imposes severe difficulties for traditional data storage and processing technologies, and the BDW aims to overcome these challenges and support near real-time descriptive and predictive Big Data Analytics over huge amounts of heterogeneous data (Krishnan 2013; Russom 2016; Costa et al. 2017; Santos et al. 2017).
Smart Industry. A Smart Industry can be seen as an organization chain from any industrial sector (e.g., manufacturing, services) with high digitalization levels, which supports the replication of the physical world into a virtual world, through an environment that is highly connected...
This entry has been supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT (Fundação para a Ciência e Tecnologia) within the Project Scope: UID/CEC/00319/2013 and the Doctoral scholarship (PD/BDE/135101/2017) and by European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project n° 002814; Funding Reference: POCI-01-0247-FEDER-002814].
- Apache Hive (2017) Apache Hive documentation. Apache Software Foundation. https://cwiki.apache.org/confluence/display/Hive/Home. Accessed 12 May 2017
- Costa C, Santos MY (2017a) The SusCity Big Data Warehousing approach for smart cities. In: Proceedings of international database engineering & applications symposium, p 10
- Costa E, Costa C, Santos MY (2017) Efficient big data modelling and organization for Hadoop Hive-based data warehouses. Coimbra, Portugal
- Hermann M, Pentek T, Otto B (2016) Design principles for Industrie 4.0 scenarios. In: 2016 49th Hawaii International Conference on System Sciences (HICSS), pp 3928–3937
- Kagermann H, Wahlster W, Helbig J (2013) Recommendations for implementing the strategic initiative INDUSTRIE 4.0. National Academy of Science and Engineering, München
- Kimball R, Ross M (2013) The data warehouse toolkit: the definitive guide to dimensional modeling, 3rd edn. Wiley, Indianapolis
- Krishnan K (2013) Data warehousing in the age of big data, 1st edn. Morgan Kaufmann Publishers, San Francisco
- Lipcon T, Alves D, Burkert D, et al (2015) Kudu: storage for fast analytics on fast data. Cloudera. Unpublished paper from the KUDU team. http://getkudu.io/kudu.pdf
- Mackey G, Sehrish S, Wang J (2009) Improving metadata management for small files in HDFS. In: 2009 IEEE international conference on cluster computing and workshops, pp 1–4
- Manyika J, Chui M, Brown B, et al (2011) Big data: the next frontier for innovation, competition, and productivity. McKinsey Global Institute
- Marz N, Warren J (2015) Big data: principles and best practices of scalable realtime data systems. Manning Publications Co, Shelter Island
- NBD-PWG (2015) NIST big data interoperability framework: volume 6, reference architecture. National Institute of Standards and Technology, Gaithersburg
- Russom P (2016) Data warehouse modernization in the age of big data analytics. The Data Warehouse Institute, Renton
- Santos MY, Costa C, Galvão J, et al (2017) Evaluating SQL-on-Hadoop for big data warehousing on not-so-good hardware. In: Proceedings of international database engineering & applications symposium (IDEAS’17), Bristol
- Vale Lima F (2017) Big data warehousing em tempo real: Da Recolha ao Processamento de Dados. University of Minho, Guimarães
- Villars RL, Olofson CW, Eastwood M (2011) Big data: what it is and why you should care. IDC, Framingham