A Multi-driven Approach to Improve Data Analytics for Multi-value Dimensions

  • Gabriel Pestana
  • Pedro Catelas
  • Isabel Rosa
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 354)

Abstract

A data warehouse is a data store whose purpose is to produce accurate and useful information, so that business stakeholders can conduct data analysis that supports decision-making processes and improves information resources. The data warehouse provides a single, detailed view of the organization and is intended to be exploited by means of OLAP (On-line Analytical Processing) tools. These tools facilitate information analysis and navigation through business data based on the multidimensional paradigm. A crucial decision in designing multidimensional models concerns the grain of facts, which is determined by fact–dimension relationships. This means that the accuracy of the information can depend on how the data model is structured to support multi-value dimensions and avoid double counting. The paper presents a technique to overcome these constraints, enabling designers to abstract complexity at a conceptual level without having to take into account more complex schema structures (such as bridge tables) to deal with non-strict fact–dimension relationships at different granularities. The technique is demonstrated using the Pentaho tool and lessons learned from our case study, an information system to monitor the execution of public works contracts.
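
The double-counting problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the technique proposed in the paper: it only shows how a non-strict (many-to-many) fact–dimension relationship inflates totals, and how the conventional bridge-table weighting factor corrects them. The contract identifiers, contractor names, amounts, and weights below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: one row per public works contract with its amount.
contracts = {"C1": 100.0, "C2": 50.0}

# Hypothetical multi-value dimension, stored bridge-table style: one contract
# can involve several contractors. Each link carries an allocation weight,
# and the weights for a single contract sum to 1.
contract_contractors = [
    ("C1", "Alpha", 0.5),
    ("C1", "Beta", 0.5),
    ("C2", "Alpha", 1.0),
]


def totals_by_contractor(weighted):
    """Aggregate contract amounts per contractor.

    With weighted=False every linked contractor receives the full amount,
    which double-counts contracts that have more than one contractor.
    With weighted=True the amount is split by the allocation weight.
    """
    totals = defaultdict(float)
    for contract_id, contractor, weight in contract_contractors:
        amount = contracts[contract_id]
        totals[contractor] += amount * weight if weighted else amount
    return dict(totals)


naive = totals_by_contractor(weighted=False)
allocated = totals_by_contractor(weighted=True)
true_total = sum(contracts.values())

print("true total:     ", true_total)
print("naive total:    ", sum(naive.values()))
print("allocated total:", sum(allocated.values()))
```

In this sketch the naive roll-up across contractors exceeds the true total because contract C1 is counted once per contractor, while the weighted roll-up matches it. Choosing and maintaining such weights is part of the schema complexity that the abstract says designers would prefer to abstract away.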

Keywords

Multidimensional Schema Design; Requirements Analysis; Multi-Value Dimensions

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Universidade Europeia, Lisbon, Portugal
  2. INOV INESC Inovação - Instituto de Novas Tecnologias, Lisbon, Portugal
  3. Instituto da Construção e do Imobiliário, I.P. (InCI), Lisbon, Portugal
