A Multi-driven Approach to Improve Data Analytics for Multi-value Dimensions

  • Conference paper
New Contributions in Information Systems and Technologies

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 354))

Abstract

A data warehouse is a data store intended to produce accurate and useful information, supporting business stakeholders in conducting data analysis that informs decision-making and improves information resources. The data warehouse provides a single, detailed view of the organization and is intended to be exploited by means of OLAP (On-line Analytical Processing) tools. These tools facilitate information analysis and navigation through business data based on the multidimensional paradigm. A crucial decision in designing multidimensional models concerns the grain of facts, which is determined by fact–dimension relationships. This means that the accuracy of the information can depend on how the data model is structured to support multi-value dimensions and avoid double counting. The paper presents a technique to overcome these constraints, enabling designers to abstract complexity at a conceptual level without having to take into account more complex schema structures (such as bridge tables) to deal with non-strict fact–dimension relationships at different granularities. The technique is demonstrated using the Pentaho tool and lessons learned from our case study, an information system to monitor the execution of public works contracts.
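
To make the double-counting problem concrete, the following minimal Python sketch shows how a measure is inflated when a fact is joined to a multi-value dimension, and how an equal-weight allocation (one common bridge-table style remedy) restores the correct total. It is an illustration only and is not the technique presented in the paper; the contract values, contractor names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical facts: one row per public works contract, with its value.
contracts = {"C1": 100.0, "C2": 50.0}

# Hypothetical many-to-many link: contract C1 has two contractors.
contract_contractors = [("C1", "Alpha"), ("C1", "Beta"), ("C2", "Alpha")]


def naive_total(facts, links):
    """Sum the measure after joining facts to the multi-value dimension.

    Each fact is repeated once per linked member, so C1 is counted twice.
    """
    return sum(facts[fact_id] for fact_id, _ in links)


def weighted_by_member(facts, links):
    """Split each fact equally across its linked dimension members.

    Equal weights are an assumption made for this sketch; a real model
    may store explicit weighting factors instead.
    """
    members_per_fact = defaultdict(int)
    for fact_id, _ in links:
        members_per_fact[fact_id] += 1

    totals = defaultdict(float)
    for fact_id, member in links:
        totals[member] += facts[fact_id] / members_per_fact[fact_id]
    return dict(totals)


true_total = sum(contracts.values())
naive = naive_total(contracts, contract_contractors)
by_member = weighted_by_member(contracts, contract_contractors)
```

Here the naive join yields a total larger than the sum of the contract values, while the weighted per-contractor totals add back up to it. The cost of the weighted approach is the extra schema structure and the need to choose weights, which is the kind of complexity the abstract says the proposed technique lets designers abstract away at the conceptual level.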


Author information

Corresponding author

Correspondence to Gabriel Pestana.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Pestana, G., Catelas, P., Rosa, I. (2015). A Multi-driven Approach to Improve Data Analytics for Multi-value Dimensions. In: Rocha, A., Correia, A., Costanzo, S., Reis, L. (eds) New Contributions in Information Systems and Technologies. Advances in Intelligent Systems and Computing, vol 354. Springer, Cham. https://doi.org/10.1007/978-3-319-16528-8_12

  • DOI: https://doi.org/10.1007/978-3-319-16528-8_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16527-1

  • Online ISBN: 978-3-319-16528-8

  • eBook Packages: Engineering, Engineering (R0)
