Quality Assessment of Data Using Statistical and Machine Learning Methods

  • Conference paper
Computational Intelligence in Data Mining - Volume 2

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 32)

Abstract

Data warehouses are used in organizations to manage information efficiently. Data from various heterogeneous sources are integrated in the data warehouse to support analysis and decision making. Data warehouse quality is very important because the warehouse is the main tool for strategic decisions. Data warehouse quality is influenced by data model quality, which is in turn influenced by the conceptual data model. In this paper, we first summarize the set of metrics for measuring the understandability of conceptual data models for data warehouses. Statistical and machine learning methods are then used to predict the effect of structural metrics on the understandability, efficiency and effectiveness of the data warehouse multidimensional (MD) conceptual model.

Corresponding author

Correspondence to Prerna Singh.

Copyright information

© 2015 Springer India

About this paper

Cite this paper

Singh, P., Suri, B. (2015). Quality Assessment of Data Using Statistical and Machine Learning Methods. In: Jain, L., Behera, H., Mandal, J., Mohapatra, D. (eds) Computational Intelligence in Data Mining - Volume 2. Smart Innovation, Systems and Technologies, vol 32. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2208-8_10

  • DOI: https://doi.org/10.1007/978-81-322-2208-8_10

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2207-1

  • Online ISBN: 978-81-322-2208-8

  • eBook Packages: Engineering, Engineering (R0)
