Approaching ETL Processes Specification Using a Pattern-Based Ontology

  • Bruno Oliveira
  • Orlando Belo
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 737)

Abstract

Software projects are often developed by composing reusable components to create new products and components. These pre-configured components are sometimes based on well-known, validated design patterns that describe abstract solutions to recurring problems. The data warehouse ETL development life cycle shares its main steps with the typical phases of any software development process. Given that patterns have been broadly used in many software areas to increase reliability, reduce development risks and enhance standards compliance, a pattern-oriented approach to the development of ETL systems can be achieved, providing a more flexible approach to ETL implementation. Using an ontology specification, in this paper we present and discuss contextual data for describing ETL patterns based on their structural properties. The ontology allows ETL patterns to be interpreted by a computer and subsequently used to govern their instantiation into physical models that can be executed using existing commercial tools.

Keywords

Data warehousing systems; ETL conceptual modelling; ETL patterns; Domain specific language and ontologies

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. CIICESI, School of Management and Technology, Porto Polytechnic, Felgueiras, Portugal
  2. ALGORITMI Centre, University of Minho, Braga, Portugal