Big Data—Integration and Cleansing Environment for Business Analytics with DICE

Chapter in: Domain-Specific Conceptual Modeling

Abstract

The paper presents the Data Integration and Cleansing Environment (DICE). Its embedded modeling method supports the data understanding and data preparation phases of business analytics projects and, subsequently, decision-making in business process activities. A prototypical implementation is demonstrated with a campaign management example that combines traditional customer data with (big) data on customer sentiment from microblogging platforms.

Author information

Corresponding author

Correspondence to Wilfried Grossmann.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Grossmann, W., Moser, C. (2016). Big Data—Integration and Cleansing Environment for Business Analytics with DICE. In: Karagiannis, D., Mayr, H., Mylopoulos, J. (eds) Domain-Specific Conceptual Modeling. Springer, Cham. https://doi.org/10.1007/978-3-319-39417-6_5

  • DOI: https://doi.org/10.1007/978-3-319-39417-6_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-39416-9

  • Online ISBN: 978-3-319-39417-6

  • eBook Packages: Computer Science (R0)
