Abstract
Healthcare data, like any data, can suffer from many kinds of quality problems. In this chapter, we identify 27 data quality issues that may compromise the validity of process mining results. These include missing data, incorrect data, imprecise data, and irrelevant data. For example, an event may only have a date (e.g., 15-6-2015) and not a fine-grained timestamp. As a result, the order of events recorded on the same day is unknown, which complicates analysis. Practitioners were interviewed to estimate how frequently each of the 27 types of data quality issues occurs. This provides insight into the typical problems that may arise in data-science projects in hospitals. The quality of the analysis results depends directly on the quality of the input data (i.e., garbage in, garbage out). Therefore, the chapter also discusses 12 guidelines for logging. These guidelines should be used when developing the next generation of hospital information systems. Improved event logs will enable more advanced forms of process mining related to prediction and recommendation.
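The date-only example can be made concrete with a small sketch. The event log, case identifiers, activity names, and the `ambiguous_groups` helper below are all hypothetical and serve only as an illustration; they are not taken from the chapter. The code groups events by case and day and reports the groups that contain more than one event, since the relative order of those events cannot be recovered from the recorded date alone.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (case id, activity, date).
# Only the day is recorded, not the time of day.
events = [
    ("case-1", "Blood test", date(2015, 6, 15)),
    ("case-1", "Consultation", date(2015, 6, 15)),
    ("case-1", "X-ray", date(2015, 6, 16)),
    ("case-2", "Consultation", date(2015, 6, 15)),
]


def ambiguous_groups(log):
    """Return {(case, day): [activities]} for groups whose internal order is unknown."""
    groups = defaultdict(list)
    for case_id, activity, day in log:
        groups[(case_id, day)].append(activity)
    # A group with a single event has no ordering problem.
    return {key: acts for key, acts in groups.items() if len(acts) > 1}


ambiguous = ambiguous_groups(events)
for (case_id, day), acts in ambiguous.items():
    print(case_id, day.isoformat(), acts)
```

In this sketch, only the two events of `case-1` on 15 June 2015 are flagged: either could have happened first, so any process model discovered from such a log would have to treat their order as unknown.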
Copyright information
© 2015 The Author(s)
About this chapter
Cite this chapter
Mans, R.S., van der Aalst, W.M.P., Vanwersch, R.J.B. (2015). Data Quality Issues. In: Process Mining in Healthcare. SpringerBriefs in Business Process Management. Springer, Cham. https://doi.org/10.1007/978-3-319-16071-9_6
DOI: https://doi.org/10.1007/978-3-319-16071-9_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16070-2
Online ISBN: 978-3-319-16071-9
eBook Packages: Computer Science (R0)