A Similarity Measure Based on Care Trajectories as Sequences of Sets
Comparing care trajectories helps improve health services. Medico-administrative databases make it possible to reconstruct patients' care histories automatically. Care trajectories can be compared by determining their overlapping parts. This comparison relies on both a semantically rich representation formalism for care trajectories and an adequate similarity measure. The longest common subsequence (LCS) approach would be appropriate if simple sequences were expressive enough to represent complex care trajectories. Furthermore, by failing to take into account similarities between different but semantically close medical events, the LCS overestimates differences. We propose a generalization of the LCS to a more expressive representation of care trajectories as sequences of sets. A set represents a medical episode composed of one or several medical events, such as diagnoses, drug prescriptions or medical procedures. Moreover, we propose to take the semantic similarity of events into account when comparing medical episodes. To assess our approach, we applied the method to a sample of care trajectories from patients who underwent one of three kinds of surgical procedure. The sequences-of-sets formalism reduced computation time, and introducing semantic similarity made the three groups more homogeneous.
Keywords: Care trajectories · LCS-based similarity · Semantic similarity
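The idea of extending the LCS from simple sequences to sequences of sets can be illustrated with a minimal sketch. It is not the measure defined in the paper: it assumes that two aligned episodes score the number of events they share exactly, it normalizes by the larger trajectory size, and it leaves out semantic similarity between events. The names `set_lcs_length` and `similarity` and the event codes are hypothetical.

```python
from typing import FrozenSet, Sequence

# An episode is a set of event codes; a trajectory is an ordered list of episodes.
Episode = FrozenSet[str]


def set_lcs_length(a: Sequence[Episode], b: Sequence[Episode]) -> int:
    """Largest number of events shared by an order-preserving alignment of episodes.

    Assumption: aligning two episodes scores the size of their intersection.
    With singleton episodes this reduces to the classic LCS length.
    """
    rows, cols = len(a), len(b)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            shared = len(a[i - 1] & b[j - 1])
            table[i][j] = max(
                table[i - 1][j],               # skip episode i of a
                table[i][j - 1],               # skip episode j of b
                table[i - 1][j - 1] + shared,  # align the two episodes
            )
    return table[rows][cols]


def similarity(a: Sequence[Episode], b: Sequence[Episode]) -> float:
    """Shared events divided by the size of the larger trajectory (assumed normalization)."""
    largest = max(sum(len(e) for e in a), sum(len(e) for e in b))
    return 1.0 if largest == 0 else set_lcs_length(a, b) / largest


# Hypothetical trajectories with made-up event codes.
traj_a = [frozenset({"diag:D1"}), frozenset({"drug:M1", "proc:P1"}), frozenset({"diag:D2"})]
traj_b = [frozenset({"diag:D1", "drug:M1"}), frozenset({"proc:P1"}), frozenset({"diag:D2"})]
print(set_lcs_length(traj_a, traj_b), similarity(traj_a, traj_b))
```

In this sketch each episode is aligned with at most one episode of the other trajectory, so grouping events into sets shortens the sequences that the dynamic program has to traverse.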
- 7. Studer, M., Ritschard, G.: A comparative review of sequence dissimilarity measures. LIVES Work. Pap. 2014, 1–47 (2014)
- 10. Wu, Z., Palmer, M.: Verbs semantics and lexical selection. In: Proceedings of the 32nd Annual Meeting on Association for Computational Linguistics, 27 June 1994