Analysis of medications change in Parkinson’s disease progression data

Journal of Intelligent Information Systems

A Correction to this article was published on 22 June 2018

This article has been updated

Abstract

Parkinson’s disease is a neurodegenerative disorder that affects people worldwide. Careful management of a patient’s condition is crucial to ensure the patient’s independence and quality of life. This is achieved by personalized treatment based on the individual patient’s symptoms and medical history. The aim of this study is to determine patient groups with similar disease progression patterns, coupled with patterns of medication change that lead to the improvement or decline of patients’ quality of life symptoms. To this end, this paper proposes a new methodology for clustering short time series of patients’ symptoms and prescribed medications data, and for analyzing time sequence data using skip-grams to monitor disease progression. The results demonstrate that motor and autonomic symptoms are the most informative for evaluating the quality of life of Parkinson’s disease patients. We show that Parkinson’s disease patients can be divided into clusters ordered in accordance with the severity of their symptoms. By following the evolution of symptoms for each patient separately, we were able to determine patterns of medication change that can lead to the improvement or worsening of the patients’ quality of life.


Change history

  • 22 June 2018

    The original version of this article unfortunately contained a mistake. Figure 4 and Figure 5 in Section 5.4 were mistakenly switched, while the captions of Figure 4 and Figure 5 are correct and correspond to the references in the text. The corrected figures are shown on the next page.

Notes

  1. Appendix B presents the clustering quality results on the data set obtained by feature selection.

  2. The code is available upon request. Please note that we do not have permission to share the data. Users can obtain permission from the Parkinson’s Progression Markers Initiative (PPMI): http://www.ppmi-info.org/

  3. Note that in Table 9 we present the Adjusted Rand Index values, which compare the cluster similarity between the three best performing bi-view clustering settings.

References

  • Agrawal, R., Imieliński, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. In ACM sigmod record (Vol. 22, pp. 207–216). ACM.

  • Appice, A. (2017). Towards mining the organizational structure of a dynamic event scenario. Journal of Intelligent Information Systems, 1–29.

  • Appice, A., & Malerba, D. (2016). A co-training strategy for multiple view clustering in process mining. IEEE Trans Services Computing, 9(6), 832–845.


  • Arbelaitz, O., Gurrutxaga, I., Muguerza, J., Pérez, J.M., & Perona, I. (2013). An extensive comparative study of cluster validity indices. Pattern Recognition, 46(1), 243–256.


  • Bence, J.R. (1995). Analysis of short time series: correcting for autocorrelation. Ecology, 76(2), 628–639.


  • Bezdek, J.C. (1981). Pattern recognition with fuzzy objective function algorithms. New York: Plenum Press.


  • Blum, A., & Mitchell, T. (1998). Combining labeled and unlabeled data with co-training. In Proceedings of the eleventh annual conference on computational learning theory (COLT ’98) (pp. 92–100). New York, NY, USA: ACM.

  • Broder, A.Z., Glassman, S.C., Manasse, M.S., & Zweig, G. (1997). Syntactic clustering of the web. Computer Networks and ISDN Systems, 29(8-13), 1157–1166.


  • Cai, X., Nie, F., & Huang, H. (2013). Multi-view k-means clustering on big data. In Proceedings of the 23rd international joint conference on artificial intelligence IJCAI 2013, Beijing, China, August 3-9, 2013 (pp. 2598–2604).

  • Caliński, T., & Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics-theory and Methods, 3(1), 1–27.


  • Ceravolo, R., Rossi, C., Kiferle, L., & Bonuccelli, U. (2010). Nonmotor symptoms in Parkinson’s disease: the dark side of the moon. Future Neurology, 5(6), 851–871.


  • Chaudhuri, K., Kakade, S.M., Livescu, K., & Sridharan, K. (2009). Multi-view clustering via canonical correlation analysis. In Proceedings of the 26th annual international conference on machine learning, ICML 2009 (pp. 129–136).

  • Choi, E., Schuetz, A., Stewart, W.F., & Sun, J. (2017). Using recurrent neural network models for early detection of heart failure onset. Journal of the American Medical Informatics Association, 24(2), 361–370.


  • Cleuziou, G., Exbrayat, M., Martin, L., & Sublemontier, J. (2009). CoFKM: a centralized method for multiple-view clustering. In Proceedings of the ninth IEEE international conference on data mining (ICDM 2009), Miami, Florida, USA, 6-9 December 2009 (pp. 752–757).

  • Dalrymple-Alford, J., MacAskill, M., Nakas, C., Livingston, L., Graham, C., Crucian, G., Melzer, T., Kirwan, J., Keenan, R., Wells, S., & et al. (2010). The MoCA: well-suited screen for cognitive impairment in Parkinson disease. Neurology, 75(19), 1717–1725.


  • Davies, D.L., & Bouldin, D.W. (1979). A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1(2), 224–227.


  • De Alba, E., Mendoza, M., et al. (2007). Bayesian forecasting methods for short time series. The International Journal of Applied Forecasting, 8, 41–44.


  • De Vine, L., Zuccon, G., Koopman, B., Sitbon, L., & Bruza, P. (2014). Medical semantic similarity with a neural language model. In Proceedings of the 23rd ACM international conference on conference on information and knowledge management (pp. 1819–1822). ACM.

  • Dorsey, E., Constantinescu, R., Thompson, J., Biglan, K., Holloway, R., Kieburtz, K., Marshall, F., Ravina, B., Schifitto, G., Siderowf, A., & et al. (2007). Projected number of people with Parkinson disease in the most populous nations, 2005 through 2030. Neurology, 68(5), 384–386.


  • Ernst, J., & Bar-Joseph, Z. (2006). STEM: a tool for the analysis of short time series gene expression data. BMC Bioinformatics, 7(1), 191.


  • Ernst, J., Nau, G.J., & Bar-Joseph, Z. (2005). Clustering short time series gene expression data. Bioinformatics, 21(suppl 1), i159–i168.


  • Ester, M., Kriegel, H.P., Sander, J., Xu, X., & et al. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD (Vol. 96, pp. 226–231).

  • European Parkinson’s Disease Association. (2016). http://www.epda.eu.com/, accessed: 2016/07/01.

  • Foltynie, T., Brayne, C., & Barker, R.A. (2002). The heterogeneity of idiopathic Parkinson’s disease. Journal of Neurology, 249(2), 138–145.


  • Gamberger, D., & Lavrač, N. (2002). Expert-guided subgroup discovery: methodology and application. Journal of Artificial Intelligence Research, 17, 501–527.


  • Gatsios, D., Rigas, G., Miljkovic, D., Seljak, B.K., & Bohanec, M. (2016). m-health platform for Parkinson’s disease management. In Proceedings of 18th international conference on biomedicine and health informatics CBHI.

  • Gil, D., & Johnson, M. (2009). Diagnosing Parkinson by using artificial neural networks and support vector machines. Global Journal of Computer Science and Technology, 9(4), 63–71.


  • Goetz, C.G., Tilley, B.C., Shaftman, S.R., Stebbins, G.T., Fahn, S., Martinez-Martin, P., Poewe, W., Sampaio, C., Stern, M.B., Dodel, R., & et al. (2008). Movement Disorder Society-sponsored Revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS): scale presentation and clinimetric testing results. Movement Disorders, 23(15), 2129–2170.


  • Goetz, C.G., Luo, S., Wang, L., Tilley, B.C., LaPelle, N.R., & Stebbins, G.T. (2015). Handling missing values in the MDS-UPDRS. Movement Disorders, 30(12), 1632–1638.


  • Greene, D., Doyle, D., & Cunningham, P. (2010). Tracking the evolution of communities in dynamic social networks. In Advances in social networks analysis and mining (ASONAM), 2010 (pp. 176–183). IEEE.

  • Guthrie, D., Allison, B., Liu, W., Guthrie, L., & Wilks, Y. (2006). A closer look at skip-gram modelling. In Proceedings of the 5th international conference on language resources and evaluation (LREC-2006) (pp. 1–4).

  • He, X., Kan, M.Y., Xie, P., & Chen, X. (2014). Comment-based multi-view clustering of Web 2.0 items. In Proceedings of the 23rd international conference on World Wide Web (pp. 771–782): ACM.

  • Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218.


  • Hughes, A.J., Daniel, S.E., Kilford, L., & Lees, A.J. (1992). Accuracy of clinical diagnosis of idiopathic Parkinson’s disease: a clinico-pathological study of 100 cases. Journal of Neurology Neurosurgery & Psychiatry, 55(3), 181–184.


  • Imhoff, M., Bauer, M., Gather, U., & Löhlein, D. (1998). Time series analysis in intensive care medicine. Tech. rep. SFB 475: Komplexitätsreduktion in Multivariaten Datenstrukturen, Universität Dortmund.

  • Kaufman, L., & Rousseeuw, P. (1987). Clustering by means of medoids. North-Holland.

  • Kaufman, L., & Rousseeuw, P.J. (1990). Finding groups in data: an introduction to cluster analysis. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics.

  • Kumar, A., & Daumé III, H. (2011). A co-training approach for multi-view spectral clustering. In Proceedings of the 28th international conference on machine learning, ICML (pp. 393–400).

  • Lewis, S., Foltynie, T., Blackwell, A., Robbins, T., Owen, A., & Barker, R. (2005). Heterogeneity of Parkinson’s disease in the early clinical stages using a data driven approach. Journal of Neurology, Neurosurgery & Psychiatry, 76(3), 343–348.


  • Li, J., Cheng, K., Wang, S., Morstatter, F., Trevino, R.P., Tang, J., & Liu, H. (2016). Feature selection: a data perspective. arXiv:1601.07996.

  • Lin, J., Keogh, E., Wei, L., & Lonardi, S. (2007). Experiencing SAX: a novel symbolic representation of time series. Data Mining and Knowledge Discovery, 15(2), 107–144.


  • Liu, Y., Li, Z., Xiong, H., Gao, X., & Wu, J. (2010). Understanding of internal clustering validation measures. In Proceedings of IEEE 10th international conference on data mining (ICDM) (pp. 911–916).

  • Liu, Y., Li, W., Tan, C., Liu, X., Wang, X., Gui, Y., Qin, L., Deng, F., Hu, C., & Chen, L. (2014). Meta-analysis comparing deep brain stimulation of the globus pallidus and subthalamic nucleus to treat advanced Parkinson disease: a review. Journal of Neurosurgery, 121(3), 709–718.


  • Ma, L.Y., Chan, P., Gu, Z.Q., Li, F.F., & Feng, T. (2015). Heterogeneity among patients with Parkinson’s disease: cluster analysis and genetic association. Journal of the Neurological Sciences, 351(1), 41–45.


  • Marek, K., Jennings, D., Lasch, S., Siderowf, A., Tanner, C., Simuni, T., Coffey, C., Kieburtz, K., Flagg, E., Chowdhury, S., & et al. (2011). The Parkinson’s Progression Markers Initiative (PPMI). Progress in Neurobiology, 95(4), 629–635.


  • Michalski, R.S. (1983). A theory and methodology of inductive learning. Artificial Intelligence, 20(2), 111–161.


  • Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv:1301.3781.

  • Minarro-Giménez, J.A., Marín-Alonso, O., & Samwald, M. (2013). Exploring the application of deep learning techniques on medical text corpora. Studies in Health Technology and Informatics, 205, 584–588.


  • Murugesan, S., Bouchard, K., Chang, E., Dougherty, M., Hamann, B., & Weber, G.H. (2017). Multi-scale visual analysis of time-varying electrocorticography data via clustering of brain regions. BMC Bioinformatics, 18(6), 236.


  • National Collaborating Centre for Chronic Conditions. (2006). Parkinson’s disease: national clinical guideline for diagnosis and management in primary and secondary care. London: Royal College of Physicians.


  • Patel, S., Lorincz, K., Hughes, R., Huggins, N., Growdon, J., Standaert, D., Akay, M., Dy, J., Welsh, M., & Bonato, P. (2009). Monitoring motor fluctuations in patients with Parkinson’s disease using wearable sensors. IEEE Transactions on Information Technology in Biomedicine, 13(6), 864–873.


  • PD_manager: m-Health platform for Parkinson’s disease management. (2015). EU Framework Programme for Research and Innovation Horizon 2020, Grant number 643706, 2015–2017. http://www.parkinson-manager.eu/.

  • Ramani, R.G., & Sivagami, G. (2011). Parkinson disease classification using data mining algorithms. International Journal of Computer Applications, 32(9), 17–22.


  • Reijnders, J., Ehrt, U., Lousberg, R., Aarsland, D., & Leentjens, A. (2009). The association between motor subtypes and psychopathology in Parkinson’s disease. Parkinsonism & Related Disorders, 15(5), 379–382.


  • Riviere, C.N., Reich, S.G., & Thakor, N.V. (1997). Adaptive Fourier modeling for quantification of tremor. Journal of Neuroscience Methods, 74(1), 77–87.


  • Rousseeuw, P.J. (1987). Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65.


  • Samà, A., Pérez-López, C., Rodríguez-Martín, D., Moreno-Aróstegui, J.M., Rovira, J., Ahlrichs, C., Castro, R., Graça, R., Guimarães, V., & et al. (2015). A double closed loop to enhance the quality of life of Parkinson’s disease patients: REMPARK system. Innovation in Medicine and Healthcare, 2014(207), 115.


  • Schieb, L.J., Mobley, L.R., George, M., & Casper, M. (2013). Tracking stroke hospitalization clusters over time and associations with county-level socioeconomic and healthcare characteristics. Stroke, 44(1), 146–152.


  • SENSE-PARK. (2016). Project’s website: http://www.sense-park.eu/, accessed: 2016/07/01.

  • Stecher, J., Janssen, F., & Fürnkranz, J. (2014). Separating rule refinement and rule selection heuristics in inductive rule learning. In Proceedings of machine learning and knowledge discovery in databases - European conference, ECML PKDD 2014 (pp. 114–129).

  • Szymański, A., Kubis, A., & Przybyszewski, A.W. (2015). Data mining and neural network simulations can help to improve deep brain stimulation effects in Parkinson’s disease. Computer Science, 16(2), 199.


  • Timmer, J., Gantert, C., Deuschl, G., & Honerkamp, J. (1993). Characteristics of hand tremor time series. Biological Cybernetics, 70(1), 75–80.


  • Tzallas, A.T., Tsipouras, M.G., Rigas, G., Tsalikakis, D.G., Karvounis, E.C., Chondrogiorgi, M., Psomadellis, F., Cancela, J., Pastorino, M., Waldmeyer, M.T.A., & et al. (2014). PERFORM: a system for monitoring, assessment and management of patients with Parkinson’s disease. Sensors, 14(11), 21329–21357.


  • Tzortzis, G., & Likas, A. (2009). Convex mixture models for multi-view clustering. In Proceedings of the 19th international conference on artificial neural networks, ICANN 2009 (pp. 205–214).

  • Valmarska, A., Robnik-Šikonja, M., & Lavrač, N. (2015). Inverted heuristics in subgroup discovery. In Proceedings of the 18th international multiconference information society.

  • Valmarska, A., Miljkovic, D., Lavrač, N., & Robnik-Šikonja, M. (2016). Towards multi-view approach to Parkinson’s disease quality of life data analysis. In Proceedings of the 5th international workshop on new frontiers in mining complex patterns at ECML-PKDD2016.

  • Valmarska, A., Lavrač, N., Fürnkranz, J., & Robnik-Šikonja, M. (2017). Refinement and selection heuristics in subgroup discovery and classification rule learning. Expert Systems with Applications, 81, 147–162.


  • Visser, M., Marinus, J., Stiggelbout, A.M., & Van Hilten, J.J. (2004). Assessment of autonomic dysfunction in Parkinson’s disease: the SCOPA-AUT. Movement Disorders, 19(11), 1306–1312.


  • Washburn, R.A., Smith, K.W., Jette, A.M., & Janney, C.A. (1993). The physical activity scale for the elderly (PASE): development and evaluation. Journal of Clinical Epidemiology, 46(2), 153–162.


  • Xu, C., Tao, D., & Xu, C. (2013). A survey on multi-view learning. Neural Computing and Applications, 23(7–8), 2031–2038.


  • Zhao, J., Papapetrou, P., Asker, L., & Boström, H. (2017). Learning from heterogeneous temporal data in electronic health records. Journal of Biomedical Informatics, 65, 105–119.


  • Zhao, Z., & Liu, H. (2007). Spectral feature selection for supervised and unsupervised learning. In Proceedings of the 24th international conference on machine learning (pp. 1151–1157). ACM.


Acknowledgements

This work was supported by the PD_manager and HBP SGA1 projects, funded within the EU Framework Programme for Research and Innovation Horizon 2020 under grants 643706 and 720270, respectively. We also acknowledge the support of the Slovenian Research Agency (research core funding P2-0103 and P2-0209).

Data used in the preparation of this article were obtained from the Parkinson’s Progression Markers Initiative (PPMI) (http://www.ppmi-info.org/data). For up-to-date information on the study, visit http://www.ppmi-info.org. PPMI, a public-private partnership, is funded by the Michael J. Fox Foundation for Parkinson’s Research and funding partners. Corporate Funding Partners: AbbVie, Avid Radiopharmaceuticals, Biogen, BioLegend, Bristol-Myers Squibb, GE Healthcare, GlaxoSmithKline (GSK), Eli Lilly and Company, Lundbeck, Merck, Meso Scale Discovery (MSD), Pfizer Inc, Piramal Imaging, Roche, Sanofi Genzyme, Servier, Takeda, Teva, UCB. Philanthropic Funding Partners: Golub Capital. A list of funding partners can also be found at http://www.ppmi-info.org/fundingpartners.

Author information

Corresponding author

Correspondence to Anita Valmarska.

Appendices

Appendix A: Comparison of clustering algorithms on merged data set

We considered three clustering approaches for the merged data set: k-means, k-medoids, and DBSCAN. We clustered the merged data into different numbers of clusters and evaluated the quality of the produced clusters with three internal cluster validity metrics: silhouette analysis (SA) (Rousseeuw 1987), the Davies–Bouldin index (DB) (Davies and Bouldin 1979), and the Caliński–Harabasz index (CH) (Caliński and Harabasz 1974). Table 5 presents the results of cluster validation for the selected clustering methods and the chosen number of clusters. The results show that the best performing approach is k-means.

Table 5 Cluster validation measures for k-means, k-medoids, and DBSCAN, where k represents the number of clusters. Clustering was performed on the merged data set. Better cluster quality is indicated by higher values of SA and CH, and lower values of DB
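To make the SA measure concrete, the following is a minimal sketch of the mean silhouette width in plain Python. It is an illustration only and not the authors' code: the function name `mean_silhouette`, the use of Euclidean distance, and the toy points are assumptions made for this example.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_silhouette(points, labels):
    """Mean silhouette width over all points (Rousseeuw 1987).

    For each point: a = mean distance to the other members of its own
    cluster, b = smallest mean distance to the members of any other
    cluster, s = (b - a) / max(a, b). Points in singleton clusters get 0.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            widths.append(0.0)
            continue
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Toy data (hypothetical, not PPMI data): two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = mean_silhouette(points, [0, 0, 1, 1])  # labels follow the groups
bad = mean_silhouette(points, [0, 1, 0, 1])   # labels cut across the groups
print(round(good, 3), round(bad, 3))
```

A labeling that follows the two groups scores close to 1, while a labeling that cuts across them scores below 0, which is the sense in which higher SA values indicate better clusters.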

Appendix B: Features selected by unsupervised feature selection

We used unsupervised feature subset selection to select the most relevant attributes for clustering algorithms. We used the SPEC algorithm (Zhao and Liu 2007) implemented in Python (Li et al. 2016). Figure 6 presents the evaluation of attribute relevance. Based on the results, we selected the attributes to the left of the red line in Fig. 6. This resulted in a list of 10 attributes, presented in detail in Table 6.

Fig. 6 Attribute rank vs attribute importance as determined by the SPEC algorithm (the most influential attribute has rank 1)

Table 6 The most important attributes ordered according to SPEC (see Fig. 6)

In Table 7 we present the cluster validation values on the data set containing only the best attributes (listed in Table 6). The results reveal that the merged data set (consisting of sums of attributes) produces better quality clusters than the data set reduced with feature subset selection.

Table 7 Cluster validation measures for k-means, k-medoids, and DBSCAN, where k represents the number of clusters. Clustering was performed on the data set containing only attributes from Table 6

Results from Tables 5 and 7 show that better clusters are produced when sums of attribute values from the considered views are used as attributes in the merged data set. Parkinson’s disease patients experience a whole range of symptoms, both motor and non-motor, which makes it harder for traditional clustering algorithms to separate them into groups of similar patients. The introduction of sums makes it possible to have a view of the overall status of the patients concerning particular sets of symptoms (i.e. motor symptoms, non-motor symptoms, autonomic symptoms, etc.).
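The idea of replacing each view by the sum of its attribute values can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the function `merge_views_by_sum`, the nested-dictionary layout, the patient identifiers, and the attribute names and values are all hypothetical.

```python
def merge_views_by_sum(views):
    """Collapse each view to a single attribute: the sum of its values.

    `views` maps a view name to {patient_id: {attribute: value}}.
    Returns {patient_id: {view_name: sum_of_values}} for the patients
    that appear in every view.
    """
    shared = set.intersection(*(set(records) for records in views.values()))
    return {
        pid: {name: sum(records[pid].values()) for name, records in views.items()}
        for pid in sorted(shared)
    }


# Hypothetical toy records; attribute names and scores are made up.
views = {
    "MDS-UPDRS Part II": {
        "p1": {"NP2_a": 1, "NP2_b": 0},
        "p2": {"NP2_a": 3, "NP2_b": 2},
    },
    "SCOPA-AUT": {
        "p1": {"SCAU1": 0, "SCAU2": 1},
        "p2": {"SCAU1": 2, "SCAU2": 2},
    },
}
merged = merge_views_by_sum(views)
print(merged)
```

Each patient is then described by one number per view, so a clustering algorithm sees the overall severity within each symptom set rather than every individual item.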

Appendix C: Evaluation of multi-view clusterings

In order to determine how the choice of data sets influences the results of multi-view clustering, we executed multi-view clustering on all 21 pairs of views, i.e. \(\frac {7 \cdot 6}{2}\) pairs. Clusters resulting from each pair were evaluated using SA (Rousseeuw 1987) and the results are presented in Table 8. SA is a normalized value (ranging from −1 to 1) and is used to compare cluster quality on these data sets. Since clustering was performed on different data sets (each pair is effectively a different data set) and values of DB and CH are not comparable across data sets, we do not present these values. The value of each cell in Table 8 corresponds to the quality of clusters obtained by multi-view clustering on the data sets from the corresponding row and column. For example, SA (Rousseeuw 1987) on clusters obtained by multi-view clustering on the MDS-UPDRS Part I (NUPDRS1) and MoCA is 0.021. The best result is marked in bold.

The results show that all pairs produce clusters with low quality, but the three best performing pairs according to SA are: (SCOPA-AUT, MDS-UPDRS Part II), (MDS-UPDRS Part III, MDS-UPDRS Part II), and (PASE, MDS-UPDRS Part II).

Table 8 Value of SA on clusters discovered with multi-view clustering on pairs of data sets. Higher values of SA indicate clusters with better quality
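As a small illustration of the pairwise evaluation, the sketch below enumerates all unordered pairs of views and ranks pairs by SA. The list of seven view names is inferred from the data sets mentioned in this appendix, the helper `rank_pairs` is hypothetical, and only the two SA values quoted in the text are used; the remaining cells of Table 8 are not reproduced here.

```python
from itertools import combinations

# View names inferred from the data sets mentioned in this appendix.
VIEWS = [
    "MDS-UPDRS Part I",
    "MDS-UPDRS Part Ip",
    "MDS-UPDRS Part II",
    "MDS-UPDRS Part III",
    "SCOPA-AUT",
    "MoCA",
    "PASE",
]

# All unordered pairs of distinct views: 7 * 6 / 2 of them.
pairs = list(combinations(VIEWS, 2))
print(len(pairs))


def rank_pairs(sa_by_pair):
    """Return the pairs ordered from highest to lowest SA value."""
    return sorted(sa_by_pair, key=sa_by_pair.get, reverse=True)


# Only the two SA values quoted in the text.
reported = {
    ("MDS-UPDRS Part I", "MoCA"): 0.021,
    ("SCOPA-AUT", "MDS-UPDRS Part II"): 0.173,
}
ranking = rank_pairs(reported)
print(ranking[0])
```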

We used the Adjusted Rand Index (ARI) (Hubert and Arabie 1985) to compare cluster structures discovered by different cluster configurations. The expected value of ARI is 0 for two random clusterings, and its value is 1 for two identical clusterings. Table 9 presents the ARI scores computed on pairs of the winning two-view clustering settings. The results reveal that all pairs of clusterings are quite similar, and that the (NUPDRS3, NUPDRS2P) and (PASE, NUPDRS2P) pairs produce almost identical clusters (ARI = 0.966). As the quality of individual pairs is rather low (see Table 8), there is little chance that further combinations of views would improve the quality.

Table 9 ARI scores for the best performing pairs of two-view multi-view clusterings
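For readers unfamiliar with ARI, here is a minimal sketch that computes it from the contingency counts of two labelings, following the standard Hubert and Arabie formulation. The function name `adjusted_rand_index` and the toy label vectors are assumptions for illustration and do not reproduce the values in Table 9.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items")

    # Pairs of items that are together in both, in A, and in B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Degenerate case: both labelings are trivial and identical in structure.
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Cluster names do not matter, only the grouping does.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same, crossed)
```

Because the index depends only on which items are grouped together, two clusterings that differ only in cluster numbering are treated as identical.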

Nevertheless, we constructed two additional settings for multi-view clustering by systematically adding views (data sets) to the winning bi-view clustering setting (SCOPA-AUT, MDS-UPDRS Part II). We in turn added the remaining data sets from the second (MDS-UPDRS Part III and MDS-UPDRS Part II) and third (PASE and MDS-UPDRS Part II) best performing bi-view clustering settings, thus obtaining two new multi-view settings: (SCOPA-AUT, MDS-UPDRS Part II, MDS-UPDRS Part III) and (SCOPA-AUT, MDS-UPDRS Part II, MDS-UPDRS Part III, PASE). We evaluated the quality of clusters produced by these three settings and present the results in Table 10, where we also include the cluster quality measures when all views are considered and the scores of the best single-view clustering on the merged data set. Please note that since clustering was performed on different data sets, values of DB and CH are not comparable. SA is a normalized value (ranging from −1 to 1) and is used to compare cluster quality on these data sets.

Based on the SA values from Table 10, the best clustering is produced on the merged data set, which consists only of sums of attribute values from the 7 data sets from Section 3.3. In the multi-view setting, the best results were obtained when three data sets were considered (SCOPA-AUT, MDS-UPDRS Part II, MDS-UPDRS Part III). The SCOPA-AUT data set contains attributes describing the autonomic symptoms of patients. The MDS-UPDRS Part II data set expresses ‘motor experiences of daily living’, including speech problems and the need for assistance with daily routines such as eating or dressing, while the MDS-UPDRS Part III data set describes the motor symptoms, which are the most characteristic symptoms of Parkinson’s disease. Even though the clusters produced by the multi-view setting are of lower quality than those produced on the merged data set, the results from Table 10 reveal that it might be beneficial to combine multiple data sets: the inclusion of the MDS-UPDRS Part III data set in the best performing bi-view clustering setting (SCOPA-AUT, MDS-UPDRS Part II) (SA = 0.173) produces clusters with an improved quality (SA = 0.205). These results also show that the inclusion of other, seemingly uncorrelated data sets (PASE, MoCA, MDS-UPDRS Part I, MDS-UPDRS Part Ip) can lead to a significant decrease in the quality of clusters.

In addition to the work presented above, we also used unsupervised feature subset selection to select the most relevant attributes from each of the seven views (data sets). We evaluated the quality of clusters on the newly generated data sets following the procedure presented in this section. The results showed that the quality of the clusters in these new settings was significantly lower than the quality of the clusters presented here. For that reason, we did not include this part of the research in the paper.

Appendix D: Rules describing multi-view clusters

We present rules describing clusters obtained by multi-view clustering using three views (SCOPA-AUT, MDS-UPDRS Part II, and MDS-UPDRS Part III), i.e. the best multi-view clustering according to SA from Table 10. Attribute prefixes indicate the data set of origin. Attributes with the prefix SCAU are symptoms from the SCOPA-AUT data set, and the numeric suffix in their names designates the nature of the autonomic symptom. Attributes SCAU1-SCAU7 describe gastrointestinal symptoms, urinary problems are recorded by attributes SCAU8-SCAU13, and attributes SCAU14-SCAU16 hold information about patients’ cardiovascular problems. Attributes SCAU17-SCAU18 and SCAU20-SCAU21 describe thermoregulatory problems, while attribute SCAU19 describes any pupillomotor issues that a patient might be experiencing. Attributes with the prefix NP2 are from the MDS-UPDRS Part II data set, while the prefix NP3 designates attributes from the MDS-UPDRS Part III data set (which also includes the attributes NHY and DYSKPRES).
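The naming scheme described above can be captured in a small lookup helper, which may be convenient when reading the rules in Tables 11 to 13. This is only a sketch of that scheme: the function `describe_attribute` is hypothetical, the example attribute names after the prefixes are made up, and SCAU numbers outside the ranges listed above are deliberately left without a symptom group.

```python
import re

# Symptom groups for SCOPA-AUT attributes, as listed in the text above.
SCAU_GROUPS = [
    (range(1, 8), "gastrointestinal"),
    (range(8, 14), "urinary"),
    (range(14, 17), "cardiovascular"),
    ((17, 18, 20, 21), "thermoregulatory"),
    ((19,), "pupillomotor"),
]


def describe_attribute(name):
    """Return (data set, symptom group or None) for an attribute name."""
    if name in ("NHY", "DYSKPRES"):
        return ("MDS-UPDRS Part III", None)
    match = re.fullmatch(r"SCAU(\d+)", name)
    if match:
        number = int(match.group(1))
        for numbers, group in SCAU_GROUPS:
            if number in numbers:
                return ("SCOPA-AUT", group)
        return ("SCOPA-AUT", None)  # group not described in the text
    if name.startswith("NP2"):
        return ("MDS-UPDRS Part II", None)
    if name.startswith("NP3"):
        return ("MDS-UPDRS Part III", None)
    raise ValueError(f"unknown attribute prefix: {name}")


for attribute in ("SCAU5", "SCAU19", "SCAU21", "NP2_example", "NHY"):
    print(attribute, describe_attribute(attribute))
```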

Table 10 Comparison of cluster quality using silhouette analysis (SA) for different settings of multi-view clustering

Tables 11, 12, and 13 present rules describing cluster 0, cluster 1, and cluster 2, respectively, obtained by multi-view clustering. Rules are induced on the data set that is a concatenation of the three views: SCOPA-AUT, MDS-UPDRS Part II, and MDS-UPDRS Part III. Contrary to the rules obtained by the single-view clustering on the merged data set, where groups of patients were described by the severity of their overall status, the multi-view clusters are described by symptoms. These rules mostly describe the motor status of Parkinson’s disease patients (attributes from MDS-UPDRS Part III), and are supported by their motor ability in daily living (attributes from MDS-UPDRS Part II) and their autonomic symptoms (SCOPA-AUT).

Table 11 Description rules for cluster 0 of the multi-view clustering approach generating clusters with best quality. Views were represented by the SCOPA-AUT, MDS-UPDRS Part II, and MDS-UPDRS Part III data sets
Table 12 Description rules for cluster 1 of the multi-view clustering approach generating clusters with best quality. Views were represented by the SCOPA-AUT, MDS-UPDRS Part II, and MDS-UPDRS Part III data sets
Table 13 Description rules for cluster 2 of the multi-view clustering approach generating clusters with best quality. Views were represented by the SCOPA-AUT, MDS-UPDRS Part II, and MDS-UPDRS Part III data sets


Cite this article

Valmarska, A., Miljkovic, D., Lavrač, N. et al. Analysis of medications change in Parkinson’s disease progression data. J Intell Inf Syst 51, 301–337 (2018). https://doi.org/10.1007/s10844-018-0502-y


  • DOI: https://doi.org/10.1007/s10844-018-0502-y
