Clustering Functional Data with Application to Electronic Medication Adherence Monitoring in HIV Prevention Trials

  • Yifan Zhu
  • Chongzhi Di
  • Ying Qing Chen


Maintaining high medication adherence is essential for achieving the desired efficacy in clinical trials, especially prevention trials. However, adherence is traditionally measured by self-reports, which are subject to reporting bias and measurement error. Recently, electronic medication dispenser devices have been adopted in several HIV pre-exposure prophylaxis prevention studies. These devices can collect objective, frequent, and timely drug adherence data. The device-opening signals they generate are often represented as regularly or irregularly spaced discrete functional data, which are challenging to analyze statistically. In this paper, we focus on clustering the adherence monitoring data from such devices. We first pre-process the raw discrete functional data into smoothed functional data. Parametric mixture models with change-points, as well as several non-parametric and semi-parametric functional clustering approaches, are then adapted and applied to the smoothed adherence data. Simulation studies were conducted to evaluate finite-sample performance, with respect to both the choice of tuning parameters in the pre-processing step and the relative performance of the different clustering algorithms. We applied these methods to the HIV Prevention Trials Network 069 study to identify subgroups with distinct adherence behavior over the study period.
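
The two-stage idea described above (smooth the raw opening signals, then cluster the smoothed curves) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the moving-average smoother, the `half_window` tuning parameter, the toy 28-day opening records, and plain k-means as a stand-in for the mixture and functional clustering methods the authors actually study.

```python
from typing import List, Sequence, Tuple


def smooth_openings(events: Sequence[int], half_window: int = 3) -> List[float]:
    """Proportion of days with a device opening in a centred moving window.

    `half_window` is a hypothetical tuning parameter for this sketch.
    """
    n = len(events)
    smoothed = []
    for t in range(n):
        lo, hi = max(0, t - half_window), min(n, t + half_window + 1)
        smoothed.append(sum(events[lo:hi]) / (hi - lo))
    return smoothed


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two curves on a common daily grid."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(curves: List[List[float]], k: int, n_iter: int = 50) -> Tuple[List[int], List[List[float]]]:
    """Plain Lloyd-style k-means with a deterministic start.

    Initial centres are curves picked at evenly spaced ranks of overall
    adherence (an illustrative choice, not a recommended default).
    """
    n = len(curves)
    order = sorted(range(n), key=lambda i: sum(curves[i]))
    centers = [list(curves[order[(2 * j + 1) * n // (2 * k)]]) for j in range(k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(c, centers[j])) for c in curves]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy daily opening indicators (1 = device opened), invented for illustration.
days = 28
raw = [
    [1] * days,                                     # consistently adherent
    [0 if t == 10 else 1 for t in range(days)],     # adherent, one missed day
    [1] * 14 + [0] * 14,                            # stops mid-study (change-point)
    [1] * 12 + [0] * 16,                            # stops slightly earlier
    [1 if t % 4 == 0 else 0 for t in range(days)],  # sporadic use
    [1 if t % 5 == 0 else 0 for t in range(days)],  # sporadic use
]
curves = [smooth_openings(r) for r in raw]
labels, centers = kmeans(curves, k=3)
print(labels)
```

In this toy setting the smoothed curves take values in [0, 1] and can be read as local adherence rates, so a participant who stops using the device shows up as a curve that drops from near 1 to 0. A wider window gives smoother curves but blurs such change-points, which is one reason the choice of pre-processing tuning parameters matters.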


Keywords: Drug adherence, HIV prevention, Clustering, Functional data, Latent class model



This work was partially supported by two grants from the National Institutes of Health (NIH): R01AI121259 and R01HL130483.

Supplementary material

Supplementary material 1: 12561_2019_9232_MOESM1_ESM.docx (457 KB)


Copyright information

© International Chinese Statistical Association 2019

Authors and Affiliations

  1. Seattle, USA
