Recursive Partitioning in Continuous Time Analysis
Building models fully informed by theory is challenging when data sets are large and strong assumptions about all variables of interest and their interrelations cannot be made. Machine learning-inspired approaches have been gaining momentum in modeling such “big” data because they offer a systematic way to search for potential interrelationships among variables. In practice, researchers often start with a small model strongly guided by theory. In a second step, however, they quickly face the challenge of deciding which additional variables to include in the model and which to omit. This situation calls for both a confirmatory statistical modeling approach and an exploratory statistical learning approach to data analysis within a single framework. Structural equation model (SEM) trees, a combination of SEM and decision trees (also known as classification and regression trees), offer a principled solution to this selection problem. SEM trees hierarchically split empirical data into homogeneous groups sharing similar data patterns by recursively selecting optimal predictors of these differences from a potentially large set of candidate variables. SEM forests are an extension of SEM trees, consisting of ensembles of SEM trees, each built on a random sample of the original data. By aggregating over the trees in a forest, we obtain measures of variable importance that are more robust than measures from single trees. In the present chapter, we combine SEM trees and SEM-based continuous time modeling. The resulting approach, continuous time SEM trees, is illustrated by exploring dynamics in perceptual speed using data from the COGITO study.
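The recursive splitting idea described above can be sketched in a few lines. The following Python example is a minimal illustration under strong simplifying assumptions, not the implementation used in this chapter or in the semtree package: a univariate normal model (mean and variance) stands in for the SEM, candidate splits are compared with a likelihood ratio that is not corrected for multiple testing, and all names (`gaussian_loglik`, `best_split`, `grow_tree`, the toy covariates `group` and `noise`) are hypothetical.

```python
import math


def gaussian_loglik(values):
    """Maximized log-likelihood of a univariate normal model.

    Assumption: this two-parameter model is only a stand-in for an SEM.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    var = max(var, 1e-12)  # guard against zero variance
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def best_split(rows, outcome, candidates, min_n):
    """Return (likelihood ratio, covariate, threshold) of the best split, or None."""
    parent_ll = gaussian_loglik([r[outcome] for r in rows])
    best = None
    for cov in candidates:
        # Every observed value except the largest is a possible cut point.
        for threshold in sorted({r[cov] for r in rows})[:-1]:
            left = [r[outcome] for r in rows if r[cov] <= threshold]
            right = [r[outcome] for r in rows if r[cov] > threshold]
            if len(left) < min_n or len(right) < min_n:
                continue
            lr = 2 * (gaussian_loglik(left) + gaussian_loglik(right) - parent_ll)
            if best is None or lr > best[0]:
                best = (lr, cov, threshold)
    return best


def grow_tree(rows, outcome, candidates, min_n=5, critical=5.99,
              depth=0, max_depth=3):
    """Recursively split rows while the best split exceeds the critical value.

    critical=5.99 is roughly the 0.95 quantile of a chi-square distribution
    with 2 degrees of freedom (two extra parameters per split in this toy model).
    """
    node = {"n": len(rows)}
    split = best_split(rows, outcome, candidates, min_n) if depth < max_depth else None
    if split is None or split[0] < critical:
        return node  # leaf: no sufficiently large group difference found
    lr, cov, threshold = split
    node.update(split_on=cov, threshold=threshold, lr=lr)
    node["left"] = grow_tree([r for r in rows if r[cov] <= threshold],
                             outcome, candidates, min_n, critical,
                             depth + 1, max_depth)
    node["right"] = grow_tree([r for r in rows if r[cov] > threshold],
                              outcome, candidates, min_n, critical,
                              depth + 1, max_depth)
    return node


# Toy data: the outcome depends on "group" but not on "noise".
rows = [{"group": i % 2, "noise": (i // 2) % 2, "y": 10.0 * (i % 2) + (i % 5)}
        for i in range(40)]
tree = grow_tree(rows, "y", ["group", "noise"])
print(tree["split_on"], tree["left"]["n"], tree["right"]["n"])
```

In this toy data set the tree splits once on `group` and then stops, because neither resulting subgroup differs further with respect to `noise`. Swapping `gaussian_loglik` for the log-likelihood of a fitted SEM would move the sketch closer to the approach described in the abstract.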
We thank John J. Prindle, John J. McArdle, Timo von Oertzen, and Ulman Lindenberger, who have each played essential roles in the development of SEM trees and forests. We would also like to thank Florian Schmiedek, Annette Brose, and Ulman Lindenberger for sharing the COGITO data with us. We are grateful to Julia Delius for her helpful assistance in language and style editing.