Can Machine Learning Improve Screening for Targeted Delinquency Prevention Programs?

  • William E. Pelham III
  • Hanno Petras
  • Dustin A. Pardini


The cost-effectiveness of targeted delinquency prevention programs for children depends on the accuracy of the screening process. Screening accuracy is often poor, resulting in wasted resources and missed opportunities to avert negative outcomes. This study examined whether screening approaches based on logistic regression or machine learning algorithms could improve accuracy relative to traditional sum-score approaches when identifying boys in the 5th grade (N = 1012) who would be repeatedly arrested for violent and serious crimes from ages 13 to 30. Screening algorithms were developed that incorporated facets of teacher-reported externalizing problems and other known risk factors (e.g., peer rejection). The predictive performance of these algorithms was evaluated and compared in holdout (i.e., test) data using the area under the receiver operating characteristic curve (AUROC) and the Brier score. Both the logistic and machine learning methods yielded AUROCs superior to traditional sum-score screening approaches when a broad set of risk factors for future delinquency was considered. However, this improvement was modest and was not present when using item-level information from a composite scale assessing externalizing problems. Contrary to expectations, machine learning algorithms performed no better than simple logistic models. A large apparent advantage of machine learning disappeared after appropriate cross-validation, underscoring the importance of careful evaluation of these methods. Results suggest that screening using logistic regression could improve the cost-effectiveness of targeted delinquency prevention programs in some cases, but screening using machine learning would confer no marginal benefit under currently realistic conditions.
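The evaluation workflow described above can be sketched in a few lines. The following is a minimal illustration (not the authors' code, which was written in R on real longitudinal data): fit a simple logistic model and a machine learning model on training data, then compare their AUROC and Brier score on a holdout set. The simulated data, predictor count, and choice of random forest as the machine learning comparator are all assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

# Simulate screening data: 10 hypothetical risk factors, binary outcome
rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 10))                    # e.g., externalizing facets, peer rejection
logit = X[:, :3].sum(axis=1) - 2.0              # true signal: linear in a few predictors
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # relatively rare adverse outcome

# Holdout (test) split: performance must be assessed on data not used for fitting
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    p = model.predict_proba(X_te)[:, 1]          # predicted risk on holdout cases
    print(f"{name}: AUROC={roc_auc_score(y_te, p):.3f}, "
          f"Brier={brier_score_loss(y_te, p):.3f}")
```

Note that comparing training-set performance instead of holdout performance would reproduce the overfitting trap the abstract warns about: flexible learners look strongly superior in-sample, and the apparent advantage vanishes under proper cross-validation.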


Keywords: Violence · Delinquency · Prevention · Machine learning



This research was funded by National Institute of Child Health and Human Development grant HD092094. Additional support was provided by grants from the National Institute on Drug Abuse (DA039772, DA009757, DA041713) and National Institute on Alcohol Abuse and Alcoholism (AA026768).

Compliance with Ethical Standards

Conflict of Interest

The authors declare that they have no conflicts of interest.

Ethical Approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed Consent

Informed consent/assent was obtained from all participants in this study.

Supplementary material

ESM 1 (DOCX 87.3 kb)



Copyright information

© Society for Prevention Research 2019

Authors and Affiliations

  1. Department of Psychology, Arizona State University, Tempe, USA
  2. American Institutes for Research, Washington, USA
  3. School of Criminology & Criminal Justice, Arizona State University, Phoenix, USA
