Abstract
Because the criminal justice outcomes to be forecast are usually categorical (e.g., fail or not), this chapter considers crime forecasting as a classification problem. The goal is to assign classes to cases. There may be two classes or more than two. Machine learning is broadly considered before turning in later chapters to random forests as a preferred forecasting tool. There is no use of models, and explanation is at best a secondary interest. Machine learning is based on algorithms, which should not be confused with models. The material is introduced in a conceptual manner with almost no mathematics. Nevertheless, some readers may find the material challenging because a certain amount of statistical maturity must be assumed. Later chapters will use somewhat more formal expositional methods.
Notes
- 1.
An indicator variable here implies that the relationship with the outcome variable is a spike function. The regression function, therefore, is a linear combination of spike functions. If an intercept is, as usual, included in the regression, one of the indicators would need to be dropped. Otherwise, the regressor cross-product matrix would be singular, and there would be no unique solution from which to obtain estimates of the regression coefficients.
- 2.
Stagewise should not be confused with stepwise, as in stepwise regression. In stepwise regression, all of the regression coefficients from the previous step are re-estimated as the next predictor is dropped (i.e., backward selection) or added (i.e., forward selection). Were there a procedure called stagewise regression, the earlier regression coefficients would not be re-estimated.
- 3.
Technically, a Bayes classifier chooses the class that has the largest probability. We are choosing the class with the largest proportion. How one gets from proportions to probabilities depends on how the data were generated. We are not there yet.
- 4.
A classification tree is a special case of classification and regression trees (CART). A regression tree uses recursive partitioning with a quantitative outcome variable. A classification tree uses recursive partitioning with a categorical outcome variable. The tree representation is upside down because the “roots” are at the top and the “leaves” are at the bottom.
- 5.
The matter of linearity can be subtle. A single break can be represented by a step function, which is nonlinear. For the collection of all the partitions constructed from a single variable, two or more breaks imply two or more step functions, which are also nonlinear. One should distinguish between the functions responsible for each partition and the lines separating the partitions.
- 6.
In this instance, “bias” refers to a systematic tendency to underestimate or overestimate the terminal node proportions in the population responsible for the data.
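The singularity described in note 1 can be verified directly. The sketch below, using hypothetical data, builds a design matrix from an intercept plus a full set of indicator variables and checks its rank: because the indicator columns sum to the intercept column, the matrix is rank deficient, and dropping one indicator restores full column rank.

```python
import numpy as np

# Hypothetical data: a three-category predictor coded as indicator variables.
categories = np.array([0, 1, 2, 0, 1, 2, 0, 1])
indicators = np.eye(3)[categories]           # one column per category
intercept = np.ones((len(categories), 1))

# Intercept plus ALL indicators: the indicator columns sum to the intercept,
# so the cross-product matrix X'X is singular and has no unique inverse.
X_full = np.hstack([intercept, indicators])
print(np.linalg.matrix_rank(X_full))         # 3, not 4 -> rank deficient

# Dropping one indicator removes the linear dependence.
X_ok = np.hstack([intercept, indicators[:, 1:]])
print(np.linalg.matrix_rank(X_ok))           # 3 = number of columns
```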
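The classification rule in note 3 — assign the class with the largest proportion, as a stand-in for the Bayes classifier's largest probability — amounts to a majority vote within a terminal node. A minimal sketch with hypothetical node outcomes:

```python
from collections import Counter

# Hypothetical terminal node: observed outcomes for cases landing in the node.
node_outcomes = ["fail", "not_fail", "fail", "fail", "not_fail"]

counts = Counter(node_outcomes)
n = len(node_outcomes)
proportions = {label: c / n for label, c in counts.items()}

# Assign the class with the largest within-node proportion
# (a Bayes classifier would use probabilities instead of proportions).
assigned = max(proportions, key=proportions.get)
print(assigned)        # "fail", with proportion 0.6
```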
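Notes 4 and 5 describe recursive partitioning and the step functions it produces. One partitioning step on a single variable can be sketched as follows: try every candidate break point, and keep the one whose two majority-vote partitions misclassify the fewest cases. The data and function names here are hypothetical; the fitted rule is a step function of the predictor, which is where the nonlinearity in note 5 comes from.

```python
def majority(labels):
    """Majority class in a list of labels."""
    return max(set(labels), key=labels.count)

def best_break(x, y):
    """One recursive-partitioning step: the break point on x that
    minimizes misclassifications under majority-vote assignment."""
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best = None
    for i in range(1, len(xs)):
        cut = (xs[i - 1] + xs[i]) / 2          # midpoint between adjacent x's
        left, right = ys[:i], ys[i:]
        errors = sum(v != majority(left) for v in left) + \
                 sum(v != majority(right) for v in right)
        if best is None or errors < best[1]:
            best = (cut, errors)
    return best

x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 1, 1, 1]
print(best_break(x, y))        # (3.5, 0): one break separates the classes
```

A full classification tree would apply this same search recursively within each resulting partition until a stopping rule is met.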
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Berk, R. (2019). A Conceptual Introduction to Classification and Forecasting. In: Machine Learning Risk Assessments in Criminal Justice Settings. Springer, Cham. https://doi.org/10.1007/978-3-030-02272-3_3
Print ISBN: 978-3-030-02271-6
Online ISBN: 978-3-030-02272-3