Abstract
We have called this chapter “fine-tuning your model” because it covers a variety of concepts and methods that share a common goal: getting more out of your model. Section 6.1 revisits the topic of model quality, which we first discussed in Chapter 3 (Section 3.2.3, to be precise). Why revisit the topic here? Because a model’s quality depends on one’s point of view. In Section 3.2.3, we argued that a model is “good” if it explains a large portion of the uncertainty in the response variable or, in other words, if it explains many of the patterns observed in the data. However, since observed data are necessarily data from the past, our assessment of a model’s quality has so far been retrospective. But what about a model’s ability to predict the future? A model that explains the past very well does not necessarily predict the future equally well. In Section 6.1, we therefore discuss the shortcomings of model fit statistics such as R-squared, and we change our point of view to discuss ideas that can help us assess a model’s ability to “look into the future.”
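The contrast between explaining the past and predicting the future can be made concrete with a small sketch. The example below is not taken from the chapter: the data are simulated, the 70/30 cut-off is arbitrary, and the helper names `fit_line` and `r_squared` are hypothetical. It fits a straight line to the earlier observations only and then computes R-squared twice, once on the data used for fitting and once on the held-out later observations.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(xs, ys, a, b):
    """1 - SSE/SST, with the mean of ys as the baseline prediction."""
    mean_y = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst


# Simulated data (an assumption for illustration): a linear trend plus noise.
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

# Treat the first 70 observations as "the past" and the last 30 as "the future".
train_x, test_x = xs[:70], xs[70:]
train_y, test_y = ys[:70], ys[70:]

# Fit on the past only, then score on both parts.
a, b = fit_line(train_x, train_y)
r2_train = r_squared(train_x, train_y, a, b)
r2_test = r_squared(test_x, test_y, a, b)

print(f"R-squared on the data used for fitting: {r2_train:.3f}")
print(f"R-squared on the held-out data:         {r2_test:.3f}")
```

The in-sample value always lies between 0 and 1 for a least-squares line with an intercept, whereas the held-out value has no such guarantee and can even be negative, which is one reason a fit statistic alone says little about predictive ability.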
Notes
1. Of course, we would only have this information available after the end of that year. Here, we use it to illustrate the predictive capabilities of our regression model.
2. The test set is also referred to as the validation set. In fact, some data miners use three different splits of the data: a training set for model building; a validation set for checking the model’s performance and optimization of model parameters; and a test set for evaluating the final result. Here, we refer to test set and validation set interchangeably.
3. But it may depend on other factors not shown here, such as the square footage or the number of bedrooms, or even other factors not considered in this data.
4. See, for example, http://en.wikipedia.org/wiki/Decision_tree.
5. If there was no relationship between X and Y, a model would not make any sense.
6. Some of these additional activities may include the close investigation of outliers or extreme events and their potential removal from the data.
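The three-way split mentioned in note 2 can be sketched as follows. This is only an illustration under stated assumptions: the function name `three_way_split`, the 60/20/20 proportions, and the fixed seed are hypothetical choices, not something the chapter prescribes.

```python
import random


def three_way_split(rows, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle a copy of rows and cut it into training, validation, and test parts."""
    if train_frac <= 0 or valid_frac < 0 or train_frac + valid_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]                   # used for model building
    valid = shuffled[n_train:n_train + n_valid]  # used for checking and tuning
    test = shuffled[n_train + n_valid:]          # used once, for the final evaluation
    return train, valid, test


# Usage with 100 hypothetical row indices.
train, valid, test = three_way_split(range(100))
print(len(train), len(valid), len(test))
```

Setting `valid_frac=0` gives the simpler two-way split, in which the held-out part can be called either the test set or the validation set.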
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Jank, W. (2011). Data Modeling IV-Fine-Tuning Your Model. In: Business Analytics for Managers. Use R. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-0406-4_6
DOI: https://doi.org/10.1007/978-1-4614-0406-4_6
Published:
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-0405-7
Online ISBN: 978-1-4614-0406-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)