Overfitting refers to the process of learning specific idiosyncrasies of a training set, such as spurious artifacts or random noise, which results in an over-adaptation to the training set and therefore in a degraded ability to generalize to new, unseen data (Duda et al. 2001). Such an over-adapted or overtrained model is called overfitted.
A central goal in predictive analysis is the identification of a model with good generalization ability for new, unseen data. It is a fundamental tenet that the training and test data originate from the same distribution (stationarity assumption). However, if a model adapts too closely to the idiosyncrasies of the training data, it will not generalize well to new cases. Hence, the model is said to be overfitted to the training set.
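The effect described above can be illustrated with a minimal sketch: fitting polynomials of different degrees to the same small, noisy sample drawn from a simple underlying trend. The data-generating function, noise level, and polynomial degrees below are illustrative assumptions, not taken from the source; the point is only that the high-capacity model achieves a lower training error yet a higher error on fresh data from the same distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical toy data: a linear trend plus Gaussian noise.
    x = rng.uniform(-1, 1, n)
    y = 2.0 * x + rng.normal(scale=0.3, size=n)
    return x, y

# Small training set; a separate test set from the same distribution
# (the stationarity assumption mentioned above).
x_train, y_train = make_data(15)
x_test, y_test = make_data(200)

def mse(coeffs, x, y):
    # Mean squared error of a fitted polynomial on (x, y).
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Degree 1 matches the true trend; degree 12 has enough free
# parameters to chase the noise in the 15 training points.
simple = np.polyfit(x_train, y_train, 1)
complex_ = np.polyfit(x_train, y_train, 12)

print("simple :", mse(simple, x_train, y_train), mse(simple, x_test, y_test))
print("complex:", mse(complex_, x_train, y_train), mse(complex_, x_test, y_test))
```

The complex model's training error is lower (it can reproduce the noise), but its test error is larger than that of the simple model: the gap between training and test performance is precisely the over-adaptation to the training set that defines overfitting.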