Forecasting product sales with a stochastic Bass model
Given the Bass model and data on previous sales, a point estimate of future sales can be made for the purpose of stock management. To obtain information about the accuracy of that estimate, a confidence interval can be of use. In this study such an interval is constructed from a Bass model extended with a noise term. The size of the noise is assumed to be proportional to the yearly sales. It is also assumed that the deviation from the deterministic solution is sufficiently small to justify a small-noise approximation. The resulting perturbation takes the form of a time-dependent Ornstein–Uhlenbeck process. For the variance of this perturbation an exact expression can be given, which is needed to obtain the confidence intervals.
Keywords: Bass model; Ornstein–Uhlenbeck process; Sensitivity of parameter to data; Confidence domain
The initial purchase of a new product is often explained in terms of innovators, who buy the product on their own initiative without being influenced by other buyers, and imitators, who buy the product because others have done so. Together they form the total number of buyers, the saturation level of a market. The Bass diffusion model yields an estimate of these metrics on the basis of sales data. The model consists of a differential equation that relates current sales growth to past accumulated sales levels. This modelling approach has become extremely popular, and not only within the context of new-product diffusion patterns [2, 3, 4, 5, 6, 7].
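As a point of reference, the deterministic model can be written out explicitly. The sketch below assumes the standard Bass formulation dN/dt = (p + q N/m)(m - N) with N(0) = 0, for which a closed-form solution is known. The symbols p (innovation coefficient), q (imitation coefficient) and m (saturation level) follow common notation and are not defined in this excerpt; the parameter values are hypothetical and chosen only for illustration.

```python
import math


def bass_cumulative(t, p, q, m):
    """Cumulative sales N(t) solving dN/dt = (p + q*N/m) * (m - N), N(0) = 0."""
    e = math.exp(-(p + q) * t)
    return m * (1.0 - e) / (1.0 + (q / p) * e)


def bass_rate(t, p, q, m):
    """Sales rate dN/dt evaluated on the closed-form solution."""
    n = bass_cumulative(t, p, q, m)
    return (p + q * n / m) * (m - n)


def peak_time(p, q):
    """Time at which the sales rate is largest (meaningful for q > p)."""
    return math.log(q / p) / (p + q)


# Hypothetical parameter values, not estimates from the paper's data set.
p, q, m = 0.01, 0.4, 1000.0
t_star = peak_time(p, q)
n_star = bass_cumulative(t_star, p, q, m)
```

With these illustrative values the sales rate peaks at roughly t = 9, where cumulative sales equal m(1/2 - p/(2q)), slightly below half the saturation level.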
Firms that introduce products and have a key interest in reliable market forecasts may face the following fundamental challenges when using the Bass diffusion modelling approach. First, to obtain reliable forecasts, one needs accurate measurements of the innovation and imitation effects and of the saturation level. It has been recognized, however, that the information content of the market data, from which one has to derive the accuracy, may not only change over time but may also differ among these metrics of interest. Because the distribution of the information content is reflected in the sensitivity of the market metrics with respect to the available data, its derivation is an empirical issue for which a proper statistical approach is needed. Second, the solution of the differential equation of the Bass diffusion model yields point estimates of future sales. From a point estimate alone, however, one cannot draw conclusions about its accuracy. A stochastic approach that facilitates the construction of confidence intervals for the estimated future sales is therefore warranted.
This study proceeds in the following steps. We first consider the case in which sales data of all years are known and fit the Bass diffusion model to these data. Next, we investigate how the information content of the data about the metrics of interest develops over time. Based on these results, we identify the minimal sample size that is needed to obtain reliable forecasts. For that purpose the sample may be compared with samples of similar product introductions in the past. In our approach we only use information from sales of the product under consideration. Nor do we expand the deterministic model with additional parameters, although doing so may give rise to interesting results [3, 4, 10, 11].
We then introduce a stochastic extension of the Bass model. Assuming that stochastic perturbations, in the form of white noise, are small, we consider solutions of this extended model that stay in the neighbourhood of the solution of the deterministic Bass diffusion model. Next, we compute the variance of the stochastic component of the solution, which in turn is used to construct the upper and lower boundaries of the confidence interval for the yearly sales. Finally, we apply a forecasting approach based upon the previous methodological steps, from which confidence intervals for the point forecasts are derived. For ease of exposition, we make use of a single dataset in each of these consecutive steps. Our approach is related to that of Boswijk and Franses, who use a similar method in their analysis of the validity of the parameter estimation procedure.
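The stochastic extension described above can be explored numerically. The sketch below is not the paper's derivation: it assumes a noise term of the form eps * f(N) dW added to the standard Bass drift f(N) = (p + q N/m)(m - N), integrates it with the Euler-Maruyama scheme, and replaces the exact variance expression by pointwise empirical quantiles over simulated paths. The noise form, the symbol eps and all parameter values are assumptions made for illustration.

```python
import math
import random


def drift(n, p, q, m):
    """Deterministic Bass sales rate f(N) = (p + q*N/m) * (m - N)."""
    return (p + q * n / m) * (m - n)


def simulate_path(p, q, m, eps, t_end, dt, rng):
    """One Euler-Maruyama path of dN = f(N) dt + eps * f(N) dW, N(0) = 0.

    The multiplicative noise eps * f(N) is an assumed form in which the
    noise scales with the current sales rate.
    """
    steps = int(round(t_end / dt))
    n = 0.0
    path = [n]
    for _ in range(steps):
        f = drift(n, p, q, m)
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over dt
        n = n + f * dt + eps * f * dw
        path.append(n)
    return path


def empirical_band(paths, lower=0.025, upper=0.975):
    """Pointwise empirical quantiles across the simulated paths."""
    band = []
    for values in zip(*paths):
        ordered = sorted(values)
        last = len(ordered) - 1
        band.append((ordered[int(lower * last)], ordered[int(upper * last)]))
    return band


# Hypothetical parameter values; the seed only makes the run repeatable.
rng = random.Random(0)
p, q, m, eps = 0.01, 0.4, 1000.0, 0.05
paths = [simulate_path(p, q, m, eps, 30.0, 0.05, rng) for _ in range(200)]
band = empirical_band(paths)
deterministic = simulate_path(p, q, m, 0.0, 30.0, 0.05, rng)
```

Setting eps to zero recovers an Euler approximation of the deterministic solution, which gives a simple check that the simulated band surrounds the curve it is meant to perturb.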
3 Estimating the parameters of the model
4 Information content of the data
5 Deriving a confidence domain
7 Discussion and conclusions
In this study we introduce confidence intervals in a way that differs from the commonly used definition in statistics, where a point estimate of a quantity is obtained as the mean of a sample of that quantity. Then a 95% confidence interval contains about 95% of the set of data points. Here the point estimate of a quantity is obtained from a model that is fitted to data of other quantities of the system. Consequently, if the model is not perfect, the share of data points within the 95% confidence interval may deviate from 95% by more than expected. We note that the confidence interval obtained from the data for \(t \leq 18.5\) does not differ much from the confidence interval obtained from the complete data set (\(t \leq 30.5\)). This is because, once the inclination point is reached, information has been gained over the full range from low to high yearly sales: after the inclination point the same route is taken in the opposite direction.
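The share of observations that fall inside a model-based interval can be checked directly. The sketch below assumes a symmetric interval of the form forecast ± z·sigma with z = 1.96, which presumes an approximately Gaussian perturbation; the function name and all numbers are hypothetical and are not taken from the paper's data.

```python
def coverage(observed, forecast, sigma, z=1.96):
    """Fraction of observations inside forecast +/- z * sigma."""
    inside = 0
    for y, f, s in zip(observed, forecast, sigma):
        if f - z * s <= y <= f + z * s:
            inside += 1
    return inside / len(observed)


# Made-up yearly sales, model forecasts and standard deviations.
observed = [12.0, 30.0, 55.0, 61.0, 40.0]
forecast = [10.0, 28.0, 50.0, 60.0, 48.0]
sigma = [1.5, 2.0, 2.0, 2.5, 3.0]
share = coverage(observed, forecast, sigma)  # 3 of 5 points lie inside
```

A share well below the nominal level, as in this contrived example, points to model misfit rather than to noise alone.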
We remark that with the Bass model reliable estimates of future sales can be made near the inclination point. Near this point the curve of cumulative sales is close to a straight line. A linear forecast based on local data may therefore result in a large overestimation of required stocks. The Bass model reduces this risk quite well.
With respect to our empirical analysis, we note that both at the beginning and at the end, when yearly sales are small, the model does not agree with the market observations. Apparently, the model does not cover buying behaviour that differs from the assumed innovative and imitative traits. Promotion of products or special discounts may be responsible for such deviant behaviour.
The parameter estimation has been carried out with the numerical solution of the Bass equation. Since this first-order differential equation has a smooth solution, a high accuracy can be achieved. The optimal values of the parameters were found using the Levenberg-Marquardt method (MATLAB). When the information matrix is ill-conditioned, this iterative process does not converge, so that no estimate can be made. In this way it is found at which stage of the process sufficient data from the past are available to make a dependable forecast. Note that with linear regression highly uncertain parameter values are found for data sets that cover only a small initial time interval. In the example of steam iron sales, the point estimate could be made just before arriving at the inclination point, and a confidence interval came within reach just after this point. If the behaviour of the buyers does not meet the requirements of the Bass model, then the computed confidence interval should be discarded. This applies to episodes with low yearly sales at the beginning and at the end.
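The fitting step can be illustrated with a small, self-contained Levenberg-Marquardt loop. This is a minimal sketch and not the MATLAB routine used in the study: it assumes the closed-form Bass solution in place of a numerical ODE solver, models yearly sales as differences of cumulative sales, uses a forward-difference Jacobian, and fits synthetic noise-free data generated from hypothetical parameters. All function names are illustrative.

```python
import math


def bass_cumulative(t, p, q, m):
    e = math.exp(-(p + q) * t)
    return m * (1.0 - e) / (1.0 + (q / p) * e)


def yearly_sales(theta, year):
    """Model sales during [year, year + 1) as a difference of cumulative sales."""
    p, q, m = theta
    return bass_cumulative(year + 1, p, q, m) - bass_cumulative(year, p, q, m)


def residuals(theta, years, sales):
    return [yearly_sales(theta, y) - s for y, s in zip(years, sales)]


def jacobian(theta, years, rel_step=1e-6):
    """Forward-difference Jacobian of the model output w.r.t. (p, q, m)."""
    base = [yearly_sales(theta, y) for y in years]
    cols = []
    for j in range(3):
        bumped = list(theta)
        h = rel_step * abs(theta[j])
        bumped[j] += h
        cols.append([(yearly_sales(bumped, y) - b) / h for y, b in zip(years, base)])
    return [[cols[j][i] for j in range(3)] for i in range(len(years))]


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(3):
        piv = max(range(c, 3), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, 3):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, 4):
                aug[r][k] -= factor * aug[c][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (aug[r][3] - sum(aug[r][k] * x[k] for k in range(r + 1, 3))) / aug[r][r]
    return x


def levenberg_marquardt(theta0, years, sales, iterations=200):
    """Damped Gauss-Newton steps; a step is accepted only if it lowers the SSE."""
    theta, lam = list(theta0), 1e-3
    sse = sum(r * r for r in residuals(theta, years, sales))
    for _ in range(iterations):
        jac = jacobian(theta, years)
        res = residuals(theta, years, sales)
        jtj = [[sum(row[a] * row[b] for row in jac) for b in range(3)] for a in range(3)]
        jtr = [sum(row[a] * r for row, r in zip(jac, res)) for a in range(3)]
        # Marquardt damping: scale the diagonal of J^T J by (1 + lam).
        damped = [[jtj[a][b] + (lam * jtj[a][a] if a == b else 0.0) for b in range(3)]
                  for a in range(3)]
        step = solve3(damped, [-g for g in jtr])
        trial = [v + s for v, s in zip(theta, step)]
        if min(trial) <= 0.0:  # keep p, q and m positive
            lam *= 10.0
            continue
        trial_sse = sum(r * r for r in residuals(trial, years, sales))
        if trial_sse < sse:
            theta, sse, lam = trial, trial_sse, lam / 10.0
        else:
            lam *= 10.0
    return theta, sse


# Synthetic data from hypothetical "true" parameters, then a fit from a rough start.
true_theta = (0.01, 0.4, 1000.0)
years = list(range(25))
sales = [yearly_sales(true_theta, y) for y in years]
start = [0.02, 0.3, 800.0]
start_sse = sum(r * r for r in residuals(start, years, sales))
fit, fit_sse = levenberg_marquardt(start, years, sales)
```

Repeating the fit on a truncated list of years (for example only the first few) is one way to probe at what stage the data carry enough information for the iteration to settle on stable parameter values.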
By guaranteeing the exclusion of bad estimates and providing expected variances, our way of estimating parameters brings accuracy and precision to the new-product sales forecasting process. This benefits the investment decisions of companies regarding the introduction of new products. Our methodological framework may also be of use in optimal stock management.
Availability of data and materials
Use has been made of data presented in reference .
MK started the study and delivered the main contribution to the introduction and discussion sections. JG carried out the sensitivity analysis for the parameters as they depend on the data and elaborated the different mathematical methods needed for constructing the confidence intervals. Both authors read and approved the final manuscript.
The authors declare that they have no competing interests.
3. Mahajan V, Muller E, Wind Y, editors. New-product diffusion models. Berlin: Springer; 2000.
12. Lilien GL, Rangaswamy A, Van den Bulte C. Diffusion models: managerial applications and software. In: Mahajan V, Muller E, Wind Y, editors. New-product diffusion models. New York: Springer; 2000. p. 295–310.
16. Ott RL, Longnecker M. An introduction to statistical methods and data analysis. 6th ed. Brooks/Cole; 2010.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.