Support Vector Machines for Predicting Customer Activity and Future Best Customers in Non-Contractual Settings
The Pareto/NBD and the BG/NBD models are named after their underlying distributional assumptions, a choice that emphasizes their strong theoretical foundation. Yet the previous chapter showed that they do not outperform simple management heuristics. As early as the late 1960s, Tukey (1969) had argued that putting too much emphasis on the mathematical theory of statistics does not help in solving real-world problems. His mantra was that statistical work is detective work and that one should let the data speak for themselves. The branch of exploratory data analysis emerged from this view, but mathematical statisticians dismissed it for a long time. Many of them maintained that proper statistical analysis must be based on hypotheses and distributional assumptions. Their argument was that looking at the data before formulating a scientific hypothesis would bias the hypothesis towards what the data might show. The term data mining was typically used with a derogatory connotation. The dispute culminated in the reproach of improper scientific practice: torturing the data until they confess everything.
Keywords: Support Vector Machine, Unsupervised Learning, Cost Parameter, Simple Heuristic, Empirical Risk