Abstract
Most of the KDD literature focuses on analyzing the effectiveness of KDD techniques, e.g. on reducing the classification error rate in classification tasks. Efficiency issues are usually considered of secondary importance, if they are considered at all. In contrast, we focus on the cost-effectiveness of KDD techniques, i.e. on the trade-off between effectiveness (reduction of error rate) and efficiency (reduction of processing time). In particular, we show that a gain in efficiency can be transformed into a gain in effectiveness, and that this principle can be used to evaluate the cost-effectiveness of KDD systems in a fair manner. We discuss the application of this general principle to the evaluation of two broad kinds of KDD techniques, namely classification algorithms and attribute selection algorithms.
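The trade-off the abstract describes can be illustrated with a minimal sketch. Note that the scoring function, the linear trade-off rate `lam`, and the measurements below are all hypothetical illustrations, not the paper's actual evaluation procedure: the idea is only that a technique's error rate and its processing time can be folded into a single comparable score, so that time saved (e.g. by attribute selection) can be weighed against a small loss in accuracy.

```python
# Illustrative sketch: comparing two KDD techniques by cost-effectiveness.
# `lam` is a hypothetical trade-off rate (error-rate points one is willing
# to trade per second of processing time); it is NOT a formula from the paper.

def cost_effectiveness(error_rate, time_sec, lam=0.001):
    """Lower is better: error rate plus a linear time penalty at rate `lam`."""
    return error_rate + lam * time_sec

# Hypothetical measurements: (error rate, processing time in seconds).
full_attributes = (0.12, 300.0)      # classifier on all attributes: slower
selected_attributes = (0.13, 60.0)   # after attribute selection: faster

score_full = cost_effectiveness(*full_attributes)       # 0.12 + 0.30 = 0.42
score_sel = cost_effectiveness(*selected_attributes)    # 0.13 + 0.06 = 0.19

# Under this trade-off rate the faster technique is more cost-effective
# despite its slightly higher error rate.
print(score_full, score_sel)
```

The choice of `lam` encodes how much accuracy one is willing to give up per unit of saved time; varying it makes explicit the trade-off that purely effectiveness-based evaluations leave implicit.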
Financially supported by the Brazilian government's CNPq, grant No. 200384/93-7.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
Cite this paper
Freitas, A.A. (1997). The principle of transformation between efficiency and effectiveness: Towards a fair evaluation of the cost-effectiveness of KDD techniques. In: Komorowski, J., Zytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1997. Lecture Notes in Computer Science, vol 1263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63223-9_128
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63223-8
Online ISBN: 978-3-540-69236-2