Data Mining and Statistics

A Systems Point of View
  • A. Siebes
Part of the International Centre for Mechanical Sciences book series (CISM, volume 408)

Abstract

Moore’s law has never been so obvious as it is now. New PCs are equipped with hundreds of megabytes of main memory, many gigabytes of secondary storage, and processors approaching a gigahertz clock speed. Fortunately, the need for such resources is growing just as fast, if not faster.

Keywords

Data Mining; Bayesian Network; Association Rule; Classification Tree; Quality Function
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Wien 2000

Authors and Affiliations

  • A. Siebes
    CWI, Amsterdam, The Netherlands
