Abstract
This paper provides a survey of various data mining techniques for advanced database applications. These include association rule generation, clustering and classification. With the recent increase in large online repositories of information, such techniques have great importance. The focus is on high dimensional data spaces with large volumes of data. The paper discusses past research on the topic and also studies the corresponding algorithms and applications.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Aggarwal, C.C., Yu, P.S. (1999). Data Mining Techniques for Associations, Clustering and Classification. In: Zhong, N., Zhou, L. (eds) Methodologies for Knowledge Discovery and Data Mining. PAKDD 1999. Lecture Notes in Computer Science, vol 1574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48912-6_4
DOI: https://doi.org/10.1007/3-540-48912-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65866-5
Online ISBN: 978-3-540-48912-2
eBook Packages: Springer Book Archive