Implementing Data Mining in a DBMS
We have developed a clustering algorithm called CLIMIS to demonstrate the advantages of implementing a data mining algorithm inside a database management system (DBMS). CLIMIS executes within the DBMS, clusters data held there, and stores the resulting clusters there as well. Because CLIMIS is tightly coupled with the database environment, it scales better to large databases. This is achieved through an index-like structure that uses the database to overcome memory limitations. We further improve the performance of the algorithm with a technique called adaptive clustering, which controls the size of the clusters.
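The general idea of keeping cluster state in the database rather than in memory can be illustrated with a minimal sketch. This is not CLIMIS itself: the abstract does not give the algorithm's details, so the table layout, the function names (`make_store`, `one_pass`, `adaptive_cluster`), the two-dimensional points, and the threshold-doubling rule are all assumptions made for illustration. The sketch uses Python's standard `sqlite3` module as a stand-in DBMS, stores cluster summaries in a table, reads points in bounded batches, and widens a distance threshold until the number of clusters fits a limit.

```python
import sqlite3


def make_store(points):
    """Create an in-memory database holding the points and an empty cluster table.

    Hypothetical schema: each cluster is summarised by a count and coordinate sums.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE points (id INTEGER PRIMARY KEY, x REAL, y REAL, cluster_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE clusters (id INTEGER PRIMARY KEY, n INTEGER, sum_x REAL, sum_y REAL)"
    )
    conn.executemany("INSERT INTO points (x, y) VALUES (?, ?)", points)
    return conn


def one_pass(conn, threshold, batch_size=1000):
    """Assign every point to its nearest cluster centroid, or open a new cluster.

    Only one batch of points is held in Python memory at a time; all cluster
    summaries live in the database. Returns the number of clusters created.
    """
    conn.execute("DELETE FROM clusters")
    last_id = 0
    while True:
        batch = conn.execute(
            "SELECT id, x, y FROM points WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not batch:
            break
        for pid, x, y in batch:
            # Let the database find the nearest centroid (squared Euclidean distance).
            row = conn.execute(
                "SELECT id, (sum_x / n - ?) * (sum_x / n - ?)"
                " + (sum_y / n - ?) * (sum_y / n - ?) AS d2"
                " FROM clusters ORDER BY d2 LIMIT 1",
                (x, x, y, y),
            ).fetchone()
            if row is not None and row[1] <= threshold * threshold:
                cid = row[0]
                conn.execute(
                    "UPDATE clusters SET n = n + 1, sum_x = sum_x + ?, sum_y = sum_y + ?"
                    " WHERE id = ?",
                    (x, y, cid),
                )
            else:
                cid = conn.execute(
                    "INSERT INTO clusters (n, sum_x, sum_y) VALUES (1, ?, ?)", (x, y)
                ).lastrowid
            conn.execute("UPDATE points SET cluster_id = ? WHERE id = ?", (cid, pid))
        last_id = batch[-1][0]
    return conn.execute("SELECT COUNT(*) FROM clusters").fetchone()[0]


def adaptive_cluster(conn, threshold, max_clusters):
    """Assumed adaptation rule: double the threshold until the cluster count fits."""
    if max_clusters < 1:
        raise ValueError("max_clusters must be at least 1")
    while True:
        count = one_pass(conn, threshold)
        if count <= max_clusters:
            return threshold, count
        threshold *= 2
```

A call such as `adaptive_cluster(make_store(points), threshold=0.5, max_clusters=10)` returns the final threshold and the cluster count, with the assignments left in the `points.cluster_id` column. Recomputing from scratch after each threshold change keeps the sketch short; a real in-database implementation would more plausibly merge existing cluster summaries instead.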
Keywords: Data Mining, Cluster Algorithm, Leaf Node, Large Database, Database Management System