Abstract
Data warehouses are a challenging field of application for data mining tasks such as clustering. Usually, updates are collected and applied to the data warehouse periodically in a batch mode. As a consequence, all mined patterns discovered in the data warehouse (e.g. clustering structures) have to be updated as well. In this paper, we present a method for incrementally updating the clustering structure computed by the hierarchical clustering algorithm OPTICS. We determine the parts of the cluster ordering that are affected by update operations and develop efficient algorithms that incrementally update an existing cluster ordering. A performance evaluation of incremental OPTICS on synthetic datasets as well as on a real-world dataset demonstrates that incremental OPTICS achieves significant speed-ups over OPTICS for update operations.
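The abstract does not spell out the update algorithms, so the following is only a minimal sketch of why an update can be confined to part of the data; it is not the authors' method. It uses the standard OPTICS notion of core-distance (the distance to the MinPts-th nearest neighbour within radius eps, with a point counted as its own neighbour) and relies on one observation: inserting a point can only change the core-distance of existing points that lie within eps of it. The function names, the brute-force neighbour search, and the use of `math.inf` for an undefined core-distance are assumptions made for illustration, and the sketch does not update the cluster ordering itself.

```python
import math


def core_distance(points, i, eps, min_pts):
    """Core-distance of points[i]: distance to its min_pts-th nearest
    neighbour within eps (the point counts as its own neighbour).
    Returns math.inf when the eps-neighbourhood holds fewer than min_pts points."""
    within = sorted(
        d for d in (math.dist(points[i], q) for q in points) if d <= eps
    )
    if len(within) < min_pts:
        return math.inf
    return within[min_pts - 1]


def possibly_affected_by_insert(points, new_point, eps):
    """Indices of existing points whose eps-neighbourhood gains new_point.
    Only these points can have a changed core-distance after the insertion."""
    return [i for i, q in enumerate(points) if math.dist(q, new_point) <= eps]


def core_distances_after_insert(points, old_core, new_point, eps, min_pts):
    """Hypothetical helper: recompute core-distances only where needed."""
    updated_points = points + [new_point]
    new_core = list(old_core) + [math.inf]  # placeholder for the new point
    to_refresh = possibly_affected_by_insert(points, new_point, eps)
    to_refresh.append(len(points))  # the inserted point itself
    for i in to_refresh:
        new_core[i] = core_distance(updated_points, i, eps, min_pts)
    return updated_points, new_core


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (10, 0)]
    eps, min_pts = 1.5, 2
    core = [core_distance(pts, i, eps, min_pts) for i in range(len(pts))]
    pts2, core2 = core_distances_after_insert(pts, core, (10.5, 0), eps, min_pts)
    print(core2)
```

In this toy run only the isolated point at (10, 0) and the inserted point are recomputed; the three points near the origin keep their stored core-distances, which is the kind of locality an incremental update can exploit.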
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kriegel, HP., Kröger, P., Gotlibovich, I. (2003). Incremental OPTICS: Efficient Computation of Updates in a Hierarchical Cluster Ordering. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2003. Lecture Notes in Computer Science, vol 2737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45228-7_23
DOI: https://doi.org/10.1007/978-3-540-45228-7_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40807-9
Online ISBN: 978-3-540-45228-7
eBook Packages: Springer Book Archive