Pattern-Oriented Hierarchical Clustering

Morzy, Tadeusz; Wojciechowski, Marek; Zakrzewicz, Maciej

doi:10.1007/3-540-48252-0_14

Conference paper
First Online: 01 January 2001

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1691)

Abstract

Clustering is a data mining method that discovers interesting data distributions in very large databases. Its applications include customer segmentation, catalog design, store layout, and stock market segmentation. In this paper, we consider the problem of discovering similarity-based clusters in a large database of event sequences. We introduce a hierarchical algorithm that uses sequential patterns found in the database to efficiently generate both the clustering model and the data clusters. The algorithm iteratively merges smaller, similar clusters into bigger ones until the requested number of clusters is reached. Because no well-defined metric space is available, we propose a similarity measure for use in cluster merging. The advantage of the proposed measure is that evaluating inter-cluster similarities requires no additional access to the source database.
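
The merging loop described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: each initial cluster corresponds to one sequential pattern together with the IDs of the sequences that contain it, and inter-cluster similarity is taken to be the Jaccard coefficient of those ID sets. The names `jaccard`, `merge_clusters`, and `pattern_support` are hypothetical, and the measure actually proposed in the paper may differ.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_clusters(pattern_support, k):
    """Merge pattern-based clusters agglomeratively until k remain.

    pattern_support maps each pattern (assumed here to be a tuple of
    events) to the set of IDs of the sequences that contain it.
    """
    # Start with one cluster per pattern: (patterns, supporting sequence IDs).
    clusters = [({p}, set(ids)) for p, ids in pattern_support.items()]
    while len(clusters) > max(k, 1):
        # Choose the most similar pair using only the precomputed ID sets,
        # so the source sequences are never read again (assumed measure).
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: jaccard(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        merged = (clusters[i][0] | clusters[j][0],
                  clusters[i][1] | clusters[j][1])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical input: four patterns and the sequences supporting each.
support = {
    ("a", "b"): {1, 2, 3},
    ("a", "c"): {2, 3},
    ("d", "e"): {4, 5},
    ("d", "f"): {5},
}
result = merge_clusters(support, 2)
```

In this toy input, the two patterns starting with "a" share most of their supporting sequences and are merged first, followed by the two patterns starting with "d". The sketch rescans all cluster pairs on every iteration, which is quadratic per step; it is meant to show the control flow, not an efficient implementation.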

This work was partially supported by the grant no. KBN 43-1309 from the State Committee for Scientific Research (KBN), Poland.

Author information

Authors and Affiliations

Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 3a, 60-965, Poznan, Poland
Tadeusz Morzy, Marek Wojciechowski & Maciej Zakrzewicz

Editor information

Editors and Affiliations

Faculty of Economics, Business Administration and Informatics, University of Klagenfurt, Universitätsstr. 65-67, 9020, Klagenfurt, Austria
Johann Eder
Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova 17, 2000, Maribor, Slovenia
Ivan Rozman & Tatjana Welzer

About this paper

Cite this paper

Morzy, T., Wojciechowski, M., Zakrzewicz, M. (1999). Pattern-Oriented Hierarchical Clustering. In: Eder, J., Rozman, I., Welzer, T. (eds) Advances in Databases and Information Systems. ADBIS 1999. Lecture Notes in Computer Science, vol 1691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48252-0_14

DOI: https://doi.org/10.1007/3-540-48252-0_14
Published: 29 March 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66485-7
Online ISBN: 978-3-540-48252-9
eBook Packages: Springer Book Archive
