A Two-Step Iterative Procedure for Clustering of Binary Sequences

Palumbo, Francesco; D’Enza, A. Iodice

doi:10.1007/978-3-642-03739-9_4

Francesco Palumbo & A. Iodice D’Enza

Conference paper
First Online: 25 November 2009

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

Association Rules (AR) are a well-known data mining tool for detecting patterns of association in databases. The major drawback of knowledge extraction through AR mining is the huge number of rules produced when dealing with large amounts of data. Several proposals in the literature tackle this problem with different approaches. Within this framework, the general aim of the present proposal is to identify patterns of association in large binary data sets. We propose an iterative procedure that combines clustering and dimensionality reduction techniques: each iteration involves a quantification of the starting binary attributes, followed by an agglomerative algorithm applied to the resulting quantitative variables. The objective is to find a quantification that emphasizes the presence of groups of co-occurring attributes in the data.
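
The alternation described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given here. It assumes that the quantification step is a projection of the centred binary matrix onto the leading singular vectors of the current group centroids, and it swaps in a nearest-mean (k-means-style) reassignment for the clustering step. The function name `two_step_clustering` and all of its parameters are hypothetical.

```python
import numpy as np

def two_step_clustering(Z, n_groups=2, n_dims=1, n_iter=20, seed=0):
    """Alternate quantification and clustering on a binary matrix Z.

    Rows of Z are binary sequences, columns are attributes.
    Illustrative sketch only; both steps are assumptions, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    n, p = Z.shape
    Zc = Z - Z.mean(axis=0)                      # centre each binary attribute
    labels = rng.integers(0, n_groups, size=n)   # random starting partition
    scores = np.zeros((n, n_dims))
    for _ in range(n_iter):
        # Step 1 (quantification, assumed): take the directions that best
        # separate the current group centroids and score every row on them.
        centroids = np.vstack([
            Zc[labels == g].mean(axis=0) if np.any(labels == g) else np.zeros(p)
            for g in range(n_groups)
        ])
        _, _, Vt = np.linalg.svd(centroids, full_matrices=False)
        scores = Zc @ Vt[:n_dims].T
        # Step 2 (clustering, assumed): move each row to the nearest group
        # mean in the quantified space; an empty group is reseeded at random.
        means = np.vstack([
            scores[labels == g].mean(axis=0) if np.any(labels == g)
            else scores[rng.integers(n)]
            for g in range(n_groups)
        ])
        dist = ((scores[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):   # partition is stable: stop
            break
        labels = new_labels
    return labels, scores

# Toy data: two blocks of co-occurring attributes (made up for illustration).
Z = np.array([[1, 1, 1, 0, 0, 0]] * 20 + [[0, 0, 0, 1, 1, 1]] * 20)
labels, scores = two_step_clustering(Z)
```

On this toy matrix, rows sharing the first three attributes should end up in one group and rows sharing the last three in the other. Which group receives which numeric label depends on the random start.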

Author information

Authors and Affiliations

Dipartimento di Scienze Economiche e Finanziarie, Università di Cassino, Rome, Italy
A. Iodice D’Enza

Corresponding author

Correspondence to A. Iodice D’Enza.

Editor information

Editors and Affiliations

Fac. Economia, Università Macerata, Via Crescimbeni 20, Macerata, 62100, Italy
Francesco Palumbo
Dipto. Matematica e Statistica, Università Federico II di Napoli, Via Cinthia (Monte S. Angelo), Napoli, 80126, Italy
Carlo Natale Lauro
Depto. Economía y Empresa, Universitat Pompeu Fabra, Ramon Trias Fargas 25-27, Barcelona, 08005, Spain
Michael J. Greenacre

Cite this paper

Palumbo, F., D’Enza, A.I. (2010). A Two-Step Iterative Procedure for Clustering of Binary Sequences. In: Palumbo, F., Lauro, C., Greenacre, M. (eds) Data Analysis and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03739-9_4

DOI: https://doi.org/10.1007/978-3-642-03739-9_4
Published: 25 November 2009
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03738-2
Online ISBN: 978-3-642-03739-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
