Abstract
A banded pattern in high-dimensional “zero-one” data is one in which all the dimensions can be organized so that the “ones” are arranged along the leading diagonal across the dimensions. Rearranging zero-one data to expose this bandedness helps identify hidden information and improves the operation of many data mining algorithms that work with zero-one data. This paper presents an effective N-dimensional (ND) banding algorithm, the ND-BPM algorithm, together with a full evaluation of its operation. To illustrate the utility of the banded pattern concept, a case study using the GB Cattle movement database is also presented.
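To make the notion of bandedness concrete, the following is a minimal two-dimensional sketch. It is not the ND-BPM algorithm, whose details are not given in this abstract. It uses the well-known barycenter heuristic instead: rows and columns are alternately sorted by the mean position of their ones until the ordering stops changing. All function names (`band`, `diagonal_spread`, and so on) are hypothetical and chosen only for illustration.

```python
def row_barycenters(m):
    """Mean column index of the ones in each row; rows with no ones sort last."""
    keys = []
    for row in m:
        cols = [j for j, v in enumerate(row) if v]
        keys.append(sum(cols) / len(cols) if cols else float("inf"))
    return keys


def transpose(m):
    return [list(col) for col in zip(*m)]


def reorder_rows(m):
    """Sort rows by barycenter; return the new matrix and the permutation used."""
    keys = row_barycenters(m)
    order = sorted(range(len(m)), key=lambda i: keys[i])  # stable sort
    return [m[i] for i in order], order


def band(m, max_iter=20):
    """Alternate row and column reordering until neither ordering changes.

    Returns the rearranged matrix plus row_perm and col_perm, where
    result[i][j] == original[row_perm[i]][col_perm[j]].
    Illustrative 2D heuristic only (an assumption, not the paper's method).
    """
    row_perm = list(range(len(m)))
    col_perm = list(range(len(m[0])))
    for _ in range(max_iter):
        m, r = reorder_rows(m)
        row_perm = [row_perm[i] for i in r]
        mt, c = reorder_rows(transpose(m))  # reordering columns
        m = transpose(mt)
        col_perm = [col_perm[j] for j in c]
        if r == sorted(r) and c == sorted(c):  # both identity: fixed point
            break
    return m, row_perm, col_perm


def diagonal_spread(m):
    """Mean normalised distance of the ones from the leading diagonal.

    Lower values mean the ones sit closer to the diagonal.
    """
    n_rows, n_cols = len(m), len(m[0])
    dists = [
        abs(i / max(n_rows - 1, 1) - j / max(n_cols - 1, 1))
        for i, row in enumerate(m)
        for j, v in enumerate(row)
        if v
    ]
    return sum(dists) / len(dists) if dists else 0.0


# A banded 4x4 matrix whose rows have been shuffled.
scrambled = [
    [0, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
]
banded, row_perm, col_perm = band(scrambled)
for row in banded:
    print(row)
print("spread before:", diagonal_spread(scrambled))
print("spread after: ", diagonal_spread(banded))
```

The heuristic is a local search and carries no guarantee of finding the best banding for arbitrary data; it is shown only because it is short enough to read at a glance. Extending the idea beyond two dimensions requires a per-dimension ordering criterion, which is the problem the paper addresses.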
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Abdullahi, F.B., Coenen, F., Martin, R. (2014). A Scalable Algorithm for Banded Pattern Mining in Multi-dimensional Zero-One Data. In: Bellatreche, L., Mohania, M.K. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2014. Lecture Notes in Computer Science, vol 8646. Springer, Cham. https://doi.org/10.1007/978-3-319-10160-6_31
DOI: https://doi.org/10.1007/978-3-319-10160-6_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10159-0
Online ISBN: 978-3-319-10160-6
eBook Packages: Computer Science (R0)