Abstract
Co-clustering with augmented matrices (CCAM) [11] is a two-way clustering algorithm that simultaneously considers dyadic data (e.g., two types of objects) and other correlation data (e.g., objects and their attributes). CCAM has been reported to outperform other state-of-the-art algorithms in certain real-world recommendation tasks [12]. However, incorporating multiple sources of correlation data places heavy demands on scalability. In this paper, we show how a parallel co-clustering with augmented matrices (PCCAM) algorithm can be designed on the Map-Reduce framework. Our experiments show that the input format, the number of blocks, and the number of reducers can greatly affect overall performance.
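To make the roles of "blocks" and "reducers" concrete, the following is a minimal, single-process sketch of one step that Map-Reduce co-clustering methods commonly need: aggregating matrix entries per (row cluster, column cluster) pair. It is not the PCCAM algorithm itself, which the abstract does not specify. The function names, the triple-based input format, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from functools import reduce


def split_into_blocks(entries, num_blocks):
    """Partition (row, col, value) triples into num_blocks lists (hypothetical input format)."""
    return [entries[i::num_blocks] for i in range(num_blocks)]


def map_block(block, row_cluster, col_cluster):
    """Mapper: partial sums for one block, keyed by (row cluster, column cluster)."""
    partial = defaultdict(float)
    for r, c, v in block:
        partial[(row_cluster[r], col_cluster[c])] += v
    return partial


def reduce_partials(a, b):
    """Reducer: merge two partial-sum dictionaries without mutating either."""
    merged = defaultdict(float, a)
    for key, v in b.items():
        merged[key] += v
    return merged


def cocluster_sums(entries, row_cluster, col_cluster, num_blocks=2):
    """Run the map step over every block, then fold the partial results together."""
    blocks = split_into_blocks(entries, num_blocks)
    partials = [map_block(b, row_cluster, col_cluster) for b in blocks]
    return dict(reduce(reduce_partials, partials, defaultdict(float)))


# Toy data (assumed): a 3x3 sparse matrix with two row and two column clusters.
entries = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (2, 2, 4.0)]
row_cluster = {0: 0, 1: 0, 2: 1}
col_cluster = {0: 0, 1: 0, 2: 1}
sums = cocluster_sums(entries, row_cluster, col_cluster, num_blocks=2)
```

In a real Map-Reduce deployment each block would be processed by a separate mapper and the merge would be spread across reducers by key. The aggregate does not depend on how the entries are split into blocks, so the block count only affects how the work is distributed.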
References
[1] Banerjee, A., Dhillon, I., Ghosh, J., Merugu, S., Modha, D.: A generalized maximum entropy approach to Bregman co-clustering and matrix approximation. JMLR 8, 1919–1986 (2007)
[2] Bisson, G., Grimal, C.: Co-clustering of multi-view datasets: a parallelizable approach. In: ICDM 2012 (2012)
[3] Deodhar, M., Ghosh, J.: A framework for simultaneous co-clustering and learning from complex data. In: KDD 2007, pp. 250–259 (2007)
[4] Deodhar, M., Jones, C., Ghosh, J.: Parallel simultaneous co-clustering and learning with Map-Reduce. In: IEEE International Conference on Granular Computing, pp. 149–154 (2010)
[5] Dhillon, I.S., Mallela, S., Modha, D.S.: Information theoretic co-clustering. In: KDD 2003, pp. 89–98 (2003)
[6] Forbes, P., Zhu, M.: Content-boosted Matrix Factorization for Recommender Systems: Experiments with Recipe Recommendation. In: RecSys 2011 (2011)
[7] George, T., Merugu, S.: A scalable collaborative filtering framework based on co-clustering. In: ICDM 2005 (2005)
[8] Pacheco, P.S.: Parallel Programming with MPI (1997) ISBN 1-55860-339-5
[9] Papadimitriou, S., Sun, J.: Disco: distributed co-clustering with Map-Reduce. In: ICDM 2008 (2008)
[10] Ramanathan, V., Ma, W., Ravi, V.T., Liu, T., Agrawal, G.: Parallelizing an Information Theoretic Co-clustering Algorithm Using a Cloud Middleware. In: 2010 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 186–193 (2010)
[11] Wu, M.-L., Chang, C.-H., Liu, R.-Z.: Co-clustering with augmented data matrix. In: Cuzzocrea, A., Dayal, U. (eds.) DaWaK 2011. LNCS, vol. 6862, pp. 289–300. Springer, Heidelberg (2011)
[12] Wu, M.-L., Chang, C.-H., Liu, R.-Z.: Integrating content-based filtering with collaborative filtering using co-clustering with augmented matrices. Expert Systems with Applications 41(6), 2754–2761 (2014)
[13] Yu, H.-F., Hsieh, C.-J., Si, S., Dhillon, I.: Scalable coordinate descent approaches to parallel matrix factorization for recommender systems. In: ICDM 2012, pp. 765–774 (2012)
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Wu, ML., Chang, CH. (2014). Parallel Co-clustering with Augmented Matrices Algorithm with Map-Reduce. In: Bellatreche, L., Mohania, M.K. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2014. Lecture Notes in Computer Science, vol 8646. Springer, Cham. https://doi.org/10.1007/978-3-319-10160-6_17
DOI: https://doi.org/10.1007/978-3-319-10160-6_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10159-0
Online ISBN: 978-3-319-10160-6
eBook Packages: Computer Science (R0)