Abstract
In e-business, recommender systems have been instrumental in guiding users through their online experiences. However, these systems are often limited by a lack of labeled data and by data sparsity. Data-mining techniques are increasingly used to address these issues. Most research produces recommendations via supervised learning, typically with a k-nearest neighbor learner. Supervised learning, however, relies on labeled data, which may not be available at the time of learning. Data sparsity, which refers to situations where the items that have been recommended represent only a small subset of all available items, further degrades model performance. One suggested solution is to apply cluster analysis as a preprocessing step so that natural groupings, typically of similar customer profiles, guide the learning process and improve predictive accuracy. In this paper, we study the benefits of applying cluster analysis as a preprocessing step prior to constructing classification models. Our HCC-Learn framework combines content-based analysis in the preprocessing stage with collaborative filtering in the final prediction stage. Our results on real-world data sets show the value of the HCC-Learn framework, especially when soft clustering is combined with ensembles based on feature subspaces.
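The idea of clustering first and classifying afterwards can be illustrated with a minimal sketch. It is not the HCC-Learn implementation: it assumes hard k-means in place of the soft clustering mentioned above, a single k-nearest neighbor learner in place of a subspace ensemble, and cluster IDs used as surrogate class labels when no true labels exist. The customer profiles and all function names are hypothetical.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))

def kmeans(points, k, iters=20):
    """Hard k-means with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        assign = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, [nearest(p, centroids) for p in points]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to query."""
    ranked = sorted(zip(train_x, train_y), key=lambda xy: sq_dist(xy[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical customer profiles: (age, spend), both scaled to the range 0..1.
profiles = [(0.10, 0.20), (0.15, 0.25), (0.20, 0.15),
            (0.80, 0.90), (0.85, 0.80), (0.90, 0.85)]

# Step 1 (preprocessing): cluster the unlabeled profiles.
centroids, cluster_ids = kmeans(profiles, k=2)

# Step 2 (prediction): treat cluster IDs as surrogate labels and classify a new user.
new_user = (0.12, 0.22)
predicted_group = knn_predict(profiles, cluster_ids, new_user, k=3)
print(cluster_ids, predicted_group)
```

In this toy setting the new user is assigned to the group of its most similar existing customers, so recommendations could then be drawn from what that group preferred. A soft-clustering variant would replace the single cluster ID with a vector of membership degrees.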
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Alabdulrahman, R., Viktor, H., Paquet, E. (2020). HCC-Learn Framework for Hybrid Learning in Recommender Systems. In: Fred, A., Salgado, A., Aveiro, D., Dietz, J., Bernardino, J., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2018. Communications in Computer and Information Science, vol 1222. Springer, Cham. https://doi.org/10.1007/978-3-030-49559-6_2
DOI: https://doi.org/10.1007/978-3-030-49559-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49558-9
Online ISBN: 978-3-030-49559-6
eBook Packages: Computer Science (R0)