Advertisement

World Wide Web

, Volume 22, Issue 5, pp 2225–2239 | Cite as

Layered convolutional dictionary learning for sparse coding itemsets

  • Sameen ManshaEmail author
  • Hoang Thanh Lam
  • Hongzhi Yin
  • Faisal Kamiran
  • Mohsen Ali
Article
  • 191 Downloads
Part of the following topical collections:
  1. Special Issue on Big Data Management and Intelligent Analytics

Abstract

Dictionary learning for sparse coding has been successfully used in different domains, however, has never been employed for the interesting itemset mining. In this paper, we formulate an optimization problem for extracting a sparse representation of itemsets and show that the discrete nature of itemsets makes it NP-hard. An efficient approximation algorithm is presented which greedily solves maximum set cover to reduce overall compression loss. Furthermore, we incorporate our sparse representation algorithm into a layered convolutional model to learn nonredundant dictionary items. Following the intuition of deep learning, our convolutional dictionary learning approach convolves learned dictionary items and discovers statistically dependent patterns using chi-square in a hierarchical fashion; each layer having more abstract and compressed dictionary than the previous. An extensive empirical validation is performed on thirteen datasets, showing better interpretability and semantic coherence of our approach than two existing state-of-the-art methods.

Keywords

Interesting itemset mining Convolutional sparse dictionary learning Lossless compression Deep learning 

References

  1. 1.
    Aggarwal, C.C., Han, J.: Frequent Pattern Mining. Springer, Berlin (2014)zbMATHGoogle Scholar
  2. 2.
    Agrawal, R., Srikant, R., et al.: Fast Algorithms for Mining Association Rules. Morgan Kaufmann, San Mateo (1994)Google Scholar
  3. 3.
    Boureau, Y.-l., Cun, Y. L., et al.: Sparse feature learning for deep belief networks. In: Proceedings of Advances in Neural Information Processing Systems, pp. 1185–1192 (2008)Google Scholar
  4. 4.
    Calders, T., Goethals, B.: Non-derivable itemset mining. Data Min. Knowl. Discov. 14(1), 171–206 (2007)MathSciNetGoogle Scholar
  5. 5.
    Chandola, V., Kumar, V.: Summarization—compressing data into an informative representation. Knowl. Inf. Syst. 12(3), 355–378 (2007)Google Scholar
  6. 6.
    Coenen, F.: The lucs-kdd discretised/normalised arm and carm data library. http://www.csc.liv.ac.uk/frans/KDD/Software/LUCS-KDD-DN/DataSets/dataSets.html
  7. 7.
    Fowkes, J., Sutton, C.: A bayesian network model for interesting itemsets. In: Proceeding of European Conference Machine Learning and Knowledge Discovery in Databases, pp. 410–425. Springer, Berlin (2016)Google Scholar
  8. 8.
    Fowkes, J., Sutton, C.: A subsequence interleaving model for sequential pattern mining. In: Proceedings of ACM International Conference on Knowledge Discovery and Data Mining, pp. 835–844. KDD, USA (2016)Google Scholar
  9. 9.
    Geerts, F., Goethals, B., Mielikäinen, T.: Tiling Databases, pp. 278–289. Springer, Berlin (2004)zbMATHGoogle Scholar
  10. 10.
    Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The weka data mining software: an update. ACM SIGKDD Explor. Newsl. 11 (1), 10–18 (2009)Google Scholar
  11. 11.
    Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. SIGMOD Rec. 29(2), 1–12 (2000)Google Scholar
  12. 12.
    Kavukcuoglu, K., Sermanet, P., Boureau, Y.-L., Gregor, K., Mathieu, M., Cun, Y.L.: Learning convolutional feature hierarchies for visual recognition. In: Proceedings of Advances in Neural Information Processing Systems, pp. 1090–1098 (2010)Google Scholar
  13. 13.
    Lam, H.T., Mörchen, F., Fradkin, D., Calders, T.: Mining compressing sequential patterns. Stat. Anal. Data Min. 7(1), 34–52 (2014)MathSciNetGoogle Scholar
  14. 14.
    LeCun, Y., et al: Lenet-5, convolutional neural networks. http://yann.lecun.com/exdb/lenet (2015)
  15. 15.
    Lee, H., Battle, A., Raina, R., Ng, A.Y.: Efficient sparse coding algorithms. In: Proceedings of Advances in Neural Information Processing Systems, pp. 801–808 (2006)Google Scholar
  16. 16.
    Mairal, J., Bach, F., Ponce, J., Sapiro, G.: Online dictionary learning for sparse coding. In: Proceedings of International Conference on Machine Learning, pp. 689–696 (2009)Google Scholar
  17. 17.
    Mampaey, M., Tatti, N., Vreeken, J.: Tell me what i need to know: succinctly summarizing data with itemsets. In: Proceedings of ACM International Conference on Knowledge Discovery and Data Mining, pp. 573–581 (2011)Google Scholar
  18. 18.
    Mampaey, M., Vreeken, J., Tatti, N.: Summarizing data succinctly with the most informative itemsets. ACM Trans. Knowl. Discov. Data 6(4), 16:1–16:42 (2012)Google Scholar
  19. 19.
    Mansha, S., Babar, Z., Kamiran, F., Karim, A.: Neural network based association rule mining from uncertain data. In: Proceedings of Neural Information Processing, pp. 129–136. Springer, Berlin (2016)Google Scholar
  20. 20.
    Mansha, S., Kamiran, F., Karim, A., Anwar, A.: A self-organizing map for identifying influentialcommunities in speech-based networks. In: Proceedings of ACM International on Conference on Information and Knowledge Management, pp. 1965–1968. CIKM (2016)Google Scholar
  21. 21.
    Mörchen, F., Fradkin, D.: Robust mining of time intervals with semi-interval partial order patterns. In: Proceedings of Society for Industrial and Applied Mathematics (2010)Google Scholar
  22. 22.
    Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Efficient mining of association rules using closed itemset lattices. Inf. Syst. 24(1), 25–46 (1999)zbMATHGoogle Scholar
  23. 23.
    Shang, S., Ding, R., Yuan, B., Xie, K., Zheng, K., Kalnis, P.: User oriented trajectory search for trip recommendation. In: Proceedings of the 15th International Conference on Extending Database Technology, pp. 156–167. ACM (2012)Google Scholar
  24. 24.
    Shang, S., Chen, L., Jensen, C.S., Wen, J.-R., Kalnis, P.: Searching trajectories by regions of interest. IEEE Trans. Knowl. Data Eng. 29(7), 1549–1562 (2017)Google Scholar
  25. 25.
    Smets, K., Vreeken, J.: Slim: directly mining descriptive patterns. In: Proceedings of SIAM International Conference on Data Mining, pp. 236–247 (2012)Google Scholar
  26. 26.
    Tatti, N., Vreeken, J.: The long and the short of it: summarising event sequences with serial episodes. In: Proceedings of ACM International Conference on Knowledge Discovery and Data Mining, pp. 462–470. KDD (2012)Google Scholar
  27. 27.
    Vazirani, V.V.: Approximation Algorithms. Springer, Berlin (2013)Google Scholar
  28. 28.
    Vreeken, J., Van Leeuwen, M., Siebes, A.: Krimp: mining itemsets that compress. Data Min. Knowl. Discov. 23(1), 169–214 (2011)MathSciNetzbMATHGoogle Scholar
  29. 29.
    Wang, W., Yin, H., Sadiq, S., Chen, L., Xie, M., Spore, X. Zhou.: A sequential personalized spatial item recommender system. In: Proceedings of International Conference on Data Engineering, pp. 954–965 (2016)Google Scholar
  30. 30.
    Webb, G.I.: Self-sufficient itemsets: an approach to screening potentially interesting associations between items. ACM Trans. Knowl. Data Discov. 4(1), 3:1–3:20 (2010)MathSciNetGoogle Scholar
  31. 31.
    Webb, G.I., Vreeken, J.: Efficient discovery of the most interesting associations. ACM Trans. Knowl. Discov. Data 8(3), 15:1–15:31 (2013)Google Scholar
  32. 32.
    Weisstein, E.W.: Chi-squared test. From MathWorld–A Wolfram Web Resource (1999)Google Scholar
  33. 33.
    Wright, J., Yang, A.Y., Ganesh, A., Sastry, S.S., Ma, Y.: Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 210–227 (2009)Google Scholar
  34. 34.
    Xie, K., Deng, K., Shang, S., Zhou, X., Zheng, K.: Finding alternative shortest paths in spatial networks. ACM Trans. Database Syst. 37(4), 29:1–29:31 (2012)Google Scholar
  35. 35.
    Xie, Q., Shang, S., Yuan, B., Pang, C., Zhang, X.: Local correlation detection with linearity enhancement in streaming data. In: Proceedings of ACM International on Conference on Information and Knowledge Management, pp. 309–318. CIKM (2013)Google Scholar
  36. 36.
    Xie, M., Yin, H., Wang, H., Xu, F., Chen, W., Wang, S.: Learning graph-based poi embedding for location-based recommendation. In: Proceedings of ACM International on Conference on Information and Knowledge Management, pp. 15–24 (2016)Google Scholar
  37. 37.
    Yang, B., Guo, C., Jensen, C.S., Kaul, M., Shang, S.: Stochastic skyline route planning under time-varying uncertainty. In: Proceedings of IEEE International Conference on Data Engineering, pp. 136–147 (2014)Google Scholar
  38. 38.
    Yin, H., Sun, Y., Cui, B., Hu, Z., Lcars, L. Chen.: A location-content-aware recommender system. In: Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining, pp. 221–229. KDD (2013)Google Scholar
  39. 39.
    Yiu, M.L., Mamoulis, N., Papadias, D.: Aggregate nearest neighbor queries in road networks. IEEE Trans. Knowl. Data Eng. 17(6), 820–833 (2005)Google Scholar
  40. 40.
    Zaki, M.J.: Scalable algorithms for association mining. IEEE Trans. Knowl. Data Eng. 12(3), 372–390 (2000)Google Scholar
  41. 41.
    Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Proceedings of European Conference on Computer Vision, pp. 818–833. Springer, Berlin (2014)Google Scholar
  42. 42.
    Zeiler, M.D., Krishnan, D., Taylor, G.W., Fergus, R.: Deconvolutional networks. In: Proceeding of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2528–2535 (2010)Google Scholar
  43. 43.
    Zhang, A., Shi, W., Webb, G.I.: Mining significant association rules from uncertain data. Data Min. Knowl. Discov. 30(4), 928–963 (2016)MathSciNetzbMATHGoogle Scholar
  44. 44.
    Zheng, K., Shang, S., Yuan, N.J., Yang, Y.: Towards efficient search for activity trajectories. In: Proceedings of IEEE International Conference on Data Engineering, pp. 230–241 (2013)Google Scholar
  45. 45.
    Zheng, K., Zheng, Y., Yuan, N.J., Shang, S.: On discovery of gathering patterns from trajectories. In: Proceedings of IEEE International Conference on Data Engineering, pp. 242–253 (2013)Google Scholar
  46. 46.
    Zheng, K., Su, H., Zheng, B., Shang, S., Xu, J., Liu, J., Zhou, X.: Interactive top-k spatial keyword queries. In: Proceedings of IEEE International Conference on Data Engineering, pp. 423–434 (2015)Google Scholar
  47. 47.
    Zhu, S., Wang, Y., Shang, S., Zhao, G., Wang, J.: Probabilistic routing using multimodal data. Neurocomputing 253(C), 49–55 (2017)Google Scholar

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Sameen Mansha
    • 1
    Email author
  • Hoang Thanh Lam
    • 2
  • Hongzhi Yin
    • 1
  • Faisal Kamiran
    • 3
  • Mohsen Ali
    • 3
  1. 1.School of Information Technology and Electrical EngineeringThe University of QueenslandBrisbaneAustralia
  2. 2.IBM ResearchDublinIreland
  3. 3.Information Technology University of The PunjabLahorePakistan

Personalised recommendations