Abstract
This paper considers the problem of mining high utility time interval sequential patterns with multiple minimum utility thresholds in a distributed environment. High utility sequential pattern (HUSP) mining is an emerging topic, but existing HUSP algorithms mine only the order of items and do not consider the time interval between successive items. In real-world applications, time interval patterns provide more useful information than conventional HUSPs. Recently, we proposed the distributed high utility time interval sequential pattern mining (DHUTISP) algorithm, which uses MapReduce to support big data environments. That algorithm was designed around a single minimum utility threshold. Applying the same utility threshold to every item in a sequence is unconvincing, because it gives all items the same importance. Hence, this paper proposes a new distributed framework, DHUTISP-MMU, that uses the MapReduce approach to efficiently mine high utility time interval sequential patterns (HUTISPs) with multiple minimum utility thresholds. The experimental results show that the proposed approach can efficiently mine HUTISPs with multiple minimum utility thresholds.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Saleti, S., Tangirala, J.L., Thirumalaisamy, R. (2021). Distributed Mining of High Utility Time Interval Sequential Patterns with Multiple Minimum Utility Thresholds. In: Fujita, H., Selamat, A., Lin, J.CW., Ali, M. (eds) Advances and Trends in Artificial Intelligence. Artificial Intelligence Practices. IEA/AIE 2021. Lecture Notes in Computer Science, vol 12798. Springer, Cham. https://doi.org/10.1007/978-3-030-79457-6_8
DOI: https://doi.org/10.1007/978-3-030-79457-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79456-9
Online ISBN: 978-3-030-79457-6
eBook Packages: Computer Science (R0)