Abstract
Water is one of the most important natural resources. With the development of industry, water resources are increasingly harmed by various types of pollution. The water pollution process, however, is affected by many factors and is highly complex and uncertain. Accurately predicting water quality and generating a scheduling plan in time is therefore an urgent problem. In this paper, we propose a novel method that uses semantical mining technology to discover the knowledge contained in historical water quality data. This knowledge can be used to improve forecast accuracy and to provide early pollution warnings, thus effectively avoiding unnecessary economic losses. Specifically, the proposed semantical mining method consists of two stages: frequent sequence extraction and association rule mining. In the first stage, we propose the FOFM (Fast One-Off Mining) algorithm to extract frequently occurring sequences from large quantities of water quality data; these sequences serve as the input to the second stage. In the second stage, we propose the PB-ITM (Prefix-projected Based-InterTransaction Mining) algorithm to find relationships between frequently occurring water pollution events, which can be regarded as knowledge that explains the water pollution process. Experimental comparisons show that the proposed method yields flexible, accurate, and diverse patterns of water quality events.
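To make the two-stage idea concrete, the following is a minimal sketch and not the FOFM or PB-ITM algorithms themselves, whose details the abstract does not give. It assumes the water quality readings have already been discretized into event symbols (here the hypothetical labels "N" for normal, "H" for high pollutant concentration, and "L" for low dissolved oxygen), counts contiguous patterns under a simple one-off condition in which each position in the sequence is used by at most one occurrence, and then derives prefix-to-suffix rules with a confidence threshold. All function names and parameters are illustrative assumptions.

```python
def count_one_off(sequence, pattern):
    """Count non-overlapping contiguous occurrences of `pattern`.

    A greedy left-to-right scan: once a match is found, the scan jumps
    past it, so no position contributes to more than one occurrence.
    """
    count, i, m = 0, 0, len(pattern)
    while i + m <= len(sequence):
        if tuple(sequence[i:i + m]) == pattern:
            count += 1
            i += m          # skip the matched positions (one-off condition)
        else:
            i += 1
    return count


def frequent_patterns(sequence, max_len, min_support):
    """Stage 1 (simplified): patterns of length <= max_len whose one-off
    support is at least `min_support`."""
    candidates = {
        tuple(sequence[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(sequence) - n + 1)
    }
    frequent = {}
    for pattern in candidates:
        support = count_one_off(sequence, pattern)
        if support >= min_support:
            frequent[pattern] = support
    return frequent


def association_rules(frequent, min_confidence):
    """Stage 2 (simplified): split each frequent pattern into a prefix and
    a suffix, and keep rules prefix -> suffix whose confidence
    support(pattern) / support(prefix) reaches `min_confidence`."""
    rules = []
    for pattern, support in frequent.items():
        for k in range(1, len(pattern)):
            antecedent, consequent = pattern[:k], pattern[k:]
            if antecedent in frequent:
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, confidence))
    return rules


# Hypothetical discretized daily readings (illustrative data only).
events = list("NNHLNNHLNHLN")

frequent = frequent_patterns(events, max_len=2, min_support=3)
rules = association_rules(frequent, min_confidence=0.9)
for antecedent, consequent, confidence in sorted(rules):
    print(antecedent, "->", consequent, f"confidence={confidence:.2f}")
```

On this toy sequence every "H" event is followed by an "L" event, so the rule ("H",) -> ("L",) is reported with confidence 1.0, whereas "N" is followed by "H" in only half of its occurrences and that rule is filtered out. A practical miner would also need gap constraints and rules that span several transactions, which this sketch omits.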
Copyright information
© 2020 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Feng, J., Yu, Q., Wu, Y. (2020). Time-Varying Water Quality Analysis with Semantical Mining Technology. In: Zhang, X., Liu, G., Qiu, M., Xiang, W., Huang, T. (eds) Cloud Computing, Smart Grid and Innovative Frontiers in Telecommunications. CloudComp 2019, SmartGift 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 322. Springer, Cham. https://doi.org/10.1007/978-3-030-48513-9_29
DOI: https://doi.org/10.1007/978-3-030-48513-9_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-48512-2
Online ISBN: 978-3-030-48513-9
eBook Packages: Computer Science, Computer Science (R0)