Time-Varying Water Quality Analysis with Semantical Mining Technology

Feng, Jun; Yu, Qinghan; Wu, Yirui

doi:10.1007/978-3-030-48513-9_29

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering ((LNICST,volume 322))

Abstract

Water is one of the most important natural resources. With the development of industry, water resources are harmed by various types of pollution. The water pollution process, however, is affected by many factors and is highly complex and uncertain. Accurately predicting water quality and generating scheduling plans in time is therefore an urgent problem. In this paper, we propose a novel method based on semantical mining technology to discover the knowledge contained in historical water quality data, which can then be used to improve forecast accuracy and provide early pollution warnings, thus avoiding unnecessary economic losses. Specifically, the proposed semantical mining method consists of two stages: frequent sequence extraction and association rule mining. In the first stage, we propose the FOFM (Fast One-Off Mining) algorithm to extract frequently occurring sequences from a large quantity of water quality data; these sequences serve as input to the second stage. In the association rule mining stage, we propose the PB-ITM (Prefix-projected Based-InterTransaction Mining) algorithm to find relationships between frequently occurring water pollution events, which can be regarded as knowledge that explains the water pollution process. Experimental comparisons show that the proposed method yields flexible, accurate and diverse patterns of water quality events.
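The two-stage idea in the abstract can be illustrated with a small generic sketch. This is not the FOFM or PB-ITM algorithm, whose details are not given here. It assumes that water quality readings have already been discretized into symbols (here the hypothetical levels `L`, `M`, `H`), that a pattern is a contiguous run of symbols, and that occurrences are counted without reusing any position, a simplified stand-in for a one-off condition. All function names, parameters, and the sample sequence are illustrative assumptions.

```python
def occurrences(seq, pattern):
    """Start positions of non-overlapping matches of a contiguous pattern.

    Each position of seq is used by at most one match (a simplified
    stand-in for a one-off condition; an assumption, not FOFM itself).
    """
    starts, i, n = [], 0, len(pattern)
    while i + n <= len(seq):
        if seq[i:i + n] == pattern:
            starts.append(i)
            i += n  # skip past the match so positions are not reused
        else:
            i += 1
    return starts


def frequent_patterns(seq, length, min_support):
    """Stage 1 (sketch): patterns of a given length with enough matches."""
    candidates = {seq[i:i + length] for i in range(len(seq) - length + 1)}
    result = {}
    for pattern in candidates:
        support = len(occurrences(seq, pattern))
        if support >= min_support:
            result[pattern] = support
    return result


def mine_rules(seq, patterns, max_gap, min_confidence):
    """Stage 2 (sketch): rules A -> B where B starts at most max_gap
    steps after an occurrence of A ends.

    Confidence is the fraction of A's occurrences followed by B.
    """
    rules = []
    for a in patterns:
        a_starts = occurrences(seq, a)
        for b in patterns:
            if a == b:
                continue
            b_starts = set(occurrences(seq, b))
            hits = sum(
                any(s + len(a) + gap in b_starts for gap in range(max_gap + 1))
                for s in a_starts
            )
            confidence = hits / len(a_starts)
            if confidence >= min_confidence:
                rules.append((a, b, confidence))
    return rules


# Hypothetical discretized series: L = low, M = medium, H = high pollutant level.
series = "LLMHLLMHLLMH"
frequent = frequent_patterns(series, length=2, min_support=3)
print(frequent)
print(mine_rules(series, frequent, max_gap=0, min_confidence=1.0))
```

In this sketch the frequent patterns found in the first stage are the only candidates considered in the second stage, mirroring the way the abstract describes the first stage's output as the second stage's input.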

Author information

Authors and Affiliations

College of Computer and Information, Hohai University, Nanjing, China
Jun Feng, Qinghan Yu & Yirui Wu

Corresponding author

Correspondence to Qinghan Yu.

Editor information

Editors and Affiliations

Macquarie University, Sydney, NSW, Australia
Xuyun Zhang
Macquarie University, Sydney, NSW, Australia
Guanfeng Liu
Pace University, New York, NY, USA
Meikang Qiu
James Cook University, Cairns, QLD, Australia
Wei Xiang
James Cook University, Smithfield, QLD, Australia
Tao Huang

About this paper

Cite this paper

Feng, J., Yu, Q., Wu, Y. (2020). Time-Varying Water Quality Analysis with Semantical Mining Technology. In: Zhang, X., Liu, G., Qiu, M., Xiang, W., Huang, T. (eds) Cloud Computing, Smart Grid and Innovative Frontiers in Telecommunications. CloudComp 2019, SmartGift 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 322. Springer, Cham. https://doi.org/10.1007/978-3-030-48513-9_29

DOI: https://doi.org/10.1007/978-3-030-48513-9_29
Published: 23 May 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-48512-2
Online ISBN: 978-3-030-48513-9
eBook Packages: Computer Science, Computer Science (R0)
