Abstract
Classifying electric power users is one of the most important methods for achieving optimal allocation of power resources. By analyzing users' needs, behavior, and habits, countries and enterprises can offer different incentives to different users, which makes people more willing to use green, clean electric power. User clustering analysis requires real-time processing of massive, high-speed data. In this paper we propose DStreamEPK, a novel distributed method for clustering user data streams that is based on Spark Streaming, an improved CluStream algorithm, and an improved K-means algorithm. In the experimental evaluation, we first test the clustering effectiveness of DStreamEPK on UCI datasets; the results show that it outperforms the traditional K-means clustering algorithm. Tests on real user datasets further show that DStreamEPK clusters users' electricity data quickly and efficiently.
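To make the general idea concrete, the following is a minimal sketch of the two-phase pattern that CluStream-style stream clustering usually follows: an online step that summarizes arriving points into micro-clusters, and an offline step that runs K-means over those summaries. It is not the authors' DStreamEPK and it does not include their improvements. Spark Streaming is left out entirely, and plain Python lists stand in for micro-batches. All names (`MicroCluster`, `update_micro_clusters`, `weighted_kmeans`, `max_radius`) are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Additive summary of absorbed points: count, linear sum, squared sum."""
    n: int
    linear_sum: list
    squared_sum: list

    @classmethod
    def from_point(cls, p):
        return cls(1, list(p), [x * x for x in p])

    def absorb(self, p):
        self.n += 1
        for i, x in enumerate(p):
            self.linear_sum[i] += x
            self.squared_sum[i] += x * x

    def centroid(self):
        return [s / self.n for s in self.linear_sum]


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def update_micro_clusters(clusters, batch, max_radius):
    """Online step: absorb each point into the nearest micro-cluster,
    or open a new one if none lies within max_radius (an assumed threshold)."""
    for p in batch:
        nearest = min(clusters, key=lambda c: distance(c.centroid(), p), default=None)
        if nearest is not None and distance(nearest.centroid(), p) <= max_radius:
            nearest.absorb(p)
        else:
            clusters.append(MicroCluster.from_point(p))
    return clusters


def weighted_kmeans(clusters, k, iterations=10):
    """Offline step: K-means over micro-cluster centroids, weighted by count."""
    points = [(c.centroid(), c.n) for c in clusters]
    if not points:
        return []
    k = min(k, len(points))
    centers = [p for p, _ in points[:k]]  # naive initialization: first k centroids
    for _ in range(iterations):
        sums = [[0.0] * len(centers[0]) for _ in range(k)]
        weights = [0] * k
        for p, w in points:
            j = min(range(k), key=lambda i: distance(centers[i], p))
            weights[j] += w
            for d, x in enumerate(p):
                sums[j][d] += w * x
        # Keep the old center when no micro-cluster was assigned to it.
        centers = [
            [s / weights[j] for s in sums[j]] if weights[j] else centers[j]
            for j in range(k)
        ]
    return centers
```

In a streaming setting, `update_micro_clusters` would be called once per arriving batch while `weighted_kmeans` runs only when a clustering is requested. Because each summary is additive, micro-clusters built on different workers can in principle be merged by summing their fields, which is what makes this style of summary attractive for distributed processing.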
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, X., Qian, Z., Shen, S., Shi, J., Wang, S. (2019). Streaming Massive Electric Power Data Analysis Based on Spark Streaming. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11448. Springer, Cham. https://doi.org/10.1007/978-3-030-18590-9_14
DOI: https://doi.org/10.1007/978-3-030-18590-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18589-3
Online ISBN: 978-3-030-18590-9
eBook Packages: Computer Science, Computer Science (R0)