Customer Clustering of French Transmission System Operator (RTE) Based on Their Electricity Consumption
We develop an efficient approach for clustering the customers of the French transmission system operator (RTE) based on their electricity consumption. The ultimate goal of customer clustering is to automatically detect patterns that help explain how customers' behavior evolves over time. This will allow RTE to know its customers better and, consequently, to offer them more suitable services, optimize the maintenance schedule, reduce costs, etc. We tackle three crucial issues in clustering high-dimensional time-series data for pattern discovery: appropriate similarity measures, efficient procedures for the high-dimensional setting, and fast, scalable clustering algorithms. For that purpose, we use the DTW (Dynamic Time Warping) distance in the original time-series data space, the t-distributed stochastic neighbor embedding (t-SNE) method to transform the high-dimensional time-series data into a lower-dimensional space, and DCA (Difference of Convex functions Algorithm) based clustering algorithms. Numerical results on real data from RTE’s customers show that our clustering result is coherent: customers in the same group have similar consumption curves, and the dissimilarity between customers of different groups is quite clear. Furthermore, our method is able to detect whether or not a customer changes their consumption behavior.
Keywords: Electricity management; Clustering; High-dimensional time-series data; DTW; t-SNE; DCA based clustering
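To make the first step of the approach more concrete, here is a minimal sketch of a DTW distance and of the pairwise distance matrix it induces over a set of consumption curves. It is the textbook dynamic-programming formulation with an absolute-difference local cost, which is an assumption: the abstract does not say which DTW variant, local cost, or windowing constraint is used. The function names `dtw_distance` and `pairwise_dtw` and the toy curves are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences.

    Assumption: local cost is |a_i - b_j| and no warping window is applied.
    Runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def pairwise_dtw(series):
    """Symmetric matrix of DTW distances between all pairs of sequences."""
    k = len(series)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])
    return dist


# Hypothetical toy load curves (not RTE data): the second is a time-stretched
# copy of the first, the third has a different level.
curves = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
D = pairwise_dtw(curves)
```

A matrix like `D` could then be handed to any embedding method that accepts precomputed dissimilarities, followed by a clustering step in the embedded space. The t-SNE transformation and the DCA based clustering algorithms named in the abstract are not reproduced here.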
This research is part of the project “Smart Marketing” funded by RTE in collaboration with the Computer Science and Applications Department, LGIPM, University of Lorraine, France.
The authors would like to thank Mr Romain Gemignani for his contributions to the initial stage of the project. We also thank Dr Duy Nhat Phan for his discussion on the use of the t-SNE transformation.