Customer Clustering of French Transmission System Operator (RTE) Based on Their Electricity Consumption

  • Gabriel Da Silva
  • Hoai Minh Le (corresponding author)
  • Hoai An Le Thi
  • Vincent Lefieux
  • Bach Tran
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 991)

Abstract

We develop an efficient approach for clustering the customers of the French transmission system operator (RTE) based on their electricity consumption. The ultimate goal of customer clustering is to automatically detect patterns that explain how customers behave and how that behavior evolves. It will allow RTE to know its customers better and, consequently, to offer them better-suited services, optimize the maintenance schedule, reduce costs, etc. We tackle three crucial issues in clustering high-dimensional time-series data for pattern discovery: appropriate similarity measures, efficient procedures for the high-dimensional setting, and fast, scalable clustering algorithms. For that purpose, we use the DTW (Dynamic Time Warping) distance in the original time-series data space, the t-distributed stochastic neighbor embedding (t-SNE) method to transform the high-dimensional time-series data into a lower-dimensional space, and DCA (Difference of Convex functions Algorithm) based clustering algorithms. The numerical results on real data from RTE's customers show that our clustering result is coherent: customers in the same group have similar consumption curves, and the dissimilarity between customers of different groups is quite clear. Furthermore, our method is able to detect whether or not a customer changes their consumption behavior.
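As an illustration of the first step of this pipeline, the sketch below computes pairwise DTW distances between a few consumption curves. It is a minimal textbook version, not the authors' implementation: the abstract does not specify the local cost, any warping-window constraint, or normalization, so the absolute-difference cost and the unconstrained warping path are assumptions, and the function names (`dtw_distance`, `pairwise_dtw`) and the toy curves are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Unconstrained DTW between two 1-D sequences (assumed cost: |x - y|)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again with a later b
                cost[i][j - 1],      # b[j-1] is matched again with a later a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def pairwise_dtw(series):
    """Symmetric matrix of DTW distances between all pairs of sequences."""
    k = len(series)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])
    return dist


# Hypothetical toy load curves (not RTE data).
load_curves = [
    [0.0, 1.0, 2.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 2.0, 1.0, 0.0],  # same shape as the first, shifted in time
    [2.0, 2.0, 2.0, 2.0, 2.0],       # flat consumption
]
distances = pairwise_dtw(load_curves)
for row in distances:
    print(row)
```

The time-shifted pair aligns perfectly under warping, which is the property that makes DTW attractive for comparing consumption curves of different lengths or phases. A dissimilarity matrix of this kind could then be handed to an embedding step and a clustering step; those stages (t-SNE and DCA based clustering in the paper) are not shown here.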

Keywords

Electricity management · Clustering · High-dimensional time-series data · DTW · t-SNE · DCA based clustering

Notes

Acknowledgment

This research is part of the project “Smart Marketing” funded by RTE in collaboration with the Computer Science and Applications Department, LGIPM, University of Lorraine, France.

The authors would like to thank Mr Romain Gemignani for his contributions to the initial stage of the project. We also thank Dr Duy Nhat Phan for discussions on the use of the t-SNE transformation.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Gabriel Da Silva (1)
  • Hoai Minh Le (2, corresponding author)
  • Hoai An Le Thi (2)
  • Vincent Lefieux (1)
  • Bach Tran (2)
  1. French transmission system operator (RTE), Paris, France
  2. Computer Science and Applications Department, LGIPM, University of Lorraine, Metz, France
