A Deep Learning Approach for the Detection and Classification of Power Quality Disturbances with Windowed Signals


The modernization of Power Systems (PSs) into smart grids, the expansion of microgrids, the ever-increasing presence of distributed power generation, and the more frequent use of non-linear and voltage-sensitive loads by consumers have caused Power Quality (PQ) problems. Studies in PQ are commonly related to disturbances that alter the sinusoidal characteristics of the voltage and/or current waveforms. The first step in analyzing PQ is to detect and then classify the disturbances, since identifying a disturbance makes it possible to determine its causes and to devise strategies to mitigate it. Thus, this paper proposes a deep learning approach that uses voltage signals, without pre-processing or manual extraction and selection of features, to detect and classify PQ disturbances automatically. The proposed approach is composed of convolution layers, a pooling layer, a long short-term memory layer, and batch normalization. A 1D convolution was used to suit the voltage signal data. Overlapping windowed signals with different Signal-to-Noise Ratios (SNRs) (40 dB, 30 dB, 20 dB and 0 dB, the label used in this work for noiseless signals) and with different sampling rates (16, 32, and 64 samples/cycle) were used. For a more in-depth view of the results, the proposed approach was evaluated for its accuracy, precision, recall, and F1-Score in different scenarios. An analysis of the obtained results shows that even in the worst-case scenario (SNR of 20 dB and sampling rate of 16 samples/cycle), the approach performs satisfactorily, with values above 0.97 for the analyzed metrics, thus enabling consumer action in a demand-side management scenario.


The increase in energy consumption due to population growth and the addition of new equipment connected to the power grid is mostly supplied by the electric utility and, when possible, by the consumers themselves via their own local means of power generation or storage [35]. Currently, the increased integration of distributed power generation, especially from renewable energy sources (photovoltaic and wind), and the consolidation of microgrids [16] have led to a change in the management and operation of the electrical system [10]. Thus, the electric system has become smarter and its operations more decentralized [2].

This integration is one of the biggest sources of power quality (PQ) disturbances and can cause problems such as overvoltage and undervoltage, increased outages, voltage elevation, voltage sags, and interruptions [13, 40]. The change in load at consumers’ facilities [27], regardless of whether they are residential, industrial, or commercial, is also one of the main causes of disturbances. The widespread use of non-linear and voltage-sensitive loads such as electronic equipment (computers and energy-efficient lighting) has caused PQ disturbances that affect the consumers’ experience, as these disturbances can cause equipment to malfunction or be damaged and, in the case of industrial consumers, can cause a production line to halt [17,18,19, 30].

There are several works in the literature which propose the detection and classification of PQ disturbances. The majority of these works employ a three-step approach: (1) analysis and feature extraction through the Wavelet Transform (WT) [18], S Transform (ST) [22], Hilbert Transform (HT) [18], Fast Fourier Transform (FFT) [4], and statistical metrics; (2) feature selection by means of optimization algorithms such as Ant Colony (AC) [37], Genetic Algorithm (GA), Direct Sequential Selection (DSS) and Maximum Redundancy Relevance (MRR) [18]; and (3) the use of artificial intelligence methods, which are responsible for finding, from the extracted and selected features, decision boundaries between the disturbances. Some of the main methods are: Wavelet Neural Network (WNN) [28], Probabilistic Neural Network (PNN) [29], Multilayer Perceptrons (MLP) [4], and Decision Tree (DT) [4, 18].

Approaches based on these three steps must work in a coordinated manner, since the steps are sequential and correlated, and each one directly affects the performance of the detection and classification of PQ disturbances. However, the performance does not guide the steps of feature extraction and selection. For instance, in a multi-layer perceptron, the performance, through the backpropagation algorithm, guides the weights given to each input, but has no influence on the extraction and selection of these features. This lack of feedback between the performance of the approach and the features extracted and selected can result in poorer performance when detecting and classifying disturbances. The problem is further aggravated when the disturbance is overlapped by noise and/or another disturbance (complex disturbance). Generally, noise and overlapping disturbances alter the voltage signal significantly, resulting in a mischaracterization of the measured waveform, which can lead to poor detection and classification performance.

Moreover, many works use a data window of 10 cycles as input for their methods, but this approach is not adequate for hardware embedding. Methods for the detection and classification of PQ disturbances that use data windows equal to or shorter than one cycle are more adequate for hardware embedding, even though they make the problem more complex, given that such a window contains less information than a window of 10 cycles [4]. Therefore, alternative approaches must be investigated to unify, simplify, and optimize the process in order to improve the performance of the detection and classification of PQ disturbances.

Recently, deep learning models have improved the state of the art in speech and object recognition and in image, signal, and information processing [12, 21, 23]. For the problem of detection and classification of PQ disturbances, deep neural networks can not only improve the performance, but also automate the extraction and selection of features, reducing the three-step process to a single unified process.

Thus, some works have applied deep learning techniques for detecting and classifying PQ disturbances. In [3], the disturbance data are converted into images to be used as an input matrix for a convolutional neural network. This approach was chosen so as to exploit the strong performance of these networks in image processing. However, the disturbance data are 1D, thus the correlation between the data is one-directional, as opposed to image data, which have a significant correlation in two directions (horizontal and vertical). In [31] and [25], the data are used in their raw form as input for the deep learning models. However, neither of those works considers overfitting, and in [31] the model’s ability to classify disturbances when faced with noisy signals is not discussed.

Thus, this work proposes a deep learning architecture that uses 1D convolution and long short-term memory (LSTM) layers to extract and select the spatial and temporal features of the windowed voltage signals, in order to automatically detect and classify 15 types of PQ disturbances as well as normal signals, regardless of whether these disturbances are single or complex and even in the presence of noise, so as to consider the most realistic scenario possible in a power system. The experiments are executed using synthetic data generated by mathematical models. To better detail the results, accuracy, precision, recall, and F1-Score are used to evaluate the proposed approach. The main contributions of the work are listed as follows:

  1. To ensure the reliability and feasibility of the results obtained, the method was tested in different scenarios. A database of synthetic windowed signals with noise levels of 20 dB, 30 dB and 40 dB, including complex disturbances and generated at different sampling rates (16 points/cycle, 32 points/cycle, and 64 points/cycle), was used for the experiments;

  2. A feature analysis, through t-Distributed Stochastic Neighbor Embedding (t-SNE), was performed to verify the decision boundary of PQ disturbances. This analysis facilitates the understanding of the high-level features extracted by convolutional neural networks;

  3. The proposed approach was validated using accuracy, recall, precision, and F1-Score in order to provide clearer observations of the results, as opposed to other works, which only analyze accuracy. Additionally, due to the large number of parameters, techniques to mitigate overfitting were applied;

  4. A low computational cost methodology was developed and embedded in hardware. Results show significant improvements for power quality analysis in smart grid scenarios, enabling consumer-side actions based on high-precision detection and classification of PQ disturbances;

  5. An extensive comparison of the performance of the proposed model with other deep learning-based methods for PQ detection and classification is presented.

This paper is organized as follows: “Power Quality Disturbance Database” illustrates the types of PQ disturbances used and how they were generated; the next section explores the proposed approach, explaining in detail how it works and its architecture; “Performance evaluation and result analysis” shows how the experiments were conducted and the results obtained, and also compares the results with the state of the art; and finally “Conclusion and future works” presents the conclusions.

Proposed Approach

Fig. 1

Comparison between the conventional methods and deep learning

This section describes the details of the proposed PQ disturbances detection and classification approach. The conventional methods are first presented in “Conventional methods”, followed by a presentation of the proposed approach in “Deep learning-based methods”, illustrating the applicability and advantages of employing deep learning for the detection and classification of PQ disturbances. Figure 1 highlights the main differences between conventional and deep learning-based methods.

Conventional Methods

Conventional methods (illustrated in the lower part of Fig. 1) extract features from the voltage signals and then perform a selection over all extracted features to find the subset that best represents the input signal. The selected features are then used as input for a classifier, which maps the input to the respective disturbances, if any, present in the voltage signal, thus performing the detection and classification of the PQ disturbances. In the training process for conventional methods, the performance of the classifier guides the classification process, weighing the features used as input, but has no bearing on the extraction and selection of the features themselves [4].

Deep Learning-Based Methods

By applying deep learning techniques, the problem is modeled and solved holistically (illustrated at the top of Fig. 1), unlike other traditional approaches which employ a model partitioned in three steps.

The proposed approach receives the voltage signals as input, but uses deep learning techniques to perform the selection, extraction, detection and classification of PQ disturbances automatically and in a unified way. During the training process of deep neural networks for detecting and classifying PQ disturbances, the weights of the extraction, selection and classification layers are automatically updated by backpropagation.

The proposed approach was designed based on the features of PQ disturbances. Disturbances of the same type can vary significantly in their features; moreover, some disturbances have a short duration (Transitory Impulse) or occur randomly, whereas others have periodic features (notch and harmonic distortion). The proposed approach must also generalize well in order to deal with the noise present in the signal.

Fig. 2

Architecture and functioning of the proposed approach. a Windowing process. b Feature extraction and selection. c Classification. d Classified disturbances

Figure 2 illustrates the architecture and how the proposed approach works:

  1. In block (a), the windowing process, the signals are acquired through a data window of one cycle that moves point by point until it covers the entire signal.

  2. The raw data window of one cycle is used as the input for the feature extraction and selection block (block b).

  3. The features extracted and selected are used as input for the classification block (block c), and then the disturbances are detected and classified (block d).
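The windowing step of block (a) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `sliding_windows` and the example signal (10 cycles of a unit-amplitude sine at 16 samples/cycle) are assumptions made for the example.

```python
import math


def sliding_windows(signal, samples_per_cycle):
    """Return every one-cycle window, advancing one sample at a time."""
    count = len(signal) - samples_per_cycle + 1
    return [signal[i:i + samples_per_cycle] for i in range(max(count, 0))]


# Illustrative signal (assumption): 10 cycles sampled at 16 samples/cycle.
samples_per_cycle = 16
signal = [math.sin(2 * math.pi * k / samples_per_cycle)
          for k in range(10 * samples_per_cycle)]

windows = sliding_windows(signal, samples_per_cycle)
print(len(windows))  # 160 - 16 + 1 = 145 overlapping windows
```

Because consecutive windows differ by a single sample, a signal of N points yields N - n + 1 windows of n points each, which is what makes every transition between normal and disturbed regions visible to the classifier.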

This approach allows the disturbances to be detected quickly, so that the consumer can take measures to protect their equipment. Additionally, the computational cost is lower than that of methods found in the literature, where the window has 10 cycles. An input with fewer than 10 cycles provides faster detection and classification of PQ disturbances and allows immediate decision making [4].

However, this method of data acquisition makes the detection and classification problem more complex, since each window acquired is labelled according to the presence or absence of disturbances in it. Thus, windows with no points that characterize a disturbance are classified as normal, whereas windows with at least one point of disturbance occurrence are labeled as the specific disturbance. The windows with both disturbances and normal points are called transition windows. To detect the disturbance, the transition windows have 64 points of disturbance in the best case scenario, meaning that the window contains a lot of information characterizing the disturbance, which makes the detection process easier when compared to the worst scenario, where there is only one disturbance point, which hinders the process of characterizing, and consequently, detecting this disturbance.
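The labelling rule described above can be illustrated as follows. The sketch assumes that each sample of the signal carries a point-wise label (0 for normal, a positive integer for the disturbance class) and that a signal contains at most one disturbance class; both are simplifications introduced for the example, and the function names are hypothetical.

```python
NORMAL = 0  # assumed label for samples without disturbance


def label_windows(point_labels, window_size):
    """Label a window as disturbed if at least one of its points is disturbed."""
    labels = []
    for start in range(len(point_labels) - window_size + 1):
        window = point_labels[start:start + window_size]
        disturbed = [c for c in window if c != NORMAL]
        labels.append(disturbed[0] if disturbed else NORMAL)
    return labels


def disturbed_points_per_window(point_labels, window_size):
    """Count the disturbed points in each window (1 is the worst case)."""
    return [
        sum(1 for c in point_labels[start:start + window_size] if c != NORMAL)
        for start in range(len(point_labels) - window_size + 1)
    ]


# Toy example: 8 normal points, 4 points of class 1, 8 normal points.
point_labels = [0] * 8 + [1] * 4 + [0] * 8
print(label_windows(point_labels, 4))
print(disturbed_points_per_window(point_labels, 4))
```

The second list shows how the number of disturbance points rises and falls as the window slides across the event, which is why the first and last transition windows are the hardest to detect.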

The proposed architecture consists of fourteen sequential layers: twelve layers to extract the features, and two layers to classify the data. The voltage signals are used as input, and the process of extracting, selecting, detecting, and classifying the PQ disturbances is then done automatically. The feature extraction process considers two blocks, each with three convolutional layers. The last convolutional layer of each block is followed by a max-pooling layer. An LSTM layer, a BN layer and a dropout layer are used after the last max-pooling layer. The output of the dropout layer is used as input for the classification layers, where the last layer is a SoftMax layer. Convolutional layers usually take 2D images or 2D feature maps as input. However, voltage signals are 1D. Thus, the convolutional layers adopt 1D filters, so that the raw voltage signals can be used as inputs.

According to [32], convolutional layers work well when the network must learn features regardless of where they occur, and LSTM layers work well when it must learn relationships across long sequences. The use of convolutional and LSTM layers is thus justified: they extract the spatial and temporal features and enable the architecture to detect both short-duration and periodic disturbances. Since there is a large number of parameters to be optimized, max-pooling, dropout and BN layers were used to help control overfitting by reducing the size of the data, eliminating connections between neurons, and normalizing between the layers, respectively.

Deep learning techniques offer a set of methods to process audio signals (speech), audiovisual data (videos and images) and textual content. The high performance of these techniques is partially explained by the combination of input data with the proposed deep learning network architecture [33], i.e., the deep learning architectures are modeled according to the input data type. Therefore, the techniques that compose the proposed architecture are described below.

1D Convolution

According to [33], the convolution operation is defined as a feature detector for convolutional networks, since these networks perform convolutions to capture good features. In the convolutional layer, the set of weights is commonly known as a filter (or kernel). The result of the convolution between this filter and the input is called the feature map. This can be expressed by Eq. 1:

$$\begin{aligned} X^l_{0,fl}=f\left( \sum _{i\ \in \ m}{X^{l-1}_i}\ \times K^l_{i0,fl}+\ B^l\right) \, \end{aligned}$$

where the filter number at the l-th layer is \(\textit{F}_{l}\) and \(\textit{X}_{i}\) is the input vector (\(n\times 1\)). The filters are denoted as K (\(k\times 1\)), m = n-k+1, f(x) is the activation function, and \(\textit{B}^{l}\) is the bias for the l-th layer. The Rectified Linear Unit (ReLU) function is commonly used because it usually learns faster with deep neural networks and also shows better training results in practical applications than the sigmoid and tanh activation functions [21, 33]. The result of the ReLU function can be expressed by Eq. 2.

$$\begin{aligned} f_{ReLU}\left( x\right) ={\mathrm {max} (0,x)}. \end{aligned}$$
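A minimal sketch of a single-filter 1D convolution followed by ReLU is given below to make the output length m = n-k+1 concrete. As in most deep learning frameworks, the filter is applied without flipping (cross-correlation); the difference filter and the input values are illustrative assumptions, not parameters of the proposed network.

```python
def relu(x):
    """Rectified Linear Unit: max(0, x)."""
    return max(0.0, x)


def conv1d_valid(x, kernel, bias=0.0, activation=relu):
    """Slide one filter over x with no padding; output length is n - k + 1."""
    n, k = len(x), len(kernel)
    return [
        activation(sum(x[i + j] * kernel[j] for j in range(k)) + bias)
        for i in range(n - k + 1)
    ]


# Illustrative input (n = 8) and a two-tap difference filter (k = 2).
x = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
kernel = [1.0, -1.0]

feature_map = conv1d_valid(x, kernel)
print(len(feature_map))  # 8 - 2 + 1 = 7
print(feature_map)       # [0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
```

The same filter weights are reused at every position, which is why a convolutional layer responds to a pattern wherever it occurs inside the window.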


In convolutional neural networks, pooling layers are used after successive convolution layers. The goal of the pooling layers is to decrease the dimensionality of the data representation, thus helping to control overfitting and reduce computational cost. A commonly used operation is max-pooling. An important property of max-pooling is local invariance, that is, even if the input undergoes small changes, the output remains constant [5]. This property is highly valued for PQ disturbance detection and classification because a disturbance can have many variations due to overlapping signal noise. Equation 3 defines max-pooling.

$$\begin{aligned} X^l_0=f\left[ \max \left( \sum _{i\ \in \ m}{X^{l-1}_i}\right) +B^l\right] \end{aligned}$$
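The local invariance property can be checked with a small example. The sketch below covers only the max operation over non-overlapping regions (the activation and bias terms of Eq. 3 are omitted), and the pool size of 2 and the input values are assumptions chosen for illustration.

```python
def max_pool1d(x, pool_size=2, stride=None):
    """Keep the maximum of each pooling region; stride defaults to pool_size."""
    stride = stride or pool_size
    return [
        max(x[i:i + pool_size])
        for i in range(0, len(x) - pool_size + 1, stride)
    ]


original = [0.10, 0.90, 0.20, 0.80]
perturbed = [0.15, 0.90, 0.25, 0.80]  # small changes to the non-maximal points

print(max_pool1d(original))   # [0.9, 0.8]
print(max_pool1d(perturbed))  # [0.9, 0.8]
```

The two outputs coincide because the perturbation does not change which value dominates each region; the output is also half the length of the input, which reduces the cost of the layers that follow.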

Batch Normalization

Batch Normalization (BN) has become a standard method in recent years, replacing the use of regularization. This method accelerates training by normalizing the activation of the previous layer in each batch, preventing significant changes to the input distribution for each layer [5, 33]. The result of batch normalization can be expressed by Eq. 4.

$$\begin{aligned} y_i=\ \gamma \ \times \ \frac{x_i-\ {\mu }_x}{\sqrt{{\sigma }^2_x+\epsilon }\ }+\beta \, \end{aligned}$$

where \(\mu _{\mathrm{x}}\) represents the average value of the input \(\hbox {x}_{\mathrm{i}}\) over the batch, \(\sigma ^2_x\) is the variance of \(\hbox {x}_{\mathrm{i}}\), and \(\epsilon\) is a small constant that avoids division by zero. The fraction normalizes the input to a mean value of 0 and a variance of 1; \(\gamma\) and \(\beta\) are learnable scale and shift parameters applied to the normalized value.
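Equation 4 can be verified numerically for a single feature. In the sketch below, the scale and shift parameters are treated as fixed scalars (in a trained network they are learned), and the batch values and the value of epsilon are illustrative assumptions.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a batch, then scale by gamma and shift by beta."""
    mu = sum(batch) / len(batch)
    var = sum((x - mu) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mu) / math.sqrt(var + eps) + beta for x in batch]


batch = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(batch)
mean = sum(normalized) / len(normalized)
variance = sum((y - mean) ** 2 for y in normalized) / len(normalized)
print(round(mean, 6), round(variance, 4))
```

With gamma = 1 and beta = 0 the output has approximately zero mean and unit variance; other values of gamma and beta rescale and shift that normalized output.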


One way to reduce overfitting is to limit the number of parameters that are active during training; this is the purpose of the dropout technique. During the training phase, the dropout operation randomly sets the outputs of neurons to zero with probability p, so that the corresponding weights are not updated in that back-propagation step. That is, it disables neurons with probability p to improve generalization [32, 36]. In addition, this technique reduces the network’s sensitivity to input data variations, favoring the correct classification of PQ disturbances under noise.
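A minimal sketch of dropout applied to a vector of activations is shown below. It uses the common "inverted" variant, in which the surviving activations are divided by 1 - p during training so that no rescaling is needed at inference; this scaling choice, the rate p = 0.5 and the random seed are assumptions of the example, since the text does not state them.

```python
import random


def dropout(activations, p, rng, training=True):
    """Zero each activation with probability p during training (inverted dropout)."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the example is repeatable
activations = [1.0] * 1000

dropped = dropout(activations, p=0.5, rng=rng)
print(sum(1 for a in dropped if a == 0.0))             # roughly half are zeroed
print(dropout(activations, p=0.5, rng=rng, training=False)[:3])  # unchanged
```

At inference time the layer is an identity, so the classification of a new window is deterministic.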


LSTM is the most common variation of recurrent neural networks. LSTM networks can reliably retain and carry information across long sequences. The main structures of this network are called gates and memory cells. The memory cell’s content is controlled by the input and forget gates; if both gates are closed, then the cell’s content remains unchanged from one time step to the next. During training, the gates learn what information to keep or forget. The input gate protects the cell from irrelevant data by deciding how much of the current information is passed on. The forget gate decides what information should be kept or forgotten. Finally, the output gate exposes the contents of the memory cell.
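The role of the gates can be made concrete with a single scalar LSTM step. The sketch below follows the widely used formulation with input, forget and output gates; the text does not specify which LSTM variant was used, so this formulation, the parameter layout and all numeric values are assumptions. In this formulation the cell content is preserved when the input gate is close to 0 and the forget gate is close to 1.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    params maps a gate name to a tuple (w_x, w_h, b); the layout is hypothetical.
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))         # how much new information enters the cell
    f = sigmoid(pre("forget"))        # how much of the old cell content is kept
    o = sigmoid(pre("output"))        # how much of the cell is exposed
    g = math.tanh(pre("candidate"))   # candidate content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Biases chosen so that the input gate is ~0 and the forget gate is ~1.
params = {
    "input": (0.0, 0.0, -20.0),
    "forget": (0.0, 0.0, 20.0),
    "output": (0.5, 0.5, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}
h, c = lstm_step(x=0.7, h_prev=0.1, c_prev=0.5, params=params)
print(round(c, 6))  # the cell content stays at 0.5
```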

Dense Layer

Dense or fully connected layers are generally used for classification. Each neuron in these layers is connected to all neurons in the previous layer. They receive the extracted features from the previous layers and then perform the classification. Equation 5 expresses the output of the dense layer, where D are the learnable parameters of the l-th dense layer.

$$\begin{aligned} X^l_0=f\left( X^{l-1}_i\ \times D^l_{i0}+B^l\right) \ \end{aligned}$$

The ReLU and softmax functions are used in the hidden and output layers, respectively. The value of the softmax function represents the probability that the input data belong to the corresponding class. The number of neurons in the output layer must equal the number of classes (except for binary classification, which needs only a single neuron). For a vector Z of size j, with elements \(Z_i\), i = 1, 2, ..., j, the softmax output \(S_i\) can be obtained with Eq. 6, where k is the summation index:

$$\begin{aligned} S_i = \frac{e^{Z_i}}{\sum _{k=1}^{j} e^{Z_k}} \end{aligned}$$
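A short numerical sketch of the softmax function follows. Subtracting the maximum before exponentiating is a standard numerical-stability step that leaves the result unchanged; the input scores are illustrative assumptions, and the 16-element case corresponds to the 15 disturbance classes plus the normal class considered in this work.

```python
import math


def softmax(z):
    """Map a vector of scores to probabilities that sum to 1."""
    shift = max(z)  # stability shift; cancels out in the ratio
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]
probs = softmax(scores)
print(probs.index(max(probs)))  # index of the most likely class: 0

uniform = softmax([0.0] * 16)   # equal scores give 1/16 for each class
print(uniform[0])               # 0.0625
```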

Power Quality Disturbance Database

The PQ disturbances database, for experimental purposes, can be synthetically generated by using mathematical models. To evaluate the proposed approach, the mathematical models proposed in [6, 14, 28] were used to obtain synthetic signals with and without PQ disturbances. The use of mathematical models allows the generation and characterization of several types of disturbances that can be used to analyze the generalization capacity of the classifier [11]. The equations used for synthetic data generation are listed in Table 1.

For this experimental evaluation of the proposed approach, synthetic voltage signals were used: (i) with no disturbances (normal signals), (ii) with one disturbance (simple disturbance), and (iii) with two disturbances (complex disturbances). The disturbances used were: Voltage Sags (D1), Voltage Elevation (D2), Voltage Fluctuation (D3), Harmonic Distortion (D4), Transitory Impulse (D5), Momentary Interruption (D6), Notch (D7), Spike (D8), Transitory Oscillation (D9), Sags with Transitory Oscillation (D10), Harmonic Distortion with Interruption (D11), Sags with Harmonic Distortion (D12), Elevation with Harmonic Distortion (D13), Voltage Fluctuation with Harmonic Distortion (D14) and Elevation with Transitory Oscillation (D15).

Table 1 Used equations for synthetic PQ disturbances generation

The data are sampled at rates of 16, 32, and 64 samples per cycle. These rates were used to evaluate the detection and classification performance of the proposed approach in situations that provide different amounts of information about the disturbance present in the voltage signal. The smaller the sampling rate, the less well characterized the disturbance will be. Conversely, the larger the sampling rate, the better characterized the disturbance will be, but the demand for hardware resources to embed the method also increases. The smallest sampling rate was fixed at 16 samples/cycle due to the PRODIST (Procedure for distribution of Electricity) regulation on the minimum requirements that measuring equipment must meet. A frequency of 60 Hz, as established by the PRODIST standard, IEEE Std. 1159-2009 [15], IEC 61000-4-15 [9] and IEC 61000-4-30 [8], was used.

For each type of disturbance, simple or complex, 100 random samples of voltage signals with that type of disturbance were generated. In this way, a database with a large variety of signals was obtained, representing the various situations that can happen in a real distribution system. The result is an extensive and robust database covering a variety of PQ disturbances in the presence of noise. To evaluate the robustness of the proposed approach, SNR levels of 40 dB, 30 dB and 20 dB, which can mischaracterize the signal severely, were used in addition to noiseless signals (labeled 0 dB in this work). It is worth noting that the smaller the SNR, the more the noise mischaracterizes the signal. Therefore, 12 disturbance databases were generated, one for each combination of the four SNR levels and the three sampling rates.
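The effect of a given SNR level can be reproduced with a short sketch that adds noise to a clean waveform. The noise model is not stated in the text, so additive white Gaussian noise is an assumption here, as are the function names and the test signal (100 cycles of a unit sine at 64 samples/cycle). The noiseless case, labeled 0 dB in this work, simply corresponds to skipping the noise term.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise scaled so that the result has the target SNR."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(1)  # fixed seed so the example is repeatable
clean = [math.sin(2 * math.pi * k / 64) for k in range(64 * 100)]

for target in (40.0, 30.0, 20.0):
    noisy = add_awgn(clean, target, rng)
    print(target, round(measured_snr_db(clean, noisy), 1))
```

Each 10 dB reduction in SNR multiplies the noise power by ten, which is why the 20 dB databases are the most demanding of the three noisy cases.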

Performance Evaluation and Result Analysis

The proposed approach for the detection and classification of PQ disturbances was implemented using TensorFlow [1] and Keras [7]. The performance of the proposed method is evaluated through the following metrics: accuracy, recall, precision, and F1-Score. To calculate these metrics, a confusion matrix was used, and the TP (true positive), FP (false positive), TN (true negative), and FN (false negative) counts were calculated [38]. The accuracy was obtained by dividing the number of TPs plus TNs by the total number of samples in the test set. The precision metric (Eq. 7) reflects the proportion of TPs among all positive predictions (false and true). The recall metric (Eq. 8) is the proportion of actual positives that were correctly identified.

$$\begin{aligned} \text{Precision} = \frac{TP}{TP + FP}, \end{aligned}$$
$$\begin{aligned} \text{Recall} = \frac{TP}{TP + FN}. \end{aligned}$$

In addition to these metrics, the F1-Score (Eq. 9) was used to calculate the balance between precision and recall. Results range from 0 to 1 for all metrics. Therefore, the results indicate an improvement in learning when their values are closer to 1. All metrics were calculated using the Scikit-learn Python 3 library [34].

$$\begin{aligned} \text{F1-Score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}. \end{aligned}$$
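The metrics can be computed directly from a confusion matrix, as the following plain-Python sketch shows. The experiments reported here used Scikit-learn; this standalone version is only meant to make the definitions concrete. The matrix layout (rows are true classes, columns are predicted classes) and the counts are illustrative assumptions, not results from the paper.

```python
def per_class_metrics(confusion, cls):
    """Return (precision, recall, f1) for one class.

    confusion[t][p] is the number of samples of true class t predicted as p.
    """
    tp = confusion[cls][cls]
    fp = sum(row[cls] for row in confusion) - tp
    fn = sum(confusion[cls]) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def accuracy(confusion):
    """Correctly classified samples divided by the total number of samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical two-class example: class 0 = normal, class 1 = disturbance.
confusion = [
    [50, 0],   # all 50 normal windows predicted as normal
    [2, 48],   # 2 disturbed windows missed, 48 detected
]
precision, recall, f1 = per_class_metrics(confusion, cls=0)
print(round(precision, 4), recall, round(f1, 4))
print(accuracy(confusion))  # 0.98
```

In this toy case the recall of the normal class is 1.0 while its precision is below 1.0, because two disturbed windows were labeled as normal; accuracy alone would not reveal which kind of error occurred.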

A tenfold cross-validation was executed to evaluate the generalization capacity of the proposed approach. The training process was performed in two steps, using the Adam optimizer [20], a dynamic learning rate and early stopping. The goal is to avoid oscillating around the optimum: the training first gets as close as possible to it with a high learning rate and then decreases the learning rate to refine the solution. The loss value is monitored to verify the oscillation around the optimum. The process begins with a learning rate of 0.001, and moves to the second phase when the loss value decreases 10 consecutive times. In the second phase, the learning rate is decreased to 0.0001, and training ends when the loss value worsens 20 consecutive times.

The proposed approach is modeled for each of the 3 sampling rates, that is, 3 deep neural networks are created, each responsible for extracting, selecting, detecting and classifying the disturbances generated at its corresponding sampling rate. Each deep neural network takes as input the voltage signals of one sampling rate, totalling 3 deep networks with input sizes of 16, 32 and 64.
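The two-phase schedule can be expressed as a small state machine over the sequence of monitored loss values. The sketch below is a literal reading of the description above (10 consecutive decreases trigger the switch, 20 consecutive increases end training); the function name and the synthetic loss sequence are assumptions for illustration, and the actual training code may differ.

```python
def two_phase_schedule(losses, lr_high=1e-3, lr_low=1e-4,
                       switch_after=10, stop_after=20):
    """Return the (epoch, learning_rate) pairs actually run for a loss sequence."""
    lr = lr_high
    streak = 0
    previous = None
    history = []
    for epoch, loss in enumerate(losses):
        history.append((epoch, lr))
        if previous is not None:
            if lr == lr_high:
                # Phase 1: count consecutive decreases of the loss.
                streak = streak + 1 if loss < previous else 0
                if streak >= switch_after:
                    lr, streak = lr_low, 0
            else:
                # Phase 2: count consecutive increases (loss getting worse).
                streak = streak + 1 if loss > previous else 0
                if streak >= stop_after:
                    break
        previous = loss
    return history


# Synthetic loss curve: 10 consecutive decreases, then 25 consecutive increases.
losses = [1.0 - 0.05 * k for k in range(11)] + [0.5 + 0.01 * k for k in range(1, 26)]
history = two_phase_schedule(losses)
print(history[10], history[11], len(history))
```

With this synthetic curve the switch to the lower learning rate happens after epoch 10 and training stops at epoch 30, before the loss sequence is exhausted.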


The experiments with the proposed deep networks are performed by varying the SNR. For a better analysis of the results, the average of each metric was plotted for each sampling rate. From Figs. 3, 4, 5, it can be seen that, in most cases, when the SNR decreases, the value of the metric also decreases. This relation is mainly due to the mischaracterization of the signal by the overlapping noise, which leads the classifier to make mistakes when relating the extracted features to the corresponding disturbance.

Fig. 3

Graphical analysis of the metrics for 64 samples/cycle

Fig. 4

Graphical analysis of the metrics for 32 samples/cycle

Fig. 5

Graphical analysis of the metrics for 16 samples/cycle

Figure 6 summarizes the average values of the metrics in relation to the sampling rate and the SNR. From this graph, it is possible to see that the databases with a sampling rate of 64 samples/cycle have the highest values for the metrics analyzed. This is justified by the amount of information given to the classifier, since the greater the sampling rate, the more information can be obtained from the signal. However, the metrics are similar across all sampling rates, which shows that, despite having less information, the classifier can still obtain satisfactory results. Moreover, the results remain satisfactory even in the presence of noisy data, which shows that the proposed approach generalizes adequately.

Fig. 6

Graphical analysis of the sampling rates

By analyzing the detection of the disturbances from the metrics calculated for signals without overlapping noise (0 dB) with a sampling rate of 64 samples/cycle, it is possible to see, from the precision metric, that 99.24% of the signals classified as normal were in fact normal. As for recall, whenever the signal was normal, the method classified it as normal 100% of the time. From this analysis, it can be concluded that 0.76% of the signals classified as normal actually contained a disturbance that went undetected, whereas no normal signal was mistaken for a signal with a disturbance. Therefore, a further analysis is fundamental to obtain more information about the detection error of the method. To that purpose, the method must be tested in all situations in which disturbances occur, which means every situation with transition windows, from the worst to the best case scenario.

Hence, to analyze the disturbance detection more deeply, the experiment was modeled with only two classes: with disturbances and without disturbances. All databases (40 dB, 30 dB, 20 dB and 0 dB) were grouped for all sampling rates considered. The database was then partitioned into two subsets: training and testing. The testing subset is composed of 10% of all situations in which disturbances occur, and the training subset is composed of the remaining 90% of the data.

Figure 7 illustrates the performance of the approach for all the situations with disturbances with different sampling rates. The errors in identification occur more frequently in the windows where the disturbance is not perfectly characterized, that is, where the windows present few points that characterize the disturbance. However, even in these cases, the detection is satisfactory and, as the number of points characterizing the disturbance increases, so does the performance of the proposed approach.

Fig. 7

Graphical analysis of the performance of the proposed approach for the detection

The proposed approach was tested on a Raspberry Pi to verify its performance in a real scenario with low-cost hardware. To accomplish that, the classification time of a sample and the time to acquire a new point were analyzed. Figure 8 illustrates the execution time for the classification of a sample at each sampling rate. From these results, it can be seen that the classification time increases as the sampling rate increases. Moreover, it can be inferred that it is possible to embed this method in hardware, in this case a Raspberry Pi, so as to use it in a real scenario.

The voltage signal has a duration of 0.167 seconds; thus, for the sampling rates of 16, 32 and 64 samples/cycle, a new point is acquired every 1.04822, 0.522466 and 0.260824 milliseconds, respectively. Hence, the classification time must be shorter than the time to acquire a new point. If the classification time is greater than the time to acquire a new point, samples arrive faster than they can be processed, and the unprocessed data accumulate in memory. Thus, on a Raspberry Pi, the proposed approach can be embedded with a sampling rate of 16 samples per cycle. Sampling rates of 32 and 64 samples per cycle can be embedded in hardware with more computational power, but the financial cost would be higher. Embedding the method in low-cost hardware makes the demand management project feasible.
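The real-time constraint can be checked with simple arithmetic. The sketch below computes the nominal interval between samples as 1/(f·N) for a 60 Hz system; the values reported above are close to, but slightly larger than, these nominal figures. The classification time of 0.8 ms used in the example is a hypothetical value chosen for illustration, not a measurement from Fig. 8.

```python
def sampling_interval_ms(samples_per_cycle, frequency_hz=60.0):
    """Nominal time between two consecutive samples, in milliseconds."""
    return 1000.0 / (frequency_hz * samples_per_cycle)


def fits_real_time(classification_time_ms, samples_per_cycle, frequency_hz=60.0):
    """True if one window is classified before the next sample arrives."""
    return classification_time_ms < sampling_interval_ms(samples_per_cycle,
                                                         frequency_hz)


for rate in (16, 32, 64):
    print(rate, round(sampling_interval_ms(rate), 4))
# 16 -> 1.0417 ms, 32 -> 0.5208 ms, 64 -> 0.2604 ms

hypothetical_time_ms = 0.8  # assumption for illustration only
print([fits_real_time(hypothetical_time_ms, rate) for rate in (16, 32, 64)])
# [True, False, False]
```

Doubling the sampling rate halves the time budget per classification while also increasing the input size, so the two effects compound on low-cost hardware.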

Fig. 8

Graphical analysis of the classification time for each sampling rate

Thus, the results obtained verify the precision of the method in detecting and classifying PQ disturbances in different scenarios. The analyses can be used as a basis for developing a demand-side management system. This system can be embedded in hardware and work in such a way that, if a disturbance is detected, a demand response algorithm, as proposed in [39], can be notified and a scheduling of renewable charges can be executed so as to preserve the integrity of the user’s equipment. Moreover, the proposed approach has proven capable of obtaining satisfactory results for low sampling rates, even in situations where noise can mischaracterize the signal severely. These results can be used to choose the type of hardware: low-cost hardware can be used with a lower sampling rate, since the amount of information used is small and, consequently, the processing cost is lower. Hence, it is possible to obtain a smart meter at an affordable price for the consumer, equipped with a demand-side management system based on the detection and classification of PQ disturbances.


The results obtained by the proposed approach are compared to the state of the art. Researchers usually use mathematical models to generate and characterize the PQ disturbance database, randomly varying, for instance, the amplitude and the time of occurrence of a disturbance in order to obtain greater variation (within the limits of the disturbance). Thus, almost every work found in the literature uses a different database, which prevents a fair comparison. In addition, those works use a data window of 10 cycles as input to their methods, which, given the large amount of information, can delay the detection and classification and consequently the decision making by the consumer. Finally, the analysis of the results in these works is done for only one sampling rate, and only for the accuracy metric.

However, the authors in [4] use the same database as this work and perform the classification with a one-cycle signal window that moves point by point along the voltage signal. Their main goal is to balance computational cost against classification accuracy, so they use feature extraction in the time and frequency domains together with algorithms of low computational cost (neural networks and decision trees). The approach proposed in [4] did not reach accuracy rates above 97.41% for any of the 8 synthetic disturbances considered with SNRs of 40 dB and 35 dB. It is also important to note that only eight disturbances were considered, and no tests were executed for SNR levels lower than 35 dB, at which the noise mischaracterizes the signal severely and makes classification harder.
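To make the SNR levels concrete, the sketch below adds noise to a clean sine wave at a target SNR using the relation noise power = signal power / 10^(SNR/10). It is a minimal illustration, assuming white Gaussian noise and a unit-amplitude sinusoid; neither the noise model nor the helper names (`add_noise`, `measured_snr_db`, `sine_wave`) come from this work or from [4].

```python
import math
import random


def sine_wave(cycles, samples_per_cycle):
    """Unit-amplitude sine wave sampled uniformly over the given cycles."""
    total = cycles * samples_per_cycle
    return [math.sin(2.0 * math.pi * k / samples_per_cycle) for k in range(total)]


def add_noise(signal, snr_db, rng):
    """Return the signal plus white Gaussian noise at the requested SNR (dB)."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)  # standard deviation of the added noise
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR of a noisy signal with respect to its clean version."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(0)  # fixed seed so the illustration is repeatable
clean = sine_wave(cycles=100, samples_per_cycle=64)
for target in (40, 30, 20):
    noisy = add_noise(clean, target, rng)
    print(f"target {target} dB, measured {measured_snr_db(clean, noisy):.2f} dB")
```

Each 10 dB reduction multiplies the noise power by ten, which is why lower SNR levels distort the waveform far more than the numerical difference suggests.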

To diversify the comparison of the proposed approach with the state of the art, an experiment was performed in which a noise-free data window of 10 cycles was used as input. Thus, even though their databases use different values, other works could be included in this comparison. These works were selected because they use different deep learning techniques and approaches to model the problem of detecting and classifying PQ disturbances. Table 2 shows the comparisons.

Table 2 Comparative analysis for the signal without noise of 10 cycles

The authors in [25] perform the PQ disturbance classification process in two stages. In the first stage, the sag, interruption and swell disturbances are treated as a single type (C1) because their waveforms are similar; an autoencoder connected to a fully connected layer with a softmax activation function is used to automatically extract and select the high-level features and then classify the disturbances. If a disturbance is classified as C1, the second stage is executed, in which a set of parameters is used to differentiate the sag, interruption, and swell disturbances. The thresholds were obtained through particle swarm optimization. Despite the approach’s potential, it is important to note that particle swarm optimization, used here to tune the thresholds applied when the autoencoder outputs class C1, is a robust but time-consuming optimization technique that requires domain-specific knowledge to model.

In [24], a new approach is developed to detect and classify PQ disturbances based on singular spectrum analysis, the curvelet transform and deep convolutional neural networks. The approach uses the curvelet transform and singular spectrum analysis to extract information from the PQ disturbance signals. This information is then used as the input to a deep convolutional neural network that classifies single and complex PQ disturbances. This work, although relevant, does not benefit from one of the main strengths of deep learning in knowledge representation: it relies on other techniques to extract features, neglecting the automatic feature extraction capabilities of convolutional neural networks.

In [31], the authors proposed a hybrid architecture, combining two deep learning architectures, called CNN–LSTM. Initially, two layers of CNN are used to learn spatial information, followed by a layer of LSTM to learn temporal information. The features generated by the max-pool layer of CNN are fed into the LSTM layer, followed by a dense layer with a softmax activation function. The performance of the proposed deep learning architecture is validated in synthetic and real signals. Although the model presents a satisfactory performance, the authors did not experiment with overlapping noise to evaluate the proposed model’s performance and did not consider overfitting.

As can be seen, this work’s approach differs from the related works, which do not consider the feasibility of embedding the method in hardware. The proposed approach uses windows of one cycle to do that since that value is more adequate for embedding the method for detecting and classifying PQ disturbances in hardware. Moreover, an analysis in different scenarios is performed investigating the effects of different sampling rates. The approach stands out mainly because it uses an automatic and integrated method for extracting, selecting, detecting, and classifying disturbances. Additionally, further evaluation of the results is performed using metrics of accuracy, precision, recall, and F1-Score. The efficacy of the proposed approach is evidenced by the experimentation performed over a database composed of synthetic signals (highly heterogeneous), which include 15 different types of disturbance and standard signals.
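The four metrics mentioned above can be computed directly from the true and predicted labels. The sketch below is a minimal illustration, assuming macro-averaging over classes (the averaging scheme is an assumption, not stated in the text); the function name and the toy labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1-Score."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2.0 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Toy example with three classes; the labels are illustrative only.
y_true = ["sag", "sag", "swell", "normal"]
y_pred = ["sag", "swell", "swell", "normal"]
print(classification_metrics(y_true, y_pred))
```

Reporting precision, recall and F1-Score alongside accuracy exposes per-class confusions (here, a sag mistaken for a swell) that a single accuracy figure can hide.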

Fig. 9

t-SNE-based visualization of the features extracted by the proposed approach

The convolution layers extract several features, including higher abstraction level features [41], meaning it becomes a challenge to understand the extracted features learned by the deep neural network. Thus, to understand the model’s mechanism, feature visualization and analysis techniques were used.

To analyze the decision boundaries of the features extracted by the proposed approach, the t-Distributed Stochastic Neighbor Embedding (t-SNE) technique is used to analyze the multidimensional data [26]. All the disturbances considered are used for feature visualization. The results of the t-SNE-based feature visualization are shown in Fig. 9, in which it is possible to notice that the extracted features promote a decision boundary that facilitates the detection and classification of PQ disturbances.

Conclusion and Future Works

This work proposes an approach based on deep learning to detect and classify PQ disturbances. The proposed approach uses a convolutional neural network with 1D convolution and an LSTM network to extract and select features and to detect and classify disturbances automatically and in a unified way. The method uses as input a data window of one cycle that moves point by point until it covers the whole length of the signal. Even though this makes the problem more complex, since a smaller amount of information about the disturbances is given to the classifier, this approach is more compatible with hardware embedding. Also, different sampling rates were evaluated to test the method’s robustness, with input windows of 16, 32, and 64 samples.
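The one-cycle window that moves point by point can be sketched as follows. This is a minimal illustration, assuming the signal is held as a plain Python list and that a 10-cycle signal at 16 samples/cycle contains 160 samples; the function name is hypothetical and the snippet is not the implementation used in this work.

```python
def sliding_windows(signal, window_size):
    """Yield each window of `window_size` consecutive samples, advancing one sample at a time."""
    for start in range(len(signal) - window_size + 1):
        yield signal[start:start + window_size]


samples_per_cycle = 16
signal = list(range(10 * samples_per_cycle))  # stand-in for 10 cycles of voltage samples
windows = list(sliding_windows(signal, samples_per_cycle))
print(len(windows))  # number of one-cycle windows fed to the classifier
```

Because consecutive windows overlap in all but one sample, every newly acquired point produces a new window to classify, which is what ties the classification time to the acquisition interval.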

From the results obtained, it is possible to note that the proposed methodology can efficiently detect and classify PQ disturbances. However, the performance of the classifier decreases as the noise interference in the input increases. The resulting misclassifications are closely related to the fact that the disturbances present similar features in their waveforms. In spite of that, the methodology performs satisfactorily even when faced with noise that mischaracterizes the signal and with low sampling rates that provide little information about the disturbances.

As future work, the authors intend to integrate the proposed approach with a demand-side management algorithm and embed it in low-cost hardware. In this setup, after a disturbance is detected, the demand response algorithm is notified and measures can be taken to protect the equipment from damage.


References

1. Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, et al. Tensorflow: a system for large-scale machine learning. OSDI. 2016;16:265–83.
2. Atems B, Hotaling C. The effect of renewable and nonrenewable electricity generation on economic growth. Energy Policy. 2018;112:111–8. https://doi.org/10.1016/j.enpol.2017.10.015. http://www.sciencedirect.com/science/article/pii/S0301421517306389.
3. Balouji E, Salor O. Classification of power quality events using deep learning on event images. In: 2017 3rd International conference on pattern recognition and image analysis (IPRIA), 2017. pp. 216–221. IEEE.
4. Borges FA, Fernandes RA, Silva IN, Silva CB. Feature extraction and power quality disturbances classification using smart meters signals. IEEE Trans Ind Inf. 2016;12(2):824–33.
5. Buduma N, Locascio N. Fundamentals of deep learning: designing next-generation machine intelligence algorithms. Massachusetts: O’Reilly Media Inc; 2017.
6. Cho SH, Jang G, Kwon SH. Time-frequency analysis of power-quality disturbances via the gabor-wigner transform. IEEE Trans Power Deliv. 2010;25(1):494–9.
7. Chollet F, et al. Keras: Deep learning library for theano and tensorflow. URL: https://keras.io/k. 2015;7(8)
8. Commission IE, et al. Electromagnetic compatibility (emc)-part 4-30: Testing and measurement techniques-power quality measurement methods. IEC 61000-4-30, 2003
9. Commission IE, et al. Testing and measurement techniques—flickermeter—functional and design specifications. Tech Rep IEC. 2010;61000–4(15):1.
10. Silva IRS, Rabêlo RAL, Rodrigues JJPC, Solic P, Carvalho A. A preference-based demand response mechanism for energy management in a microgrid. J Clean Prod. 2020;255:120034. https://doi.org/10.1016/j.jclepro.2020.120034.
11. Decanini JG, Tonelli-Neto MS, Malange FC, Minussi CR. Detection and classification of voltage disturbances using a fuzzy-artmap-wavelet network. Electr Power Syst Res. 2011;81(12):2057–65.
12. Deng L, Yu D, et al. Deep learning: methods and applications. Found Trends Sig Process. 2014;7(3–4):197–387.
13. Elbasuony GS, Aleem SHA, Ibrahim AM, Sharaf AM. A unified index for power quality evaluation in distributed generation systems. Energy. 2018;149:607–22.
14. Erişti H, Uçar A, Demir Y. Wavelet-based feature extraction and selection for classification of power system disturbances using support vector machines. Electr Power Syst Res. 2010;80(7):743–52.
15. Group IPW, et al. Recommended practice for monitoring electric power quality. Tech Rep 1994; 5
16. Hirsch A, Parag Y, Guerrero J. Microgrids: a review of technologies, key drivers, and outstanding issues. Renew Sustain Energy Rev. 2018;90:402–11.
17. Hooshmand R, Enshaee A. Detection and classification of single and combined power quality disturbances using fuzzy systems oriented by particle swarm optimization algorithm. Electr Power Syst Res. 2010;80(12):1552–611.
18. Jamali S, Farsa AR, Ghaffarzadeh N. Identification of optimal features for fast and accurate classification of power quality disturbances. Measurement. 2018;116:565–74.
19. Kapoor R, Gupta R, Jha S, Kumar R, et al. Detection of power quality event using histogram of oriented gradients and support vector machine. Measurement. 2018;120:52–755.
20. Kingma DP, Ba J. Adam: A method for stochastic optimization. 2014. arXiv preprint arXiv:1412.6980.
21. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436.
22. Lee CY, Shen YX. Optimal feature selection for power-quality disturbances classification. IEEE Trans Power Deliv. 2011;26(4):2342–51.
23. Li X, Chen M, Wang Q. Quantifying and detecting collective motion in crowd scenes. IEEE Trans Image Process. 2020;29:5571–83.
24. Liu H, Hussain F, Shen Y, Arif S, Nazir A, Abubakar M. Complex power quality disturbances classification via curvelet transform and deep learning. Electr Power Syst Res. 2018;163:1–9.
25. Ma J, Zhang J, Xiao L, Chen K, Wu J. Classification of power quality disturbances via deep learning. IETE Tech Rev. 2017;34(4):408–15.
26. Maaten Lvd, Hinton G. Visualizing data using t-sne. J Mach Learn Res. 2008;9:2579–2605.
27. Mahela OP, Shaik AG, Gupta N. A critical review of detection and classification of power quality events. Renew Sustain Energy Rev. 2015;41:495–505.
28. Masoum M, Jamali S, Ghaffarzadeh N. Detection and classification of power quality disturbances using discrete wavelet transform and wavelet networks. IET Sci Measure Technol. 2010;4(4):193–205.
29. Mishra S, Bhende C, Panigrahi B. Detection and classification of power quality disturbances using s-transform and probabilistic neural network. IEEE Trans Power Deliv. 2008;23(1):280–7.
30. Mohammadi M, Afrasiabi M, Afrasiabi S, Parang B. Detection and classification of multiple power quality disturbances based on temporal deep learning. In: 2019 IEEE international conference on environment and electrical engineering and 2019 IEEE industrial and commercial power systems Europe (EEEIC / I CPS Europe), 2019. pp. 1–5
31. Mohan N, Soman K, Vinayakumar R. Deep power: deep learning architectures for power quality disturbances classification. In: 2017 international conference on technological advancements in power and energy (TAP Energy), 2017. pp. 1–6. IEEE.
32. Osinga D. Deep learning cookbook: practical recipes to get started quickly. O’Reilly Media Inc; 2018.
33. Patterson J, Gibson A. Deep learning: a practitioner’s approach. Massachusetts: O’Reilly Media Inc; 2017.
34. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E. Scikit-learn: machine learning in python. J Mach Learn Res. 2011;12:2825–30.
35. Roldán-Blay C, Escrivá-Escrivá G, Roldán-Porta C. Improving the benefits of demand response participation in facilities with distributed energy resources. Energy. 2019;169:710–8.
36. Schmidhuber J. Deep learning in neural networks: an overview. Neural Networks. 2015;61:85–117.
37. Singh U, Singh SN. A new optimal feature selection scheme for classification of power quality disturbances based on ant colony framework. Appl Soft Comput. 2019;74:216–25. https://doi.org/10.1016/j.asoc.2018.10.017. http://www.sciencedirect.com/science/article/pii/S156849461830574X.
38. Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inf Process Manage. 2009;45(4):427–37.
39. Veras J, Silva I, Pinheiro P, Rabêlo R, Veloso A, Borges F, Rodrigues J. A multi-objective demand response optimization model for scheduling loads in a home energy management system. Sensors. 2018;18(10):3207.
40. Wang H, Wang P, Liu T. Power quality disturbance classification using the s-transform and probabilistic neural network. Energies. 2017;10(1):107.
41. Zhang W, Li C, Peng G, Chen Y, Zhang Z. A deep convolutional neural network with new training methods for bearing fault diagnosis under noisy environment and different working load. Mech Syst Sig Process. 2018;100:439–453. https://doi.org/10.1016/j.ymssp.2017.06.022. http://www.sciencedirect.com/science/article/pii/S0888327017303369

Author information



Corresponding author

Correspondence to Ricardo de A. L. Rabelo.

Ethics declarations

Conflict of Interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Rodrigues, W.L., Borges, F.A.S., de Carvalho Filho, A.O. et al. A Deep Learning Approach for the Detection and Classification of Power Quality Disturbances with Windowed Signals. SN COMPUT. SCI. 2, 64 (2021). https://doi.org/10.1007/s42979-020-00435-1


  • Deep learning
  • Long short-term memory
  • Power quality
  • Convolutional neural networks