1 Introduction

Electric energy plays a very important role in our lives and makes most of our activities more efficient. As technology evolves, most daily needs require energy, so energy management, and especially reducing electricity consumption and its cost, has become one of the crucial problems facing humanity. Heating, insulation, consumption and savings are some of the basics to consider when installing electrical devices in a house. It is important to learn how to reduce the electricity bill so that a comfortable life can be maintained with less consumption and lower cost. Whenever energy is needed, we increasingly rely on Internet of Things (IoT) devices, and the number of smart devices in the house keeps rising. The Internet of Things is therefore very significant for the convenient use of devices in the home [1]. A home energy management system (HEMS) refers to hardware, software and/or algorithms that can provide feedback about energy needs at home and/or enable advanced control of energy-using devices in the house (Working Group). Sustainability issues are among the biggest challenges confronting society. Fossil (non-renewable) fuels supply at least 83% of the world's energy, whereas renewable sources such as solar energy and biomass together account for only about 2% of the total [2]. On the one hand, a short range is preferred for energy-efficient data transmission because of the nonlinear path loss [3].

In KNN, the distance from each test data point to all of its neighbors is calculated and ranked in ascending order. The top k distances are taken, and the most frequent class in that subset is used to define the class of the data point.
When the two elements are combined and \( \alpha \) is chosen correctly, the final score minimizes the total distance between those two elements, while the series is partitioned into the smallest possible number of clusters [1]. This is repeated for all data points until all have been labelled. Distance is calculated using the Euclidean distance (Eq. 1), and values are standardized to a common range (Eq. 2). For n objects to be assigned to k clusters, a total of nk distance computations are performed [4].

$$ d = \sqrt {\left( {a_{1} - b_{1} } \right)^{2} + \left( {a_{2} - b_{2} } \right)^{2} } $$
(1)
$$ Standardization{:}\, X_{z} = \frac{X - Min}{Max - Min} $$
(2)
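As a minimal sketch of Eqs. (1) and (2), the following Python snippet computes the Euclidean distance between two 2-D points, with the square root taken over the sum of both squared differences, and rescales a list of values to the range [0, 1]. The function names and the sample readings are illustrative assumptions and are not taken from the study's data.

```python
import math


def euclidean_2d(a, b):
    """Eq. (1): straight-line distance between two 2-D points a and b."""
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)


def min_max_scale(values):
    """Eq. (2): rescale each value X to (X - Min) / (Max - Min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so the range is zero; map everything to 0.0.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical power readings in watts (made-up numbers, not measurements).
readings = [50.0, 1200.0, 625.0]
scaled = min_max_scale(readings)
distance = euclidean_2d((0.0, 0.0), (3.0, 4.0))
```

Scaling before computing distances keeps a feature with a large numeric range, such as power in watts, from dominating features with a small range.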

1.1 Scope

We aim to analyze in depth the problems related to home energy management systems and, based on the results of this analysis, to gain a better understanding of electric energy management, relying on regression and classification methods for energy efficiency in the household.

2 Related Work

Different methods and strategies used in the electric energy sector to decrease the energy consumption of household equipment have been investigated. Electricity consumption has been steadily increasing since the 1990s, lately emerging as the second most used source of energy with a share of 17.7%, behind only oil with 40.8%. One of the leading factors for this growth in electricity demand is the change in energy consumption habits in domestic environments. In 2010, domestic consumption was responsible for 28% of the total electricity consumption among all sectors, with an effective increase of 40% between 1990 and 2010 [5]. It is worth mentioning that the potential energy savings from cooperation in communications have not been exploited in previous works. In the wireless communication literature, putting base stations (BSs) to sleep during periods of light traffic has been the common approach to saving energy [6]. Medasani and Kim show that eliciting membership functions from data is one of the fundamental applications of fuzzy set theory [7]. Jamsandekar and Mudholkar proposed a fuzzy inference system (FIS) for the classification of electric energy data. In their work, they used a genetic algorithm (GA), an optimal search technique, to generate rules [6]. The Reference Energy Disaggregation Data Set (REDD) [2] is a freely available data set containing power usage information for several houses, which aims to advance research on energy disaggregation. These kinds of data are used to test or implement methods that can be applied to HEMS in order to reduce energy consumption. To address the storage problem, a stem segmentation technique is used to extract keywords from each document and to build a stem set. It has been shown that this method can achieve higher storage efficiency [8]. There are many methods for processing big data.
However, identifying which one is the most efficient for big data analysis remains a problem. Everyday information transfer relies on networks, so accounts and messages must be secured against attackers, and the information must not be damaged in transit [9]. Big data plays an important role in improving our knowledge, but information must be kept private and secure, for example by encrypting information systems that use cloud computing [2, 8, 10].

3 Data and Methodology

To conduct our study, we used data collected at Nanjing University of Information Science and Technology, in room 322 of international students' dormitory number 4. The data concern the energy consumption of devices such as a telephone, a water heater, a hot-water dispenser, an air conditioner and an iron. To study energy consumption in the household, we used K-Nearest Neighbors (KNN), a non-parametric method widely used in big data analysis. It is used for classification and regression and relies on the distance between data points. The result depends on the value chosen for k. This method has been used to study how to reduce energy consumption in a household. In KNN, the distance from each test data point to all of its neighbors is calculated and ranked in ascending order. The top k distances are taken, and the most frequent class in that subset is used to define the class of the data point. This step is repeated for all data points until all have been labelled.

Soumadip Ghosh et al. proposed a genetic algorithm approach for finding frequent item sets that supports both positive and negative association mining [8]. They addressed the rule mining problem of finding frequent item sets with their GA-based method and found it to be simple and efficient. Marmelstein explored different ways of combining a genetic algorithm with the K-nearest neighbor algorithm to improve classification accuracy and minimize the training error [9]. Soraj et al. proposed genetic fuzzy logic methods for discovering decision rules in datasets containing categorical and continuous attributes [10]. Liu and Zhu estimated that, between 2010 and 2030, global energy demand could grow to twice the level known today, which makes energy saving increasingly important. Given their analysis and the current pace of technological development, their findings appear plausible. To address these challenges, many modifications to the existing electrical energy system are needed [11]. According to the findings of Liu et al., climate change has an influence on electric energy. Communication networks and data centers, which are known to be the largest power consumers and greenhouse gas (GHG) emitters among ICTs, can benefit from smart-grid-driven techniques to enhance energy savings and emission reductions for sustainable development [12]. To address the storage problem, a stem segmentation technique is used to extract keywords from each document and to build a stem set. This method can achieve higher storage efficiency [13].

3.1 Appliance Modeling

Regression models are among the simpler yet powerful analysis methods for understanding relationships in data and generating predictions from them. Regression is normally done using the least squares method, which fits a line of best fit that minimizes the sum of the squared vertical differences between each point and the line [14]. K-Nearest Neighbors (KNN) is one of the most commonly used algorithms. It is a supervised learning technique: given a data point, the algorithm outputs a class membership for that point [13,14,15]. KNN can also be used to identify outliers in data. Fig. 1 shows a system composed of many devices that draw electric energy. Electricity flows to the lights and appliances, from the sockets to our devices. Fig. 1 is our schematic of a home electrical system with some of its devices. Many of the researchers whose work we consulted focus on the important role played by neural networks in feature analysis and extraction for distant supervised relation extraction (DSRE). Their reasoning also applies to electric energy, where a network is needed for these activities to be possible [16, 17].
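The least squares fit described in this subsection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_line` and the sample values (minutes of use against energy in watt-hours) are hypothetical and do not come from the measurements of this study.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept.

    Minimizes the sum of squared vertical differences between the
    points (xs[i], ys[i]) and the fitted line.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: minutes of use and consumed energy (Wh), lying on y = 2x + 1.
minutes = [0.0, 10.0, 20.0, 30.0]
energy_wh = [1.0, 21.0, 41.0, 61.0]
slope, intercept = fit_line(minutes, energy_wh)
```

With real measurements the points will not lie exactly on a line, and the fitted slope then gives the average change in consumption per unit of the explanatory variable.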

Fig. 1. Schematic layout of the house energy system

3.2 KNN Classification and Regression Work

This method is used for classification and regression on a given dataset, such as electric energy data. The K-Nearest Neighbor (KNN) algorithm supports both tasks. In classification, the output is a class membership: an object is classified by a majority vote of its neighbors and is assigned to the most common class among its K nearest neighbors. The neighbors can also be used for regression, in which case the output value is computed from the values of the K nearest neighbors (Fig. 2).

Fig. 2. Classification and regression

Steps to compute the KNN algorithm:

  • First, determine the parameter k, which is the number of nearest neighbors;

  • Compute the distance between the query pattern and every training pattern;

  • Sort the distances and determine the k nearest neighbors based on the minimum distance;

  • Gather the categories of those nearest neighbors;

  • The majority category among the nearest neighbors becomes the predicted value for the query. In the list of k nearest neighbors, more similar houses are weighted more heavily than less similar ones.
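The steps above can be sketched as follows. This is a minimal, unweighted illustration, not the implementation used in the study: the function name `knn_classify`, the two features (mean power in watts, duration in minutes) and the labelled samples are all hypothetical.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Compute the distance from the query to every training pattern.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort in ascending order and keep the k smallest distances.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Gather the categories of the nearest neighbors and take the majority.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical samples: (mean power in W, duration in min) with a usage label.
points = [(5.0, 60.0), (6.0, 55.0), (1500.0, 5.0), (1400.0, 6.0)]
labels = ["low", "low", "high", "high"]
predicted = knn_classify(points, labels, query=(1300.0, 7.0), k=3)
```

In practice the features would usually be standardized with Eq. (2) first, so that the feature with the largest numeric range does not dominate the distance.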

3.3 Algorithm Overview

We use the K-Nearest Neighbor (k-NN) algorithm, a non-parametric method for classification and regression, because it has been widely used by other authors and is considered one of the most important techniques in home energy management system analysis [18,19,20,21].

$$ \hat{Y}_{q} = \frac{{C_{qNN1} Y_{qNN1} + C_{qNN2} Y_{qNN2} + C_{qNN3} Y_{qNN3} + \ldots + C_{qNNK} Y_{qNNK} }}{{\mathop \sum \nolimits_{J = 1}^{K} C_{qNNJ} }} $$
(3)
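Eq. (3) is a weighted average of the values of the K nearest neighbors of a query point q. The sketch below computes it in Python. The text does not specify how the weights C are chosen, so the inverse-distance weighting shown here is an assumption made only for illustration, as are the function names and the sample numbers.

```python
def weighted_knn_estimate(weights, values):
    """Eq. (3): sum of C_j * Y_j over the K neighbors, divided by the sum of C_j."""
    if len(weights) != len(values) or not weights:
        raise ValueError("weights and values must be non-empty and of equal length")
    total = sum(weights)
    if total == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(c * y for c, y in zip(weights, values)) / total


def inverse_distance_weights(distances, eps=1e-9):
    """Assumed weighting scheme (not from the paper): closer neighbors weigh more."""
    return [1.0 / (d + eps) for d in distances]


# Hypothetical neighbor values (e.g. consumption in Wh) and weights.
neighbor_values = [100.0, 200.0, 300.0]
neighbor_weights = [2.0, 1.0, 1.0]
estimate = weighted_knn_estimate(neighbor_weights, neighbor_values)
```

With equal weights the estimate reduces to the plain average of the K neighbor values.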

For static regression modeling, a series of statistical processes is used to estimate the relationship among devices. The K-Nearest Neighbor algorithm classifies a target according to the attributes of its nearest neighbors in the learning set. In the K-NN method, the K nearest neighbors are considered [22,23,24]. The following formulas are used to calculate the distance between two points:

$$ Euclidean = \sqrt {\mathop \sum \limits_{i = 1}^{k} \left( {x_{i} - y_{i} } \right)^{2} } $$
(4)
$$ Minkowski = \left( {\mathop \sum \limits_{i = 1}^{k} \left( {\left| {x_{i} - y_{i} } \right|} \right)^{q} } \right)^{1/q} $$
(5)
$$ Hamming{:}\, D_{H} = \mathop \sum \limits_{i = 1}^{k} \left| {x_{i} - y_{i} } \right|,\quad \left| {x_{i} - y_{i} } \right| = \left\{ {\begin{array}{*{20}c} {0,\; x_{i} = y_{i} } \\ {1,\; x_{i} \ne y_{i} } \\ \end{array} } \right. $$
(6)
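The three distance functions of Eqs. (4) to (6) can be sketched in Python as follows, taking the outer exponent in Eq. (5) to be 1/q. The function names and the sample vectors are illustrative assumptions.

```python
def _check_lengths(x, y):
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of components")


def minkowski(x, y, q):
    """Eq. (5): (sum of |x_i - y_i|^q) raised to the power 1/q."""
    _check_lengths(x, y)
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


def euclidean(x, y):
    """Eq. (4): the Minkowski distance with q = 2."""
    return minkowski(x, y, 2)


def hamming(x, y):
    """Eq. (6): number of positions at which x and y differ."""
    _check_lengths(x, y)
    return sum(1 for a, b in zip(x, y) if a != b)


# Illustrative vectors: numeric features and categorical on/off states.
d_euclid = euclidean((0.0, 0.0), (3.0, 4.0))
d_manhattan = minkowski((0.0, 0.0), (3.0, 4.0), 1)
d_hamming = hamming(("on", "off", "on"), ("on", "on", "on"))
```

The Euclidean and Minkowski distances suit continuous attributes such as power, while the Hamming distance suits categorical attributes such as device state.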

In logistic regression, the dependent variable is binary (True/False, Yes/No), as in classification problems. Logistic regression can be used even if the dependent and independent variables in the model do not have a linear relationship (Fig. 3).

Fig. 3. Classification of the regression problem
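To illustrate how logistic regression produces a binary outcome, the sketch below passes a linear score through the logistic (sigmoid) function and thresholds the resulting probability. The weight, bias and threshold values are hypothetical and were not fitted to the data of this study.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real score z to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_high_consumption(power_w, weight, bias, threshold=0.5):
    """Return (probability, decision) for the binary outcome 'high consumption'."""
    probability = sigmoid(weight * power_w + bias)
    return probability, probability >= threshold


# Hypothetical coefficients; a real model would estimate them from labelled data.
WEIGHT, BIAS = 0.01, -5.0
p_kettle, kettle_is_high = predict_high_consumption(1500.0, WEIGHT, BIAS)
p_phone, phone_is_high = predict_high_consumption(100.0, WEIGHT, BIAS)
```

The output probability is a nonlinear function of the input, which is why the dependent variable need not be linearly related to the independent variables.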

4 Validation with Experimental Data from Room 322, #4 Building of NUIST

The dataset used to validate our experiments was collected by the authors in a student room on the East campus of Nanjing University of Information Science and Technology, namely room 322 of the old dormitory, over a period of about 3 days from October 21st to 24th, 2018. We analyzed different household equipment and studied which devices have high energy consumption.

4.1 Result Found from Experiment

Most of our experiments were carried out using MATLAB. All figures in this study were therefore obtained by analyzing the electric energy consumption data in MATLAB. The electricity data were recorded from different devices and from different sockets (Fig. 4).

Fig. 4. Curve of power data

Fig. 5 shows the variation of power consumption within each 10 min period. From 12:08 to 12:10 the power consumption is very high, which raises the question of why the energy consumption suddenly rose. We noticed that, when someone is heating water or cooking, the energy consumption becomes very high as the process nears completion. After 12:10, a sudden decrease can be seen, like a latent period, and consumption resumed around 12:20, when the power consumption again became very high. The energy consumption follows a sinusoidal curve. Fig. 5 shows the result obtained from the data collected on October 24th, 2018 in the room on the East campus of Nanjing University of Information Science and Technology.

Fig. 5. Curve of power data 2

The data in Fig. 6 were collected on October 24th, 2018. Starting from 12:00, the variation begins after 10 min, and we observe a sequence of increases and decreases in energy consumption. It can be seen that the energy consumption of the water heater differs from that of the previous device. However, at the end of the collection period, the figure shows that the curve of variation remains at the same temperature level.

Fig. 6. Curve of power data 3 (water heater)

Fig. 7 shows a histogram obtained while measuring the energy consumption of the water heater. Three phases can be identified. At the beginning, the power consumption is low, but at around 40% of the heating process the energy consumption becomes very high and then decreases for a certain latent period. The energy consumption increases again at the end of the heating process, when the water is at around 85% of its boiling level. At the end of the boiling phase there is no more power consumption.

Fig. 7. Histogram of the water heater

4.2 Simulation and Comparison

In any home energy management system analysis, the data for a given house must be simulated so that the results can be extended to other houses, allowing a general, forward-looking analysis and the formulation of assumptions. The model was simulated with MATLAB. The data collected from the model during real-time execution or normal simulation were stored in a variable and produced the stream signal plots that constitute our simulation results. The MATLAB workspace provides functions that are useful for analyzing data and for plotting, both to compare variations and to visualize results. Fig. 8 shows the simulation output.

Fig. 8. Stream signal from Simulink

5 Conclusion

Energy consumption and energy storage are among the vital issues for human society and industry. For states, energy independence is strategic and economically essential. For individuals and businesses, energy must be available on demand, without any sudden interruption. Any breakdown of the energy supply has a high economic and social cost and a negative impact in terms of health and safety. The Energy Storage Service will help to uncover the revenue streams and business opportunities most relevant to any project. It provides a foundation in the economics, market landscape and technology advancements that is essential to formulating innovative strategies in the energy storage market. It has been demonstrated that energy consumption depends on the categories of electric devices and appliances, and on the allocation and number of electric devices in the house. The variation in energy consumption also depends on the way the equipment is used. Some homes use more household electric equipment, while others use little. We have shown, for example, that the telephone and the water heater do not consume the same amount of energy.