Analysis of the Efficiency-Energy with Regression and Classification in Household Using K-NN

Sun, Mingxu; Liu, Xiaodong; Mbonihankuye, Scholas

doi:10.1007/978-3-030-24265-7_31

Conference paper
First Online: 11 July 2019

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11633)

Abstract

This paper studies energy consumption in a house. Home energy management systems (HEMS) have become very important because the residential sector accounts for a significant share of total energy consumption. However, a conventional HEMS has architectural limitations in scalability, reusability, and interoperability. Furthermore, the high cost of implementing a HEMS hinders its spread. Therefore, this study proposes an Internet of Things (IoT) based HEMS with a lightweight photovoltaic (PV) system over dynamic home area networks (DHANs), which enables the construction of a HEMS that is scalable, reusable, and interoperable. The study suggests a technique for decreasing the cost of the energy that the HEMS uses and considers various perspectives on the system. The proposed method is K-NN (K-Nearest Neighbor), which helps us analyze classification and regression datasets. The results are based on data collected in October 2018 from some buildings of Nanjing University of Information Science and Technology. That dataset allowed us to analyze the electric energy consumption of each piece of home equipment used and to simulate the energy needed by each appliance. Finally, we identified an algorithm suitable for electric energy efficiency.

1 Introduction

Electric energy plays a very important role in making most of our daily activities efficient. As technology evolves, most daily needs require energy, and energy management, especially reducing electricity consumption and its cost, has become one of the crucial problems facing humanity. Heating, insulation, consumption, and savings are some of the basics to consider when using electrical devices in one's house. It is important to learn how to reduce one's electricity bill in order to live comfortably with less electricity consumption and lower energy cost. Whenever we need to account for energy, we have to rely on Internet access for connected devices and on the rising number of smart appliances in the house. The Internet of Things is very significant for the convenience of energy-using devices in the house [1]. A home energy management system refers to hardware and/or software and algorithms that can provide feedback about our needs at home and/or enable advanced control of energy-using devices in the house (Working Group). Sustainability issues represent the biggest challenges that society is confronted with. At least 83% of the world's energy comes from non-renewable fossil sources, while renewable sources such as solar energy and biomass account for only about 2% of the total [2]. A short range is preferred for energy-efficient data transmission as a result of the nonlinear path loss ratio [3].

In KNN, the distance from each test data point to all of its neighbors is calculated and ranked in ascending order. The top k distances are taken, and the most frequent class in that subset is used to define the class of that data point. When two elements are combined and $ \alpha $ is picked correctly, the final score is assigned so as to minimize the total distance between those two elements, while the series is partitioned into the smallest possible number of clusters [1]. This is repeated for all data points until all have been labelled. Distance is calculated using the Euclidean distance method. For "n" objects to be assigned to "k" clusters, a total of "nk" distance computations will be performed [4].

$$ d = \sqrt {\left( {a_{1} - b_{1} } \right)^{2} + \left( {a_{2} - b_{2} } \right)^{2} } $$

(1)

$$ \text{standardization distance:}\quad X_{z} = \frac{X - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}} $$

(2)
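As a minimal sketch of Eqs. (1) and (2), the following Python functions compute the Euclidean distance between two feature vectors and rescale a list of readings to the range [0, 1]. The function names and the sample power readings are hypothetical and are not taken from the experimental data; the handling of a constant series (all values mapped to 0) is also an assumption.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (Eq. 1)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_scale(values):
    """Rescale values to [0, 1] using (X - Min) / (Max - Min) (Eq. 2)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical power readings in watts, for illustration only.
readings = [5.0, 1200.0, 1800.0, 60.0]
scaled = min_max_scale(readings)
```

Scaling each feature before computing distances keeps a feature with a large numeric range, such as power in watts, from dominating the distance.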

1.1 Scope

We aim to analyze in depth the problems related to home energy management systems and to gain a better understanding of electric energy management from the results of our analysis, relying on energy efficiency with regression and classification in the household.

2 Related Work

We investigate different methods and strategies used in the electric energy sector to decrease the energy consumption of household equipment. Electricity consumption has been steadily increasing since the 1990s, lately emerging as the second most used source of energy with a share of 17.7%, behind only oil with 40.8%. One of the leading factors in this growth in electricity demand is the change in energy consumption habits in domestic environments. In 2010, domestic consumption was responsible for 28% of total electricity consumption among all sectors, with an effective increase of 40% between 1990 and 2010 [5]. It should be mentioned that the potential economies of cooperation in communications have not been exploited in previous works. In the wireless communication literature, putting base stations (BSs) to sleep during periods of light traffic has been the common approach to saving energy [6]. Medasani and Kim show that eliciting membership functions from data is one of the fundamental applications of fuzzy set theory [7]. Jamsandekar and Mudholkar proposed an algorithm based on a fuzzy inference system (FIS) for the classification of electric energy data. In their work, they used a genetic algorithm (GA), an optimal search technique, for rule generation [6]. The Reference Energy Disaggregation Data Set (REDD) [2] is a freely available data set containing power usage information from several houses, which aims to advance research on energy disaggregation. Such data are used to test or implement methods that can be applied to a HEMS in order to reduce energy consumption. To address the storage problem, a stem segmentation technique is used to extract keywords from each document and to build a stem set. It has been shown that this method can achieve higher storage efficiency [8]. There are many methods for processing big data. However, the problem of determining which one is the most efficient for big data analysis remains. Information is transferred over networks every day, so social media accounts must be secured against hackers, and secret messages must be protected so that the information is not damaged in transit [9]. Big data plays an important role in improving our knowledge, but information must be kept private and secure, for example by encrypting information systems using cloud computing techniques [2, 8, 10].

3 Data and Methodology

To conduct our study efficiently, we used data collected at Nanjing University of Information Science and Technology, in room 322 of international students' dormitory number 4. The data concern the energy consumption of devices such as a telephone, a water heater, a hot water dispenser, an air conditioner, and an iron. To study energy consumption in the household, we used K-Nearest Neighbors (KNN), a non-parametric method widely applied in big data analysis. It is used for classification and regression based on the distance between points, which identifies the points that group together. The result obtained depends on the value chosen for k. This method has been used to study how to reduce energy consumption in a household. In KNN, the distance from each test data point to all of its neighbors is calculated and ranked in ascending order. The top k distances are taken, and the most frequent class in that subset is used to define the class of that data point. That step is repeated for all data points until all have been labelled.

Soumadip Ghosh et al. proposed a genetic algorithm approach for finding frequent itemsets that caters to both positive and negative association mining [8]. They applied their GA-based method to the rule mining problem of finding frequent itemsets and found it to be very simple and efficient. Marmelstein explored different ways of combining a genetic algorithm with the K-nearest neighbor algorithm to improve classification accuracy and minimize the training error [9]. Soraj et al. proposed a genetic fuzzy logic method for discovering decision rules in datasets containing categorical and continuous attributes [10]. Liu and Zhu found that between 2010 and 2030 global energy demand is expected to rise to about twice today's level. Given the rate of technological development today, their findings are likely to be correct. To meet this challenge, many modifications should be made to the existing electrical energy system [11]. According to the work of Liu et al., climate change also influences electric energy. Communication networks and data centers are generally known to be the largest power consumers and greenhouse gas (GHG) emitters among ICTs, and they can benefit from smart-grid-driven techniques that enhance energy savings and emission reductions [12] for sustainable development. To address the storage problem, a stem segmentation technique is used to extract keywords from each document and to build a stem set. This method can achieve higher storage efficiency [13].

3.1 Appliance Modeling

Regression models are among the simpler yet powerful analysis methods for understanding relationships in data and generating predictions from them. Regression is normally performed using the least squares method, which fits a 'line of best fit' that minimizes the sum of the squared vertical differences between each point and the line [14]. K-Nearest Neighbors (KNN) is one of the most commonly used algorithms. It is a supervised learning technique in which, given a data point, the algorithm outputs a class membership for that point [13,14,15]. KNN can also be used for identifying outliers in data. Fig. 1 shows a schematic of a home electric energy system composed of several devices. Electricity flows from the sockets into the lights and appliances. Many of the researchers whose work we consulted focus on the important role played by neural networks in feature analysis and extraction for Distant Supervised Relation Extraction (DSRE) [16, 17]. Their reasoning is that such activities depend on a network, which in turn is not possible without electric energy.

3.2 KNN Classification and Regression Work

This method is used for classification and regression on a given dataset, such as in electric energy analysis. The K-Nearest Neighbor (KNN) algorithm supports both classification and regression. In classification, the output is a class membership: an object is classified by a majority vote of its neighbors, with the object assigned to the most common class among its K nearest neighbors. The neighbors can also be used for regression, where the output value is computed from the values of the K nearest neighbors (Fig. 2).

Steps to compute the KNN algorithm:

First, determine the parameter k, the number of nearest neighbors;
Compute the distance between the query and each training pattern;
Sort the distances and determine the K nearest neighbors based on the K-th minimum distance;
Gather the categories of the nearest neighbors;
Use the majority category of the nearest neighbors as the prediction for the query. Weight more similar houses more heavily than less similar ones in the K-NN list.
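The steps above can be sketched in Python as follows. This is an illustrative sketch only and not the MATLAB implementation used in this study; the function name `knn_classify`, the two scaled features, the labels, and the sample values are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k):
    """Predict the class of `query` by majority vote of its k nearest neighbors."""
    # Step 1: k is given; it must not exceed the number of training patterns.
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Step 2: compute the distance from the query to every training pattern.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 3: sort by distance and keep the k smallest.
    distances.sort(key=lambda pair: pair[0])
    # Step 4: gather the categories of the k nearest neighbors.
    nearest_labels = [label for _, label in distances[:k]]
    # Step 5: the majority category is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical scaled features: (mean power, usage duration), both in [0, 1].
train_points = [(0.05, 0.10), (0.10, 0.20), (0.80, 0.90), (0.90, 0.70), (0.85, 0.80)]
train_labels = ["low", "low", "high", "high", "high"]

prediction = knn_classify(train_points, train_labels, (0.75, 0.85), k=3)
```

An odd value of k is a common choice for two-class problems because it avoids tied votes.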

3.3 Algorithm Overview

We use the K-Nearest Neighbor (k-NN) algorithm, a non-parametric method for classification and regression, because it has been widely used by other authors and is considered one of the most important techniques in the analysis of home energy management systems [18,19,20,21].

$$ \hat{Y}_{q} = \frac{{C_{qNN1} Y_{qNN1} + C_{qNN2} Y_{qNN2} + C_{qNN3} Y_{qNN3} + \ldots + C_{qNNK} Y_{qNNK} }}{{\mathop \sum \nolimits_{J = 1}^{K} C_{qNNJ} }} $$

(3)
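Eq. (3) is a weighted average of the target values of the K nearest neighbors. The sketch below illustrates it in Python under one explicit assumption: the weights C are taken to be inverse distances, which the text does not specify. The function name `knn_regress`, the small constant `eps`, and the sample values are hypothetical.

```python
import math


def knn_regress(train_points, train_values, query, k, eps=1e-9):
    """Weighted k-NN regression: sum(C_j * Y_j) / sum(C_j) over the k nearest."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each distance with its target value and keep the k nearest.
    neighbors = sorted(
        ((math.dist(point, query), value)
         for point, value in zip(train_points, train_values)),
        key=lambda pair: pair[0],
    )[:k]
    # Assumption: weight C_j = 1 / (distance + eps), so closer neighbors count more.
    weights = [1.0 / (distance + eps) for distance, _ in neighbors]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, neighbors))
    return weighted_sum / sum(weights)


# Hypothetical one-feature example: usage time (hours) versus energy (Wh).
train_points = [(0.0,), (2.0,), (10.0,)]
train_values = [10.0, 30.0, 100.0]

estimate = knn_regress(train_points, train_values, (1.0,), k=2)
```

With equal weights for all neighbors, the same formula reduces to the plain average of the K nearest target values.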

For static regression modeling, a series of statistical processes is used to estimate the relationships among devices. The K-Nearest Neighbor algorithm classifies a target according to the attributes of its nearest neighbors in the learning set. In the K-NN method, the K nearest neighbors are considered [22,23,24]. The following formulas are used to compute the distance between variables:

$$ Euclidean = \sqrt {\mathop \sum \limits_{i = 1}^{k} \left( {x_{i} - y_{i} } \right)^{2} } $$

(4)

$$ Minkowski = \left( {\mathop \sum \limits_{i = 1}^{k} \left( {\left| {x_{i} - y_{i} } \right|} \right)^{q} } \right)^{1/q} $$

(5)

$$ \text{Hamming distance:}\quad D_{H} = \mathop \sum \limits_{i = 1}^{k} \left| {x_{i} - y_{i} } \right|, \qquad \left\{ {\begin{array}{l} {x_{i} = y_{i} \Rightarrow D = 0} \\ {x_{i} \ne y_{i} \Rightarrow D = 1} \\ \end{array} } \right. $$

(6)
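The distance functions in Eqs. (4) to (6) can be illustrated with a short Python sketch. The function names and the sample vectors are hypothetical. The Minkowski distance with exponent 1/q reduces to the Euclidean distance for q = 2, and the Hamming distance counts the positions at which two categorical vectors differ.

```python
def minkowski(x, y, q):
    """Minkowski distance of order q between two equal-length vectors (Eq. 5)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if q < 1:
        raise ValueError("q must be at least 1")
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


def hamming(x, y):
    """Hamming distance: number of positions where x and y differ (Eq. 6)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)


# q = 2 gives the Euclidean distance of Eq. 4; q = 1 gives the sum of absolute differences.
d_euclidean = minkowski([0.0, 0.0], [3.0, 4.0], q=2)
d_manhattan = minkowski([0.0, 0.0], [3.0, 4.0], q=1)

# Hypothetical on/off states of three appliances in two time slots.
d_states = hamming(["on", "off", "on"], ["on", "on", "on"])
```

The Hamming distance suits categorical attributes such as appliance states, while the Euclidean and Minkowski distances suit continuous measurements such as power.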

In logistic regression, the dependent variable is binary (True/False, Yes/No), as in classification problems. Logistic regression can be used even if the dependent and independent variables in the model do not have a linear relationship (Fig. 3).

4 Validation with Experimental Data from Room 322, #4 Building of NUIST

The dataset used to validate our experiments was collected in person from a resident's room, old dorm room number 322 on the East campus of Nanjing University of Information Science and Technology, over a period of 3 days from October 21st to 24th, 2018. We analyzed different household equipment and studied which devices have high energy consumption.

4.1 Result Found from Experiment

Most of our experiments were carried out using MATLAB. Thus, all figures in this study were obtained by analyzing the electric energy consumption data in MATLAB. The electricity data were downloaded from different devices and from different sockets (Fig. 4).

Fig. 5 shows the variation of power consumption over each 10 min period. From 12:08 to 12:10 the power consumption is very high. The question is why the energy consumption suddenly rose. We noticed that when someone is heating water or cooking, the energy consumption becomes very high as the process nears completion. After 12:10, a sudden decrease can be seen, like a latent period, with consumption resuming around 12:20. During that period, the power consumption became very high. The energy consumption follows a sinusoidal curve. Fig. 5 is based on the data collected on October 24th, 2018 in the room on the East campus of Nanjing University of Information Science and Technology.

In Fig. 6, the variation starts after 10 min. The data were collected on October 24th, 2018. From the starting time of 12:00, the variation starts after 10 min, and we observe a sequence of increases and decreases in energy consumption. The energy consumption of the water heater differs from that of the previous device. However, at the end of the collection period, the curve of variation remains at the same temperature level.

Fig. 7 shows a histogram obtained while measuring the energy consumption of the water heater. Three phases can be identified. At the beginning, the power consumption is low, but at around 40% of the heating process the energy consumption becomes very high and then decreases for a certain latent period. The energy consumption increases again at the end of the heating process, when the water is at around 85% of its boiling level. At the end of the boiling phase there is no more power consumption.

4.2 Simulation and Comparison

In any analysis of a home energy management system, it is necessary to simulate the data of a given house so that the results can be extended to other houses, in order to make a general and forward-looking analysis and establish some assumptions. The simulation model was run in MATLAB. The data collected from the model during real-time execution or normal simulation were stored in a variable and produced the signal plots that are the result of our simulation. The MATLAB workspace provides functions that are useful for analyzing data, for plotting and comparing variations, and for visualization. Fig. 8 shows the simulation output.

5 Conclusion

Energy consumption and energy storage are among the vital issues for human society and industry. For states, energy independence is strategically and economically essential. For individuals and businesses, energy must be available on demand, without any sudden interruption. Any breakdown of the energy supply has a high economic and social cost and a negative impact in terms of health and safety. The Energy Storage Service will help to uncover the revenue streams and business opportunities most relevant to any project. It provides a foundation in the economics, market landscape, and technology advancements that is essential to formulating innovative strategies in the energy storage market. We have shown that energy consumption depends on the categories of electric devices and appliances, and on the allocation and number of pieces of electric equipment in the house. The variation in energy consumption also depends on how the equipment is used. Some homes use more household electric equipment, while others use less. We have shown, for example, that the telephone and the water heater do not consume the same amount of energy.

References

1. Kamoto, K.M., Liu, Q., Liu, X.: Unsupervised energy disaggregation of home appliances. In: Sun, X., Chao, H.-C., You, X., Bertino, E. (eds.) ICCCS 2017. LNCS, vol. 10602, pp. 398–409. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68505-2_34
2. Kolter, J.Z., Johnson, M.J.: REDD: a public data set for energy disaggregation research. In: SustKDD Work, CA, USA, pp. 1–4 (2011)
3. Singh, P., Yadav, R.: Energy efficient and delay based on PSO with LST algorithm for wireless sensor network. In: International Conference on Computational Science and Engineering, vol. 6, pp. 2094–2100 (2016)
4. Paper, C.: Extensive survey on k-means clustering using mapreduce in datamining. In: International Conference on Electronics and Communication Systems, pp. 2–5 (2016)
5. Pereira, L., Quintal, F., Goncalves, R., Nunes, N.J.: SustData: a public dataset for ICT4S electric energy research. In: Proceedings of the 2014 Conference ICT for Sustainability, pp. 359–368 (2014)
6. Jamsandekar, S.S., Mudholkar, R.R.: Fuzzy inference rule generation using genetic algorithm variant. IOSR J. Comput. Eng. 17, 9–16 (2015)
7. Medasani, S., Kim, J.: An overview of membership function generation techniques for pattern recognition. Int. J. Approx. Reason. 19, 391–417 (1998)
8. Liu, Y., Peng, H., Wang, J.: Verifiable diversity ranking search over encrypted outsourced data. Comput. Mater. Contin. 55, 37–57 (2018)
9. Meng, R., Rice, S.G., Wang, J., Sun, X.: A fusion steganographic algorithm based on faster R-CNN. Comput. Mater. Contin. 55(1), 1–16 (2018)
10. Wu, C., Zapevalova, E., Chen, Y., Li, F.: Time optimization of multiple knowledge transfers in the big data environment. Comput. Mater. Contin. 54(3), 269–285 (2018)
11. Ghosh, S., Biswas, S., Sarkar, D., Sarkar, P.P.: Mining frequent itemsets using genetic algorithm. Int. J. Artif. Intell. Appl. 1(4), 133–143 (2010)
12. Marmelstein, E.: Application of genetic algorithms to data mining. In: MAICS-97 Proceedings, pp. 53–57 (1997)
13. Prabhat, N.: A genetic-fuzzy algorithm to discover fuzzy classification rules for mixed attributes datasets. Int. J. Comput. Appl. 34, 15–22 (2011)
14. Liu, Y., Qiu, B., Fan, X., Zhu, H., Han, B.: Review of smart home energy management systems. Energy Procedia 104, 504–508 (2016)
15. Erol-Kantarci, M., Mouftah, H.T.: Energy-efficient information and communication infrastructures in the smart grid: a survey on interactions and open issues. IEEE Commun. Surv. Tutor. 17, 179–197 (2015)
16. Rogozhnikov, A.: Machine Learning in High Energy Physics. University of Landon, Queen Mary (2015)
17. Wang, Z., Ling, C.: On the geometric ergodicity of metropolis-hastings algorithms for lattice Gaussian sampling. IEEE Trans. Inf. Theory 64, 738–751 (2018)
18. Li, D., Zhang, G., Xu, Z., Lan, Y., Shi, Y.: Modelling the roles of cewebrity trust and platform trust in consumers’ propensity of live-streaming: an extended TAM method. Comput. Mater. Contin. 55, 137–150 (2018)
19. Zeng, D., Dai, Y., Li, F., Sherratt, R.S., Wang, J.: Adversarial learning for distant supervised relation extraction. Comput. Mater. Contin. 55, 121–136 (2018)
20. Kornaropoulos, E.M., Tsakalides, P.: A novel kNN classifier for acoustic vehicle classification based on alpha-stable statistical modeling. In: IEEE Workshop on Statistical Signal Processing Proceedings, pp. 1–4 (2009)
21. Chen, Q., Li, D., Tang, C.K.: KNN matting. IEEE Trans. Pattern Anal. Mach. Intell. 35(9), 2175–2188 (2013)
22. Bhatia, N., Vandana: Survey of nearest neighbor techniques. (IJCSIS) Int. J. Comput. Sci. Inf. Secur. 8(2), 302–305 (2010)
23. García, C., Gómez, I.: Algoritmos de aprendizaje: knn & kmeans. Univ. Carlos III Madrid, pp. 1–8 (2006)
24. Zhang, M.L., Zhou, Z.H.: ML-KNN: a lazy learning approach to multi-label learning. Pattern Recognit. 40(7), 2038–2048 (2007)
25. Parry, R.M., et al.: k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction. Pharmacogenomics J. 10(4), 292–309 (2010)

Acknowledgement

This work has received funding from the European Union Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement no. 701697, Major Program of the National Social Science Fund of China (Grant No. 17ZDA092), Basic Research Programs (Natural Science Foundation) of Jiangsu Province (BK20180794), 333 High-Level Talent Cultivation Project of Jiangsu Province (BRA2018332) and the PAPD fund.

Author information

Authors and Affiliations

School of Electrical Engineering, University of Jinan, Jinan, China
Mingxu Sun
School of Computing, Edinburgh Napier University, 10 Colinton Road, Edinburgh, EH10 5DT, UK
Xiaodong Liu
Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science and Technology, Nanjing, 210044, China
Scholas Mbonihankuye
School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, 210044, China
Scholas Mbonihankuye

Corresponding author

Correspondence to Scholas Mbonihankuye.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Xingming Sun
Nanjing University of Information Science and Technology, Nanjing, China
Zhaoqing Pan
Purdue University, West Lafayette, IN, USA
Elisa Bertino

About this paper

Cite this paper

Sun, M., Liu, X., Mbonihankuye, S. (2019). Analysis of the Efficiency-Energy with Regression and Classification in Household Using K-NN. In: Sun, X., Pan, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2019. Lecture Notes in Computer Science, vol 11633. Springer, Cham. https://doi.org/10.1007/978-3-030-24265-7_31

DOI: https://doi.org/10.1007/978-3-030-24265-7_31
Published: 11 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24264-0
Online ISBN: 978-3-030-24265-7
eBook Packages: Computer Science, Computer Science (R0)
