1 Introduction

A wireless sensor network (WSN) is a wireless network consisting of an interconnected set of sensor nodes. Each sensor node consists of a transceiver, a battery, a small memory and a low-capacity processor. The low cost of sensor nodes and the ease of deploying them have led to the use of WSNs in applications such as military activities, health care, disaster management and traffic analysis. In such remote monitoring systems, a large number of sensor nodes are randomly deployed in remote places, so more than one node often senses the same event. All these nodes try to send redundant data to the server over multiple paths, which drains a large amount of energy. Energy preservation is therefore a primary requirement for the nodes in such a system. Flat routing schemes tend to produce excessive data redundancy and hence lead to poor network lifetime. A hierarchical, cluster-based approach is considered here to enhance network performance and save the battery power of sensor nodes. In a cluster, the node whose priority is higher than that of the other nodes is called the dominating node or cluster head. The cluster head gathers data from the other member nodes of the cluster, removes redundant data and sends the aggregated data to the sink node.

The locations of the nodes must be known in order to form the clusters. The Global Positioning System (GPS) can be used to find the positions of the nodes, but its high deployment cost and high energy consumption have led researchers to design other localization algorithms. DVHop [24] is one such algorithm. It can be used to find the locations of the sensor nodes and thus to create the clusters according to those locations. The cluster head then has to forward the aggregated data to the sink node to report a specific event. Greedy forwarding [12] is a technique by which aggregated data can be forwarded from the cluster head to the sink node.

In the greedy forwarding technique, the next hop is selected by a locally optimal greedy choice at the forwarding cluster head: the neighbour that is geographically closest to the packet destination becomes the next hop node. Figure 1 shows an example of greedy next-hop choice.

Fig. 1. Greedy forwarding example

In Fig. 1, node X and node D are the cluster head and the sink node, respectively. Suppose that, among the neighbors of X, node Y has the smallest distance to D. The packet is then forwarded from X to Y, and the process repeats until the packet reaches D.

Fig. 2 illustrates the problem that arises with the greedy forwarding algorithm. Node x is closer to D than its two neighbors w and y. Hence x cannot choose either of the paths (x→y→z→D) or (x→w→v→D), and it cannot send the packet directly to D because D is outside the radio range of x. There are no neighbors in the intersection of x’s circular radio range and the circle about D of radius |xD|. This shaded region in Fig. 2 is termed a void, and the resulting packet drop at node x is known as the void problem.

Fig. 2. Void problem

In this paper, the proposed Localization based Anti-Void Clustering Approach (LAVCA) for Energy Efficient Routing in WSN uses the DVHop technique to find the locations of the nodes, minimizes the error of the calculated locations, and forms clusters with the help of those locations. This saves a considerable amount of energy. The data is sent to the sink node using the greedy forwarding technique. The void problem is mitigated using the rolling ball technique, which reduces data loss and increases the delivery rate.

The rest of the paper is organized as follows: Sect. 2 describes related work, Sect. 3 presents the proposed methodology, Sect. 4 shows the simulation results and Sect. 5 concludes the paper, followed by the references.

2 Related Work

WSNs have been deployed in remote areas for many applications. In remote areas, recharging batteries is almost impossible, so energy saving is the major concern for researchers in this field. Other issues related to WSNs are security [3, 18], sensor localization [4], network lifetime [5] and sink mobility [7].

Grouping nodes is very important to reduce data redundancy and thus avoid wasting energy. Hierarchical or cluster-based routing schemes [8, 10, 14, 15] are well-known techniques for grouping nodes into multiple clusters. Depending on the nature of the sensors constituting the network, hierarchical clustering may be homogeneous or heterogeneous. A homogeneous network consists of sensor nodes of the same type. Routing protocols in this group include LEACH [6], EECA [9] and HEED [23]. A heterogeneous network has two types of sensor nodes: nodes with normal resources and nodes with richer resources in terms of battery power and memory. EEHC [13] and DEEC [21] are two examples of heterogeneous schemes. In general, cluster-based routing tends to suffer from excessive computational overhead due to frequent cluster head updates. The locations of the nodes are required to form the clusters. The Global Positioning System (GPS) is one option for localizing nodes. However, a WSN normally contains a huge number of sensor nodes, so using GPS in every node is very expensive and increases energy consumption. Designing a localization algorithm with low computational cost, limited power consumption and modest hardware requirements is therefore a challenging task for WSNs [16]. Two types of localization algorithms have been proposed: range-based and range-free algorithms.

Range-based algorithms [11] use distance estimation information for localization. Their localization accuracy is higher, but the additional hardware needed to measure distances increases the deployment cost for large-scale networks. Range-free algorithms, in contrast, do not require orientation information or the distance between nodes. Many range-free algorithms, such as Centroid, Amorphous, Approximate Point-In-Triangle test and distance vector hop (DV-Hop) [24], have been designed for cost effectiveness and simplicity. Its good coverage quality and feasibility make DV-Hop the most popular of these algorithms.

Once the locations of the nodes have been calculated and the clusters have been formed, the aggregated data must be sent to the sink node. The greedy forwarding (GF) algorithm [12] is one of the well-known algorithms for sending data from the cluster head to the sink node. In this algorithm, the forwarding node forwards the packet via a one-hop neighbor [2], and the process is repeated until the destination is reached. This technique incurs no additional overhead and has been shown to reduce energy consumption efficiently. However, the local minimum or void problem [12] may arise. The void problem refers to the situation where a node cannot forward the data packet to a next hop because no neighbor is closer to the destination node than the node itself. This problem may create a black hole within the network, causing packet drops and a large amount of energy drainage.

Routing algorithms designed to resolve the void problem fall into two groups: non-graph-based schemes [17] and graph-based schemes [12, 19, 22]. The authors of [20] introduce the BOUNDHOLE algorithm to detect holes and find an alternative route to the destination. This algorithm identifies the boundary of the holes and routes the packets according to the greedy forwarding method [12]. Its major problem is false boundary detection, which increases the probability of falling into a loop. This may lead to a longer routing delay and a large waste of energy, degrading performance. The false boundary detection problem of the BOUNDHOLE approach is reduced by the Greedy anti-void routing (GAR) protocol [12], which introduces a rolling ball method. The rolling ball is hinged at the node affected by the void problem and rotates anti-clockwise with radius R/2. The node that is closer to the destination node and is intersected by the rolling ball first becomes the next hop node. The process repeats until the data packet reaches the destination node. GAR performs better than BOUNDHOLE, but because it visits unnecessary nodes, it still causes higher energy consumption.

It is clear from this survey of the state of the art that many algorithms have been designed to implement energy efficient routing with the help of a cluster-based approach. In a WSN, the dense deployment of sensor nodes increases data redundancy, which wastes a large amount of energy. Grouping the nodes of a particular region into a cluster and sending all the sensed data through the cluster head to the sink node is therefore a better option for energy efficiency. The positions of the nodes must be known in order to form the clusters. Although GPS can help to find the positions, it is economically difficult to attach a GPS receiver to every sensor node. GPS-based positioning is also infeasible in remote places with coverage problems. Thus some localization technique is required to determine the positions of the sensor nodes. The novelty of this paper is that the locations of the nodes are calculated without using GPS in order to form the clusters, which reduces the deployment cost as well as the overhead. In addition, event-based clustering is used instead of clustering the entire network, which avoids a large amount of energy waste.

3 Proposed Work

The proposed approach uses the DVHop-based localization technique [24] to determine the positions of the sensor nodes, then creates clusters with the help of those positions and chooses the cluster heads. The greedy forwarding method is used to forward the aggregated data to the sink node, and the rolling ball technique is used to avoid the void problem. The module-wise description is given in the following subsections.

3.1 Location Discovery of Individual Nodes

The sensor nodes are deployed in such a way that only a few nodes are equipped with the Global Positioning System (GPS), while the rest are not. The GPS-enabled nodes are aware of their locations and are called anchor nodes. The nodes without GPS use the DVHop [24] technique to find their locations with the help of the anchor nodes. It is economically difficult to attach a GPS receiver to every sensor node, and broadcasting position information from all the nodes also depletes a large amount of energy. Thus, to minimize energy consumption during location discovery, the DVHop technique is used for most of the nodes. The proposed logic calculates the location of a node in three steps.

In the first step, as in DVHop, the anchor nodes broadcast beacon packets containing their location and a hop count. Each node maintains a table entry (xi, yi, hopi) for every anchor node, where (xi, yi) is the position of the ith anchor node and hopi is the minimum number of hops from it. If multiple packets are received, the smallest hop count to a particular anchor node is kept in the table. This mechanism lets every node in the network obtain the minimum hop count from every anchor node.
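The outcome of this first step can be sketched in Python as follows. This is a minimal, centralized illustration rather than the protocol itself: the beacon flood is modeled as a breadth-first search over a known adjacency list, and the function names (`min_hop_counts`, `build_hop_tables`) and the five-node chain topology are assumptions made for the example.

```python
from collections import deque


def min_hop_counts(adjacency, anchor):
    """Breadth-first flood from one anchor; returns {node: minimum hop count}."""
    hops = {anchor: 0}
    queue = deque([anchor])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            # BFS reaches every node first along a shortest path, which mirrors
            # "keep only the smallest hop count heard for this anchor".
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


def build_hop_tables(adjacency, anchor_positions):
    """For every node, build the table {anchor_id: (x_i, y_i, hop_i)}."""
    tables = {node: {} for node in adjacency}
    for anchor, (x, y) in anchor_positions.items():
        for node, hop in min_hop_counts(adjacency, anchor).items():
            tables[node][anchor] = (x, y, hop)
    return tables


# Hypothetical chain A1 - u1 - u2 - u3 - A2 with two anchor nodes.
adjacency = {
    "A1": ["u1"],
    "u1": ["A1", "u2"],
    "u2": ["u1", "u3"],
    "u3": ["u2", "A2"],
    "A2": ["u3"],
}
anchor_positions = {"A1": (0.0, 0.0), "A2": (40.0, 0.0)}
tables = build_hop_tables(adjacency, anchor_positions)
print(tables["u1"])
```

In a real deployment each node would update its own table as beacons arrive; the search above only reproduces the table contents that such a flood converges to.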

In the second step, the average size of one hop \( ( {\text{E}}_{{{\text{HopSize}}_{\text{i}} }} ) \) is calculated for each anchor node i with respect to the other anchor nodes, as in Eq. (1).

$$ {\text{E}}_{{{\text{HopSize}}_{\text{i}} }} = \left( {\sum\nolimits_{{{\text{i}} \ne {\text{j}}}} {\sqrt {({\text{x}}_{\text{i}} - {\text{x}}_{\text{j}} )^{2} + ({\text{y}}_{\text{i}} - {\text{y}}_{\text{j}} )^{2} } } } \right) /\left( {\sum\nolimits_{{{\text{i}} \ne {\text{j}}}} {{\text{h}}_{{{ \hbox{min} }_{\text{ij}} }} } } \right) $$
(1)

where (xi, yi) and (xj, yj) are the coordinates of anchor nodes i and j, and \( {\text{h}}_{{\min}_{\text{ij}}} \) is the minimum number of hops between anchor nodes i and j. Once the hop size is calculated, each anchor node broadcasts its hop size in the network by flooding. An unknown node ‘u’ (a node whose location is unknown) saves the first hop-size message that arrives and then forwards it to its neighbors. In this way, most nodes receive the hop size of the nearest anchor node. The distance (distua) between an unknown node ‘u’ and an anchor node ‘a’ is calculated as in Eq. (2).

$$ {\text{dist}}_{\text{ua}} = {\text{E}}_{{{\text{HopSize}}_{\text{i}} }} \times {\text{hop}}_{\text{ua}} $$
(2)

where \( {\text{E}}_{{{\text{HopSize}}_{\text{i}} }} \) is the hop size received from the anchor node i nearest to the unknown node ‘u’, and hopua is the minimum number of hops between anchor node ‘a’ and unknown node ‘u’.
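Equations (1) and (2) can be illustrated with a short Python sketch. The function names, the three anchor positions and the hop counts below are hypothetical values chosen so that the arithmetic is easy to follow; they are not taken from the simulation.

```python
import math


def hop_size(anchor, anchor_positions, anchor_hops):
    """Eq. (1): sum of distances to the other anchors / sum of minimum hop counts."""
    xi, yi = anchor_positions[anchor]
    total_distance = 0.0
    total_hops = 0
    for other, (xj, yj) in anchor_positions.items():
        if other == anchor:
            continue
        total_distance += math.hypot(xi - xj, yi - yj)
        total_hops += anchor_hops[anchor][other]
    return total_distance / total_hops


def estimated_distance(nearest_hop_size, hop_ua):
    """Eq. (2): hop size of the nearest anchor times the hop count to anchor a."""
    return nearest_hop_size * hop_ua


# Hypothetical anchors; anchor_hops[i][j] is the minimum hop count between i and j.
anchor_positions = {"A1": (0.0, 0.0), "A2": (30.0, 40.0), "A3": (60.0, 0.0)}
anchor_hops = {"A1": {"A2": 5, "A3": 6}}

size_a1 = hop_size("A1", anchor_positions, anchor_hops)  # (50 + 60) / (5 + 6)
dist_ua = estimated_distance(size_a1, 3)                 # node u is 3 hops from a
print(size_a1, dist_ua)
```

Note that the same hop size (that of the nearest anchor) is multiplied by the hop count to every anchor, which is one source of the localization error that DV-Hop variants try to reduce.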

In the final step, the polygon method is used to estimate the locations of the unknown nodes. Let (x, y) be the location of the unknown node u, (xi, yi) the location of the ith anchor node, and di the distance between the unknown node u and anchor node i. The distances of the unknown node u from the n anchor nodes then satisfy Eq. (3).

$$ \left\{ {\begin{array}{*{20}c} {(x - x_{1} )^{2} + (y - y_{1} )^{2} = d_{1}^{2} } \\ {(x - x_{2} )^{2} + (y - y_{2} )^{2} = d_{2}^{2} } \\ . \\ . \\ . \\ {(x - x_{n} )^{2} + (y - y_{n} )^{2} = d_{n}^{2} } \\ \end{array} } \right. $$
(3)

Subtracting the last equation from the first (and, in the same way, from each of the equations 2 to n − 1) gives the following equation:

$$ \left\{ {(x - x_{1} )^{2} - (x - x_{n} )^{2} } \right\} + \left\{ {\left( {y - y_{1} } \right)^{2} - \left( {y - y_{n} } \right)^{2} } \right\} = d_{1}^{2} - d_{n}^{2} $$
(4)

Equation (4) can be simplified as follows:

$$ 2{\text{x}}\left( {{\text{x}}_{1} - x_{n} } \right) + 2y\left( {{\text{y}}_{1} - y_{n} } \right) = x_{1}^{2} + y_{1}^{2} - x_{n}^{2} - y_{n}^{2} + {\text{d}}_{\text{n}}^{2} - {\text{d}}_{1}^{2} $$
(5)
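For reference, the expansion behind this simplification can be written out term by term. This is a standard algebraic step, shown here as an aid to the reader:

$$ (x - x_{1} )^{2} - (x - x_{n} )^{2} = - 2x(x_{1} - x_{n} ) + x_{1}^{2} - x_{n}^{2} , \qquad (y - y_{1} )^{2} - (y - y_{n} )^{2} = - 2y(y_{1} - y_{n} ) + y_{1}^{2} - y_{n}^{2} $$

Adding the two expansions, equating the sum to \( d_{1}^{2} - d_{n}^{2} \) and moving the constant terms to the right-hand side yields Eq. (5). The quadratic terms in x and y cancel, which is why the resulting system is linear in the unknown position.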

Writing Eq. (5) for each of the first n − 1 anchor nodes gives a linear system that can be represented in matrix form as in Eq. (6).

$$ {\text{AX}} = {\text{B}} $$
(6)

where A, B and X are given as:

$$ A = \left[ {\begin{array}{*{20}c} {2(x_{1} - x_{n} )} & {2(y_{1} - y_{n} )} \\ {2(x_{2} - x_{n} )} & {2(y_{2} - y_{n} )} \\ \vdots & \vdots \\ {2(x_{n - 1} - x_{n} )} & {2(y_{n - 1} - y_{n} )} \\ \end{array} } \right], \quad B = \left[ {\begin{array}{*{20}c} {x_{1}^{2} + y_{1}^{2} - x_{n}^{2} - y_{n}^{2} + d_{n}^{2} - d_{1}^{2} } \\ {x_{2}^{2} + y_{2}^{2} - x_{n}^{2} - y_{n}^{2} + d_{n}^{2} - d_{2}^{2} } \\ \vdots \\ {x_{n - 1}^{2} + y_{n - 1}^{2} - x_{n}^{2} - y_{n}^{2} + d_{n}^{2} - d_{n - 1}^{2} } \\ \end{array} } \right] $$

and \( {\text{X}} = \left[ {\begin{array}{*{20}c} x \\ y \\ \end{array} } \right] \).

Thus, the location of the unknown node u can be calculated using the least squares method as in Eq. (7).

$$ {\text{X}} = ({\text{A}}^{ '} {\text{A}})^{ - 1} {\text{A}}^{ '} {\text{B}} . $$
(7)

where \( {\text{A}}^{ '} \) represents the transpose of matrix A.
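The least squares solution can be sketched in plain Python. Because A has only two columns, the matrix A′A is 2 × 2 and can be inverted in closed form, so no linear algebra library is needed. The function name `locate`, the four anchor positions and the test point (3, 4) are assumptions for illustration; exact distances are used here, whereas in DV-Hop the distances would be the hop-based estimates of Eq. (2).

```python
import math


def locate(anchors, distances):
    """Solve X = (A'A)^-1 A'B for the 2-D position of the unknown node.

    anchors   -- list of (x_i, y_i); the last entry plays the role of anchor n
    distances -- list of d_i in the same order
    """
    (xn, yn), dn = anchors[-1], distances[-1]
    rows = []
    for (xi, yi), di in zip(anchors[:-1], distances[:-1]):
        a1 = 2.0 * (xi - xn)
        a2 = 2.0 * (yi - yn)
        b = xi**2 + yi**2 - xn**2 - yn**2 + dn**2 - di**2
        rows.append((a1, a2, b))

    # Entries of the normal equations (A'A) X = A'B.
    s11 = sum(a1 * a1 for a1, _, _ in rows)
    s12 = sum(a1 * a2 for a1, a2, _ in rows)
    s22 = sum(a2 * a2 for _, a2, _ in rows)
    t1 = sum(a1 * b for a1, _, b in rows)
    t2 = sum(a2 * b for _, a2, b in rows)

    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear or too few; A'A is singular")
    x = (s22 * t1 - s12 * t2) / det
    y = (s11 * t2 - s12 * t1) / det
    return x, y


# Hypothetical anchors and an unknown node at (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_position = (3.0, 4.0)
distances = [math.dist(true_position, a) for a in anchors]
x, y = locate(anchors, distances)
print(round(x, 6), round(y, 6))
```

At least three non-collinear anchors are needed for A′A to be invertible; with noisy distance estimates, additional anchors help average out the error.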

3.2 Cluster Formation and Cluster Head Selection

All the nodes calculate their locations using the technique discussed in the previous section. These locations are used to form the clusters and select the cluster heads. In this protocol, not all nodes take part in the cluster formation process, in order to save energy. Only the nodes that sense an event are used to form the cluster; these are known as active nodes. All the active nodes send their locations to the other active nodes. Let the locations of the active nodes be (xi, yi), where i = 1, 2, …, n. The location of the centroid (xc, yc) of these active nodes is calculated as:

$$ {\text{x}}_{\text{c}} = \frac{{\mathop \sum \nolimits_{{{\text{i}} = 1}}^{\text{n}} {\text{x}}_{\text{i}} }}{n} $$
(8)
$$ {\text{y}}_{\text{c}} = \frac{{\mathop \sum \nolimits_{{{\text{i}} = 1}}^{\text{n}} {\text{y}}_{\text{i}} }}{n} $$
(9)

The cluster is formed as a circle whose radius is the distance between the centroid (xc, yc) and the farthest active node, so that the circle includes all the active nodes. The cluster head should be the node that is nearest to the centroid and has more remaining energy than the other members of the cluster. To this end, every member node calculates its competition bid value (CV) to compete as a candidate in the cluster head selection process, as in Eq. (10).

$$ {\text{CV}} = {\text{k}} \times \frac{{{\text{E}}_{\text{Ri}} }}{{{\text{d}}_{\text{ci}} }} , \forall i = 1,2, \ldots ,n $$
(10)

where n is the number of member nodes in the cluster, ERi is the remaining energy of the ith node, dci is the distance of node i from the centroid and k is a constant. Each node sends its CV value to the other member nodes of the cluster. The node with the highest CV value declares itself the cluster head.
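Equations (8) to (10) can be combined into a short Python sketch of event-based cluster formation. The function names, the node positions and the energy values are hypothetical, and k = 1 is an arbitrary choice. The small lower bound on the distance is an added assumption to avoid division by zero when a node sits exactly on the centroid; the text does not cover that case.

```python
import math


def form_cluster(active_positions):
    """Eqs. (8)-(9): centroid of the active nodes and the enclosing radius."""
    n = len(active_positions)
    xc = sum(x for x, _ in active_positions.values()) / n
    yc = sum(y for _, y in active_positions.values()) / n
    radius = max(math.hypot(x - xc, y - yc) for x, y in active_positions.values())
    return (xc, yc), radius


def select_cluster_head(active_positions, residual_energy, k=1.0):
    """Eq. (10): the node with the highest CV = k * E_Ri / d_ci becomes head."""
    (xc, yc), _ = form_cluster(active_positions)

    def competition_value(node):
        x, y = active_positions[node]
        d_ci = math.hypot(x - xc, y - yc)
        # Assumption: clamp the distance so a node on the centroid does not
        # cause a division by zero.
        return k * residual_energy[node] / max(d_ci, 1e-9)

    return max(active_positions, key=competition_value)


# Hypothetical active nodes that sensed the same event.
active_positions = {
    "a": (0.0, 0.0), "b": (4.0, 0.0), "c": (0.0, 4.0),
    "d": (4.0, 4.0), "e": (2.0, 1.0),
}
residual_energy = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.5}

centroid, radius = form_cluster(active_positions)
head = select_cluster_head(active_positions, residual_energy)
print(centroid, round(radius, 3), head)
```

In this example node e wins although it has the least energy, because it lies much closer to the centroid than the others; the ratio in Eq. (10) trades residual energy against centrality.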

3.3 Greedy Forwarding of Aggregated Data to the Sink Node

The goal here is to forward the aggregated data to the sink node in an energy efficient way while reducing the void problem. This is achieved using the greedy forwarding technique. The network is represented by a set of sensor nodes N = {Ni | ∀ i}. The locations of the nodes in N are represented by the set P = {PNi | PNi = (xNi, yNi), ∀ i}. D = {D(PNi, R) | ∀ i} is the set of closed disks defining the transmission ranges of the nodes in N, where D(PNi, R) = {x | ||x − PNi|| ≤ R, ∀ x ∈ ℝ²}. The center of each closed disk is PNi, and R is the radius of the transmission range of each node Ni. Hence the network can be modeled as a unit disk graph (UDG) G(P, E), where the edge set is E = {Eij | Eij = (PNi, PNj), PNi ∈ D(PNj, R), ∀ i ≠ j}. The neighbor table of each Ni is defined as:

$$ {\text{T}}_{\text{Ni}} = \, \{ \left[ {{\text{ID}}_{\text{Nk}} ,{\text{P}}_{\text{Nk}} } \right] \, |{\text{ P}}_{\text{Nk}} \in \,{\text{D}}\left( {{\text{P}}_{\text{Ni}} ,{\text{R}}} \right),\forall \,{\text{k }} \ne {\text{ i}}\} $$

where IDNk is the designated identification number of node Nk. The greedy forwarding algorithm assumes that the source node NS is aware of the location of the destination node ND. The next hop for the data packet is selected from TNS. The selected node must satisfy two conditions: (1) it has the shortest distance to the destination node ND among the nodes in TNS, and (2) it is closer to ND than NS is. The process continues until the destination is reached. In this technique, the void problem arises when a forwarding node has no suitable neighbor to which it can forward the data packet. All incoming packets are then dropped at that node.
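The neighbor table and the two next-hop conditions can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the node coordinates and the transmission range R = 10 are made up for the example, and `None` is used as a hypothetical marker for the void case.

```python
import math


def neighbour_table(node, positions, R):
    """T_Ni: every other node whose position lies within the closed disk of radius R."""
    here = positions[node]
    return {
        other: p for other, p in positions.items()
        if other != node and math.dist(here, p) <= R
    }


def greedy_next_hop(node, dest, positions, R):
    """Return the greedy next hop, or None if the node faces a void."""
    table = neighbour_table(node, positions, R)
    target = positions[dest]
    # Condition (1): neighbour with the shortest distance to the destination.
    best = min(table, key=lambda k: math.dist(table[k], target), default=None)
    # Condition (2): it must be strictly closer to the destination than we are.
    if best is None or math.dist(table[best], target) >= math.dist(positions[node], target):
        return None
    return best


# Hypothetical topology in which greedy forwarding succeeds.
positions = {"S": (0.0, 0.0), "A": (8.0, 3.0), "B": (7.0, -6.0), "D": (20.0, 0.0)}
hop = greedy_next_hop("S", "D", positions, R=10.0)

# Hypothetical topology with a void: both neighbours are farther from D than V.
void_positions = {"V": (0.0, 0.0), "W": (-3.0, 8.0), "Y": (-3.0, -8.0), "D": (20.0, 0.0)}
void_hop = greedy_next_hop("V", "D", void_positions, R=10.0)
print(hop, void_hop)
```

Each decision uses only the local neighbor table and the destination position, which is why greedy forwarding adds no routing overhead but can also get stuck at a local minimum.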

The proposed LAVCA protocol is designed so that the void problem can be resolved, using the rolling ball concept. The technique is depicted in Fig. 3, where the source node NS wants to send a packet to the destination node ND. NS chooses N1 as the next hop node according to the greedy forwarding algorithm. The void problem occurs at node N1. To solve it, a circle is formed with center s1 and radius R/2, where R is the transmission range. The circle is hinged at N1 and rolls counterclockwise until its boundary encounters a node (N4 in the example of Fig. 3). The data packet is thus moved from N1 to N4, where a new circle of equal size is formed, centered at s2 and hinged at node N4. The counterclockwise rolling procedure finds node N5 as the next hop node. The process repeats until node N7 is reached, which is closer to the destination node ND than N1 is. At node N7, the conventional greedy forwarding scheme is resumed. The resulting path is thus NS, N1, N4, N5, N6, N7, N8 and ND. The algorithm of this forwarding technique is as follows:

Fig. 3. Construction of routing path with resolving void problem
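One rolling step of the ball can be sketched geometrically in Python. This is an illustrative sketch under explicit assumptions, not the algorithm listing of the paper. The initial center s1 is assumed to lie at distance R/2 from the stuck node in the direction of the destination; when the node faces a void and the destination is at least R away, that starting ball contains no neighbor. A ball of radius R/2 hinged at the node first touches a neighbor at distance d and bearing α when the bearing φ of its center satisfies cos(α − φ) = d/R, approached from the clockwise side, so the first node hit is the one with the smallest counterclockwise sweep. The function name and coordinates are hypothetical.

```python
import math


def rolling_ball_next_hop(node, dest, positions, R):
    """Return the first neighbour touched by a ball of radius R/2 hinged at
    `node` and rolled counterclockwise, or None if the node has no neighbour."""
    px, py = positions[node]
    dx, dy = positions[dest]
    # Assumption: the ball starts with its centre on the ray towards the destination.
    start = math.atan2(dy - py, dx - px)

    best, best_sweep = None, None
    for other, (qx, qy) in positions.items():
        d = math.hypot(qx - px, qy - py)
        if other == node or d == 0.0 or d > R:
            continue  # only nodes within transmission range can be touched
        alpha = math.atan2(qy - py, qx - px)
        # Centre bearing at which the ball boundary first reaches this node.
        hit = alpha - math.acos(d / R)
        sweep = (hit - start) % (2.0 * math.pi)
        if best_sweep is None or sweep < best_sweep:
            best, best_sweep = other, sweep
    return best


# Hypothetical void: V has two neighbours, both farther from D than V itself.
positions = {"V": (0.0, 0.0), "W": (-3.0, 8.0), "Y": (-3.0, -8.0), "D": (20.0, 0.0)}
next_hop = rolling_ball_next_hop("V", "D", positions, R=10.0)
print(next_hop)
```

In a full forwarding loop this step would be repeated, with the ball re-hinged at each selected node, until a node closer to the destination than the original stuck node is reached and greedy forwarding can resume.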

4 Simulation Result

The performance of the LAVCA protocol is simulated using NS2 and compared with CASER [1], EEHC [13] and DEEC [21]. The simulation parameters are listed in Table 1.

Table 1. List of parameters

A round is the process in which the cluster head collects data from the member nodes of its cluster, aggregates and encrypts it, and forwards it to the sink node. A node whose energy falls below a threshold value is known as a dead node. The number of dead nodes is obtained after the completion of each round. Figure 4 traces the increase in the number of dead nodes over six rounds. The existing routing schemes CASER, EEHC and DEEC are also simulated to obtain their numbers of dead nodes. CASER uses a grid-based routing protocol in which the next adjacent grid is selected based on a probability value, calculated from the average residual energy of the grid. DEEC selects a node as cluster head using a probability value based on the ratio of the residual energy of the node to that of the total network. It also predicts that an equal amount of energy will be lost in each round. EEHC likewise considers the residual energy of each node as the only parameter for selecting the cluster head. The proposed protocol LAVCA forms the cluster as a circle and chooses a cluster head near the centroid of that cluster. The remaining energies of the member nodes near the centroid are also considered. The selected cluster head therefore has more residual energy and is located in a well-connected position. LAVCA finds the locations of the nodes by the DVHop method instead of obtaining them from GPS, which decreases the energy depleted at the time of cluster formation. Hence Fig. 4 shows a better result for LAVCA, as more nodes die in CASER, EEHC and DEEC over the same number of rounds.

Fig. 4. Number of dead nodes vs. number of rounds

Throughput is the number of packets successfully delivered to the base station. As the load increases with time, the throughput also increases. After a certain amount of time the network becomes congested, which decreases the throughput. Figure 5 shows that LAVCA controls congestion more efficiently, and its decrease in throughput is smaller than that of CASER, EEHC and DEEC.

Fig. 5. Throughput vs. load

LAVCA also works better in a dense network. Initially, a dense network takes slightly more time to create the clusters and select the cluster heads. Once all the clusters are formed, LAVCA gives a better delivery rate than CASER, EEHC and DEEC, as shown in Fig. 6.

Fig. 6. Density vs. delivery rate

5 Conclusion

In this paper, the proposed protocol LAVCA uses the DVHop technique instead of GPS to find the locations of the nodes. This approach reduces both the deployment cost and the energy consumed in finding the locations of the nodes after deployment. In LAVCA, event-based cluster formation reduces data redundancy and improves energy efficiency. The greedy forwarding technique is used to forward the aggregated data from the cluster head to the sink node. The void problem is minimized using the rolling ball technique, which increases the delivery rate. The simulation results show that LAVCA performs better than CASER, EEHC and DEEC in terms of the number of dead nodes, throughput and delivery rate.