1 Introduction

The basic assumption of mobile ad hoc networks (MANETs) is the existence of an end-to-end path from source to destination [1, 2]. Unfortunately, the application of MANETs in the scenarios of sparse node distribution does not satisfy this assumption. Delay-tolerant networks are a class of emerging wireless networks, in which most of the time there does not exist an end-to-end path from source to destination [3]. Many real networks fall into this category, such as wildlife tracking and habitat monitoring sensor networks [4], vehicular sensor networks [5], underwater sensor networks [6], satellite networks [7], military networks [8], and so on. Therefore, sparse MANETs are classified into DTNs to obviate the gap between MANETs research and real applications. In some of DTNs, for examples, pocket switch networks and vehicular sensor networks, the behaviors of their nodes are controlled by human beings. Consequently, the mobility of nodes in these DTNs is the same as that of humans, and the contact feature between nodes also matches up with the relationship between humans. Social network represents the relationships between humans. Thus, the routing problem in DTNs can be regarded as a message dissemination challenge in social networks. Social network consists of various communities [1, 9, 10]. A community is a group of individuals who have higher contact ratio among themselves than strangers do. Moreover, individuals are more inclined to share information with others that come from the same community. Generally, this phenomenon is called clustering in [1, 1012]. For the application of DTNs in social network scenarios, community awareness should be helpful for message dissemination. In reality, a community consists of the individuals who dwell in the same region or those who have common interests. In this paper, the notion refers to the former case.

The challenge on multicast routing is that it is difficult to find suitable relays for a group of specific receivers for DTNs. In this paper, multicasting in DTNs is treated from the social network perspective. Pan et al. [12] focuses on two key social metrics: Community and Centrality, as are also concerned in our work. What are more, we concern whether the source and destination come from the same community or not as well. Specially, we also pay attention to contact probability, the historical contact information between a node and its neighbors, in our work. The main contributions of this paper are:

  1. (1)

    Community aware forwarding schemes. In consideration of clustering phenomenon, we propose a hierarchical multicast routing algorithm (ENCAR) for the application of DTNs in social scenarios. Because social based mobility models have community structure, the multicasting schemes in our work are community aware. For a message, ENCAR forwards it with two cases differently: (A) the source and destination come from the same community; and (B) the source and destination come from different communities. For case A, the dissemination of messages is restrained in the community, while ENCAR takes its priority to deliver messages to their destination communities for case B.

  2. (2)

    Novel utility function based on contact probability and centrality analysis. Unlike some social-based routing algorithms which only focus on centrality analysis, ENCAR also takes into consideration contact ratio and the future contact probability between a node and its neighbors. The performances of ENCAR are theoretically analyzed in terms of message delivery ratio, end-to-end delay and average hops, the influence of contact ratio is revealed on routing performance and also verified by the simulation results.

  3. (3)

    Community aware mobility model. To simulate the mobility of individuals in social networks, we develop the random waypoint mobility model (RWP) and propose a community aware version, called Community Based Random Way Point Mobility Model(CB_RWP). In CB_RWP, its mechanism can simulate clustering phenomenon. Moreover, we prove that the encounter process of nodes is a Poisson process, and the inter-contact time between any pair of nodes follows an exponential distribution in CB_RWP. This is further validated by \(\chi ^{2}\) test in our simulation.

The rest of this paper is organized as follows: Sect. 2 reviews the existing multicast routing algorithms and social network based routing algorithms for DTNs, and introduces the analysis methods of social networks; Sect. 3 describes the construction of egocentric network, utility based multicasting strategy and ENCAR in detail; Sect. 4 theoretically analyzes the performance of ENCAR in terms of message delivery ratio, average end-to-end delay and average hop count; Sect. 5 describes CB_RWP and shows simulation results. Finally, this paper is concluded in Sect. 6.

2 Related work

2.1 Multicasting and social network based routing algorithms

Pioneer work on the multicasting problem in DTNs is studied by Zhao et al. [14]. Three new semantic models, including Temporal Membership (TM), Temporal Delivery (TD) and Current-Member Delivery (CMD), are defined in [14]. Ye et al. [15] introduce three basic multicasting strategies for DTNs, which are unicast-based multicast (U-Multicast), static-tree-based multicast (ST-Multicast) and dynamic-tree-based multicast (DT-Multicast). U-Multicast is the simplest multicasting strategy. However, to ensure delivery ratio, lots of message duplicates will be created, and routing overhead will be heavy. In ST-Multicast, for a certain message, its source node sets up a multicasting tree according to collected global information, and the tree will not be adjusted during the process of message dissemination. The topology of DTNs always varies with time, so the multicasting tree may be disrupted. In DT-Multicast, global information is constantly collected by relay nodes to update multicasting tree. Compared to U-Multicast and ST-Multicast, DT-Multicast is more adaptive to topology variation. However, in DT-Multicast, it is difficult for nodes to collect global information, and the routing overhead is larger. Therefore, DT-Multicast is inappropriate for real scenarios despite of its good performance in simulation.

An encounter-based multicasting scheme called EBMR is proposed in [16]. EBMR is a delivery-predictability scheme, it allows nodes to cache the messages until a good next-hop node can be found to relay them to the destinations, it uses fewer hops for message delivery and can achieve higher delivery ratio while maintaining high data transmission efficiency. Le et al. [17] propose a two-level single-copy multicast routing strategy to optimize both the computing resource usage and the delivery ratio. This scheme minimizes the transmission cost by bundling multiple multicast receivers into a single copy of the messages, and forwarding it to an encounter node that has high delivery probabilities to those multicast receivers. Recently, in [18], a novel multicast polymorphic epidemic routing scheme is presented. This scheme exploits the sociality-aided adaptive infection recovery strategies, and can increase the probability that information is delivered to the intended destinations while reducing the overhead of message replication.

Some works treat the routing in DTNs from a social network perspective. Social network analysis is introduced in these works to measure the centrality of nodes. SimBet [11] is a utility based unicast algorithm. Its utility function is constructed based on similarity, betweenness and tie strength analysis. In SimBet, message carrier selects next hop after comparing its utility value with that of encounter nodes. Gao et al. [13, 19] introduce contact feature in social networks: the inter-contact time between any pair of nodes follows an exponential distribution for routing. Setting the expected message delivery ratio, Gao et al. map the routing to the knapsack problem, that is, how to achieve the expected delivery ratio with the minimum number of relay nodes. By solving the knapsack problem, the best forwarding path for a certain message can be obtained. In their single-data multicasting (SDM), this strategy achieves well performed results. Li et al. [9] evaluate the influence of node selfishness on the performance of two-hop relaying and epidemic relaying in DTNs. For a specific node, two categories of selfishness are defined: individual selfishness and social selfishness. Simulation results show that routing scheme performance varies with node selfish behavior, and epidemic relaying is more susceptible than two-hop relaying. Zhang et al. [20] regard the routing problem in DTNs as a message dissemination challenge between social communities. Interests, environment and background constitute the attribute of a node, and are recorded in a vector. In their opinion, similarity between a pair of nodes is just the closeness of their attribute vectors. Xiao et al. [21] propose a distributed optimal community-aware opportunistic routing (CAOR) algorithm. CAOR builds home-aware communities, each home only forwards its messages to the node in its optimal relay set, it outperforms Bubble Rap and SimBet in terms of delivery rate and average delay. The surveys of social-based routing in DTNs are given in [22, 23]. From the above statement, we can see that most of the social network based routing algorithms focus on unicasting for DTNs, and these schemes also take rarely community awareness into consideration for multicasting even though social based mobility models have community structure [31].

2.2 Social network analysis

2.2.1 Centrality measures

Centrality indicates the importance of a node in a social network. Generally, there exist the following centralities: (1) Degree centrality, which is measured as the number of direct ties that involve a given node; (2) Closeness centrality. It is evaluated on the geodesic distance \(d(v_{i},v_{j})\), that is, the shortest indirect path from node \(v_{i}\) to \(v_{j}\); (3) Similarity centrality, which indicates the future collaborations between a pair of nodes. The similarity centrality between \(v_{i}\) and \(v_{j}\) can be estimated as \(C_{s}(v_{i},v_{j})=\frac{N(v_{i})\cap N(v_{j})}{N(v_{i})\cup N(v_{j})}\), where \(N(v_{i})\) and \(N(v_{j})\) are the sets of the neighbors of \(v_{i}\) and \(v_{j}\) respectively; (4) Betweenness centrality. Sociocentric betweenness reflects the intermediary location of a node along indirect relationships linking other nodes. The sociocentric betweenness of \(v_{i}\) is \(C_{B}(v_{i})=\sum \nolimits _{j=1}^{n}\sum \nolimits _{k=1}^{j-1}\frac{g_{jk}(v_{i})}{g_{jk}}\), where \(g_{jk}\) is the sum of geodesic paths linking \(v_{j}\) and \(v_{k}\), \(g_{jk}(v_{i})\) is the number of those geodesic paths that include \(v_{i}\). More detailed description of these centralities can be found in [23, 24].

A node with high betweenness has the capacity of facilitating or limiting the interaction between the nodes it links [12, 25]. Betweenness is always adopted as a metric to measure the ability of nodes. However, from the above description, it can be learnt that global information about the network is needed to estimate node sociocentric betweenness. The collection of global information is a difficulty for the nodes in sparse social networks, thus, sociocentric betweenness is inappropriate for practical application. To address this issue, egocentric network and egocentric betweenness are proposed. Egocentric networks are defined as networks consisting of an actor (ego node) together with its alters (neighbors), and all the links among these nodes. The distance from the actor to any alter is one hop. As a result, egocentric betweenness is the betweenness centrality of ego in an ego network. Some works compare egocentric betweenness with sociocentric betweenness in different networks and conclude that egocentric betweenness has a significant linear correlation with sociocentric betweenness. Everett and Borgatti [26] discusses the relationship between egocentric and sociocentric betweenness in detail, and concludes that sociocentric betweenness can be replaced by egocentric betweenness in practical applications. The egocentric betweenness of node \(v_{i}\) is denoted as \(C_{B}^{'}(v_{i})\), and its estimation is given in [25]. It is clear that global information is not needed during the estimation.

2.2.2 Node mobility

DTNs are characterized by intermittent connectivity where communication largely depends on the mobility pattern of the participating nodes and mobility influences the performance of opportunistic routing algorithms [31]. Gao et al. [13] perform an experiment to study the mobility of nodes in social network scenarios. Two experimental traces collected from realistic social network scenarios are adopted. Each trace records the contacts in corporate environments. Users who carry bluetooth devices periodically discover their peers and record contacts. The chosen traces cover a large diversity of environments, from university campuses (MIT Reality) to conference sites (Infocom), with experimental periods from a few days (Infocom) to several months (MIT Reality). Finally, Gao et al. observe that for most of the contacted node pairs in the two selected traces, the pairwise inter-contact time is exponentially distributed. To validate this observation, Gao et al. do \(\chi ^2\) hypothesis test on each node pair. The results show that most of the node pairs pass the test under different significance levels. Thus, Gao et al. conclude that the encounter process between a pair of nodes is Poisson process and the inter-contact time t is exponentially distributed, i.e., \(f(t)=\lambda \times e^{-\lambda \cdot t}\), where \(\lambda\) is the contact ratio for a pair of nodes and can be estimated as \(\lambda = \frac{n}{\sum \nolimits _{i=1}^{n}T_{i}}\), \(T_{1}, T_{2}, \ldots , T_{n}\) are the inter-contact time samples. Consequently, for a specific pair of nodes, the contact probability during T can be represented as \(F(T) = 1 - e^{-\lambda \cdot T}\).

Since numerical studies have focused on the inter-contact time patterns between encounter nodes in mobile ad hoc networks(MANETs) [27], vehicular ad hoc networks(VANETs) [28], and DTNs [29, 30], in which the authors all find that the empirical distributions of inter-contact times tend to conform with well-fitted exponential curves, which lay the foundation of our algorithm and validate the start point of our mechanism.

3 ENCAR: Egocentric network focused community aware multicast routing

In this section, we propose an egocentric network focused community aware multicast routing (ENCAR), which is for the application of DTNs in social network scenarios. As mentioned before, centrality indicates the importance of a node in a social network, and a node with higher centrality is more popular [32]. ENCAR is utility based, and its utility function involves social network centrality analysis and contact probability between two nodes. In multicasting, a message has a group of receivers, which are randomly distributed in different communities. Since clustering phenomenon exists in social networks, community awareness is also considered in ENCAR. Here, there are two forwarding strategies to deal with the cases where the source and destination of a message come from the same community and different communities individually. In ENCAR, a message carrier searches its forwarding paths in its egocentric network. The quality of a forwarding path is measured by the contact probability from message carrier to its corresponding destination. For a specific message, the centrality of its carrier and contact probability of the best forwarding path constitute the utility value. By comparing the utility value of its carrier with that of encounter nodes, the message carrier determines whether it selects the encounter node as the next hop or not. In this paper, similarity and egocentric betweenness are utilized as centrality measures. Firstly, the construction of egoentric network for a specific node is described, and secondly, the utility based multicast forwarding scheme is analyzed. Finally, the forwarding strategy of ENCAR is described in detail.

Some notations used in this paper are summarized in Table 1.

Table 1 Description of some notations

3.1 Construction of egocentric network

In ENCAR, similarity and egocentric betweenness are adopted to set up the utility function. Consequently, the construction of egocentric network becomes the prerequisite. From [25], it can be learnt that the egocentric network around an actor contains not only the contacts between the actor and alters, but also the contacts between those alters. For a node in DTNs, it can derive the information about its neighbors (alters), as they have encountered before. However, obtaining the contact information between its neighbors is a challenge for the node. To address this issue, we propose the concepts of primary neighbors (PNs) and secondary neighbors (SNs) of a node. For node \(v_{i}\), its PNs are the nodes that have encountered \(v_{i}\), while its SNs are the nodes that have contacted its PNs. So, the neighbors of \(v_{i}\) can be represented as \(N(v_{i})=\{N_{P}(v_{i})\bigcup N_{S}(v_{i})\}\), where \(N_{P}(v_{i})\) is the PNs set of \(v_{i}\), \(N_{S}(v_{i})\) is the SNs set of \(v_{i}\). Shown as Fig. 1, \(N_{P}(v_{i}) = \{v_{1}, v_{2}, v_{3}, v_{4}, v_{10}\}\), \(N_{S}(v_{i}) = \{N_{P}(v_{1}), N_{P}(v_{2}), N_{P}(v_{3}), N_{P}(v_{4}), N_{P}(v_{10})\} = \{\{v_{4}, v_{5}, v_{10}\}, \{v_{6}, v_{7}, v_{10}\}, \{v_{9}\}, \{v_{1}, v_{5}, v_{8}\}, \{v_{1}, v_{2}\}\}\).

Correspondingly, we can divide the information that \(v_{i}\) owns into two parts: (A) Primary Information (PI), which is a set of contact ratio between \(v_{i}\) and its PNs; (B) Secondary Information (SI), which is a set of contact ratio between PNs and SNs of \(v_{i}\). Table 2 shows the PI of \(v_{i}\). When \(v_{i}\) encounters \(v_{1}\), they exchange PI. After \(v_1\) transmits its PI to \(v_i\), \(v_i\) adds PI derived from \(v_{1}\) into its SI. Similarly, the same thing goes for \(v_1\). According to its PI and SI, \(v_{i}\) can set up the egocentric network with itself acting as the ego node. It can be learnt that nodes in the egocentric network except \(v_i\) all come from PNs.

Fig. 1
figure 1

The neighbors of node \(v_{i}\)

Table 2 The PI of node \(v_{i}\)

Theoretically, \(v_{i}\) can derive all the SI of \(v_{1}\). However, it is not necessary in our view. Firstly, \(v_{i}\) owns enough information to organize the egocentric network; secondly, the value of information is constrained by time. With the increase of intermediate nodes, the value of information decreases, which is widespread in social networks.

3.2 Utility based multicast routing strategy

Multicasting aims to deliver a message to a group of receivers. This is a challenge in DTNs because of the deficiency of end-to-end paths. Specially, the utility based multicast routing strategy has attracted a lot of attention [3335]. Destination awareness is utilized in this forwarding scheme to guide message dissemination. Figure 2 illustrates the details of this forwarding scheme. Let \(v_{s}\) denote the source node of message M, whose targeted receivers are \(V_{d}(M)=\{v_{d1}, v_{d2}, \ldots , v_{d6}\}\). When \(v_{s}\) encounters \(v_{a}\), for each destination in \(V_{d}(M)\), \(v_{s}\) compares its utility value with that of \(v_{a}\) to determine whether \(v_{a}\) is a suitable relay. For a certain destination, only when the utility value of \(v_{a}\) is larger than that of \(v_{s}\), \(v_{s}\) selects \(v_{a}\) as the relay node for the destination. If \(v_{a}\) is the suitable relay from the perspective of destinations \(v_{d1}\) to \(v_{d5}\), \(v_{s}\) will forward a copy of M (denoted as \(M_{a}\)) to \(v_{a}\), initialize the destination set of \(M_{a}\) as \(V_{d}(M_{a})=\{v_{d1}, v_{d2}, \ldots , v_{d5}\}\), and adjust the destination set of M to \(V_{d}(M)=\{v_{d6}\}\). At the same time, \(v_{a}\) adds \(M_{a}\) into its forwarding list \(S_{f}(a)\). For \(v_{d6}\), if \(v_{b}\) is an ideal relay, \(v_{s}\) will forward a copy of M (denoted as \(M_{b}\)) to \(v_{b}\), set \(V_{d}(M_{b})\) to \(V_{d}(M_{b})=\{v_{d6}\}\), and adjust \(V_{d}(M)\) to \(V_{d}(M)=\{\}\). \(v_{b}\) adds \(M_{b}\) into its forwarding list \(S_{f}(b)\). Once appropriate relay nodes are found for all the destinations of M, M is deleted from the forwarding list of \(v_{s}\). The above process is repeated hop-by-hop until all the targeted destinations receive a copy of M. The detailed description of the utility based multicasting strategy is given in Algorithm 1.

figure g
Fig. 2
figure 2

Utility based multicasting strategy

3.3 Forwarding schemes of ENCAR

From the description of utility based multicast forwarding scheme, it can be seen that the estimation of utility value is a key factor. Focusing on the application of DTNs in social network scenarios, we adopt social network centrality analysis and contact probability between nodes to set up the utility function, and propose an egocentric network focused community aware multicast routing (ENCAR). Considering clustering phenomenon, ENCAR differently copes with the cases where the source and destination of a message come from the same community and different communities. If \(v_{s}\) is the source node of message M, \(v_{d}\) denotes one of the destination nodes, \(C_{m}(v_{s})\) represents the community that \(v_{s}\) comes from, and \(C_{m}(v_{d})\) is the community that \(v_{d}\) comes from. The forwarding schemes of ENCAR are as follows:

  • Scheme A: \(C_{m}(v_{s})=C_{m}(v_{d})\)

According to clustering phenomenon, the relationship between individuals that come from the same community is more intimate than those that come from different communities. This leads to the unequal contact opportunities between these individuals. ENCAR is community aware, therefore, if \(C_{m}(v_{s})=C_{m}(v_{d})\), from the perspective of \(v_{d}\), the dissemination of M should be restrained in \(C_{m}(v_{s})\). That is, all the relay nodes should come from \(C_{m}(v_{s})\). As is mentioned at the beginning of this section, message carrier searches the best forwarding path in its egocentric network, and the quality of a forwarding path can be measured by the contact probability along this path. Similarity and egocentric betweenness are also adopted to set up the utility function. However, for a specific node, we notice that the calculation of similarity and egocentric betweenness is based on its neighbor set, which may contain the nodes coming from other communities. Thus, if the dissemination of M is restricted in \(C_{m}(v_{s})\), the egocentric network around a message carrier should be modified into one consisting of nodes that come from \(C_{m}(v_{s})\). To distinguish between them, we call the modified egocentric network as the intra-community egocentric network, and the former one as the inter-community egocentric network.

Let \(v_{A1}\), \(v_{A3}\), and \(v_{A8}\) denote the nodes coming from community A, \(v_{A3}\) be a destination of message M, and \(v_{A1}\) be the current message carrier. If \(v_{A1}\) encounters \(v_{A8}\), \(v_{A1}\) needs to determine whether \(v_{A8}\) is a suitable relay for M from the perspective of \(v_{A3}\). According to the intra-community egocentric network around \(v_{A1}\), let \(\mathbb {N}_{P}^{'}(v_{A1})\) be the set of primary neighbors (PNs) of \(v_{A1}\) that only come from \(C_{m}(v_{s})\) (obviously, \(\mathbb {N}_{P}^{'}(v_{A1}) \subseteq N_{P}(v_{A1})\)). Similarly, let \(\mathbb {N}_{P}^{'}(v_{A8})\) be the set of PNs of \(v_{A8}\) coming from \(C_{m}(v_{s})\). There are the following cases:

  1. 1.

    if \(v_{A3}\in \mathbb {N}_{P}^{'}(v_{A1})\wedge v_{A3}\notin \mathbb {N}_{P}^{'}(v_{A8})\), \(v_{A1}\) retains M;

  2. 2.

    if \(v_{A3}\notin \mathbb {N}_{P}^{'}(v_{A1})\wedge v_{A3}\in \mathbb {N}_{P}^{'}(v_{A8})\), \(v_{A1}\) selects \(v_{A8}\) as the next hop without utility value comparison;

  3. 3.

    if \((v_{A3}\in \mathbb {N}_{P}^{'}(v_{A1})\wedge v_{A3}\in \mathbb {N}_{P}^{'}(v_{A8}))\quad \vee \quad (v_{A3}\notin \mathbb {N}_{P}^{'}(v_{A1})\wedge v_{A3}\notin \mathbb {N}_{P}^{'}(v_{A8}))\), \(v_{A1}\) compares its utility value with that of \(v_{A8}\).

The forwarding strategies for case (1) and (2) are obvious and they do not need to be discussed. For case (3), we describe the forwarding strategy for \(v_{A3}\in \mathbb {N}_{P}^{'}(v_{A1})\wedge v_{A3}\in \mathbb {N}_{P}^{'}(v_{A8})\) firstly. The intra-community egocentric network around \(v_{A1}\), \(v_{A8}\) is shown as Fig. 3. It can be observed that there are 4 loop-free paths from \(v_{A1}\) to \(v_{A3}\). They are \(r_{1} = \{v_{A1}, v_{A3}\}\), \(r_{2} = \{v_{A1}, v_{A9}, v_{A3}\}\), \(r_{3} = \{v_{A1}, v_{A4}, v_{A9}, v_{A3}\}\), and \(r_{4} = \{v_{A1}, v_{A6}, v_{A4}, v_{A9}, v_{A3}\}\). The question is: which one is the best? Though the hop count of path \(r_{1}\) is the least, from the perspective of contact probability from \(v_{A1}\) to \(v_{A3}\), \(r_{1}\) may not be the best forwarding path. With the contact probability between two nodes as the metric, the best forwarding path is searched in the egocentric network of a node. Let \(T_{tl}\) be the time to live of message M. For path \(r_{2}\), \(v_{A1}\) needs to wait for \(t_{1,9}\) to encounter \(v_{A9}\), and \(v_{A9}\) needs to wait for \(t_{9,3}\) to encounter \(v_{A3}\), thus, the delay along \(r_{2}\) can be represented as \(t_{r_{2}}=t_{1,9}+t_{9,3}\), which should satisfy \(t_{r_{2}}\le T_{tl}\). Let \(\lambda _{1,9}\) denote the contact ratio between \(v_{A1}\) and \(v_{A9}\), \(\lambda _{9,3}\) be the contact ratio between \(v_{A9}\) and \(v_{A3}\). According to the node mobility mentioned in Sect. 2.2, the inter-contact time between a pair of nodes follows an exponential distribution. Thus, the PDF (Probability Distribution Function) of \(t_{1,9}\) is \(f(t_{1,9})=\lambda _{1,9}\times e^{-\lambda _{1,9}\cdot t_{1,9}}\), while the PDF of \(t_{9,3}\) is \(f(t_{9,3})=\lambda _{9,3}\times e^{-\lambda _{9,3}\cdot t_{9,3}}\). Clearly, the PDF of \(t_{r_{2}}\) can be formulated as

$$\begin{aligned} \begin{aligned} f(t_{r_{2}})&=f(t_{1,9})\otimes f(t_{9,3}) \\&=\frac{\lambda _{9,3}}{\lambda _{9,3}-\lambda _{1,9}}\times f(t_{1,9})+ \frac{\lambda _{1,9}}{\lambda _{1,9}-\lambda _{9,3}}\times f(t_{9,3}) \end{aligned} \end{aligned}$$
(1)

Therefore, for path \(r_{2}\), the contact probability from \(v_{A1}\) to \(v_{A3}\) within \(T_{tl}\) can be estimated as

$$\begin{aligned} \begin{aligned} p(t_{r_{2}}\le T_{tl})&=\int _{0}^{T_{tl}}f(t_{r_{2}})dt \\&=\frac{\lambda _{9,3}}{\lambda _{9,3}-\lambda _{1,9}}\times \left( 1-e^{-\lambda _{1,9}\cdot T_{tl}}\right) + \frac{\lambda _{1,9}}{\lambda _{1,9}-\lambda _{9,3}}\times \left( 1-e^{-\lambda _{9,3}\cdot T_{tl}}\right) \end{aligned} \end{aligned}$$
(2)

Since the quality of a path can be measured by the contact probability along the path, the best forwarding path should provide the maximum contact probability. Let \(P_{max}(v_{A1}, v_{A3})\) be the maximum contact probability from \(v_{A1}\) to \(v_{A3}\), \(P_{max}(v_{A8}, v_{A3})\) denotes the maximum contact probability from \(v_{A8}\) to \(v_{A3}\). If \(P_{max}(v_{A8}, v_{A3}) < P_{max}(v_{A1}, v_{A3})\), \(v_{A1}\) can refuse to select \(v_{A8}\) as the next hop for M. However, it is a rough strategy, and may miss suitable relay nodes. Since the similarity centrality indicates future collaborations between a pair of nodes, it is unwise for the forwarding scheme to ignore similarity centrality. For example, if \(P_{max}(v_{A1}, v_{A3})=0.50\), \(P_{max}(v_{A8}, v_{A3})=0.48\), but \(C_{s}(v_{A1}, v_{A3})=0.3\), \(C_{s}(v_{A8}, v_{A3})=0.6\), in our view, \(v_{A1}\) should select \(v_{A8}\) as the next hop. As a result, the utility function for \(v_{A3}\in \mathbb {N}_{P}^{'}(v_{A1})\wedge v_{A3}\in \mathbb {N}_{P}^{'}(v_{A8})\) is

$$\begin{aligned} U_{i}(v_{d})=P_{max}(v_{i}, v_{d})+ C_{s}(v_{i}, v_{d}) \end{aligned}$$
(3)

From Eq. (3), it can be observed that there exists a special case, where \(P_{max}(v_{i}, v_{d}) = 1.0\) and \(C_{s}(v_{i}, v_{d}) = 0.0\). In our opinion, it is reasonable. From the introduction to similarity centrality (\(C_{s}(v_{i}, v_{j})\)) in Sect. 2.2, it can be known that \(C_{s}(v_{i}, v_{d}) = 0.0\) means \(\mathbb {N}_{P}^{'}(v_{i})\cap \mathbb {N}_{p}^{'}(v_{d}) = \{\}\), i.e., there are no common primary neighbors coming from \(C_{m}(v_{s})\) between \(v_{i}\) and \(v_{d}\). \(P_{max}(v_{i}, v_{d}) = 1.0\) means that \(v_{i}\) will contact with \(v_{d}\) certainly and directly. \(v_{i}\) will deliver the message to \(v_{d}\) by itself.

Fig. 3
figure 3

Intra-community egocentric network

As for the case \(v_{A3}\notin \mathbb {N}_{P}^{'}(v_{A1})\wedge v_{A3}\notin \mathbb {N}_{P}^{'}(v_{A8})\), since the destination \(v_{A3}\) does not exist in the intra-community egocentric network around \(v_{A1}\) and \(v_{A8}\), forwarding paths can not be found. Betweenness centrality indicates the ability of a node to facilitate or limit interactions between the nodes it links. A node with high betweenness plays the role of traffic hub in social networks. Consequently, in this case, \(v_{A1}\) compares its egocentric betweenness which is based on intra-community egocentric network with that of \(v_{A8}\), and the utility function is

$$\begin{aligned} U_{i}(v_{d})=C_{B}^{'}(v_{i}) \end{aligned}$$
(4)

The forwarding strategy for Scheme A is named as egocentric network focused forwarding scheme. Its detailed operation is described in Algorithm 2.

figure h

Scheme B: \(C_{m}(v_{s})\ne C_{m}(v_{d})\)

If \(v_{s}\) and \(v_{d}\) come from different communities, from the perspective of \(v_{d}\), message M should be delivered to \(C_{m}(v_{d})\) firstly, and further relayed as the case \(C_{m}(v_{s})= C_{m}(v_{d})\). Let \(v_{i}\) be a carrier of M. When \(v_{i}\) encounters \(v_{j}\), there exist the following cases:

  1. 1.

    if \(v_{i}\in C_{m}(v_{d}) \wedge v_{j}\in C_{m}(v_{d})\), focusing on intra-community egocentric network and operating as Algorithm 2;

  2. 2.

    if \(v_{i}\in C_{m}(v_{d}) \wedge v_{j}\notin C_{m}(v_{d})\), \(v_{i}\) retains M;

  3. 3.

    if \(v_{i}\notin C_{m}(v_{d}) \wedge v_{j}\in C_{m}(v_{d})\), \(v_{i}\) selects \(v_{j}\) as the next hop without utility value comparison;

  4. 4.

    if \(v_{i}\notin C_{m}(v_{d}) \wedge v_{j}\notin C_{m}(v_{d})\), focusing on inter-community egocentric network and operating as Algorithm 2.

The forwarding strategy for Scheme B is named as community aware forwarding strategy and its detailed operation is given in Algorithm 3.

figure i

4 Performance analysis

In multicasting, a message has a group of destinations which are randomly distributed in different communities. As described in Sect. 3, ENCAR contains two forwarding schemes to cope with the cases where the source and destination of a message come from the same community or different communities individually. Thus, the performance of ENCAR will also be different in the two cases.

Case A: \(C_{m}(v_{s})=C_{m}(v_{d})\)

In this case, from the perspective of \(v_{d}\), the dissemination of message M is restrained in \(C_{m}(v_{s})\). Let \(r(M)=\{v_{s}, v_{1}, \ldots , v_{i}, \ldots , v_{n}, v_{d} \}\) denote the forwarding path, and \(v_{i}\) (\(i = 1, 2, \ldots , n\)) be the relay node. As is mentioned in Sect. 2.2, the inter-contact time between a pair of nodes follows an exponential distribution. We denote \(\lambda _{i}\) as the contact ratio between \(v_{i-1}\) and \(v_{i}\). If \(t_{i}\) is the inter-contact time between \(v_{i-1}\) and \(v_i\), the PDF of \(t_{i}\) can be formulated as \(f(t_{i})=\lambda _{i}\times e^{-\lambda _{i}\cdot t_{i}}\). It is found that there are n relay nodes in r(M), the end-to-end delay can be represented as \(w_{n}=\sum \nolimits _{i=1}^{n+1}t_{i}\), and the PDF of \(w_{n}\) can be formulated as

$$\begin{aligned} f(w_{n})=f(t_{1})\otimes f(t_{2})\otimes \cdots \otimes f(t_{n+1}) \end{aligned}$$
(5)

According to the property of convolutions, Eq. (5) can be written as

$$\begin{aligned} f(w_{n})=\sum \limits _{i=1}^{n+1}C_{i}^{(n+1)}\times f(t_{i}) \end{aligned}$$
(6)

where \(C_{i}^{(n+1)}=\prod \nolimits _{j=1,j\ne i}^{n+1}\frac{\lambda _{j}}{\lambda _{j} - \lambda _{i}}\). The delivery ratio of M within \(T_{tl}\) can be represented as

$$\begin{aligned} \begin{aligned} p_{d}^{n}(w_{n}<T_{tl})&=\int _{0}^{T_{tl}}f(w_{n})dt=\sum \limits _{i=1}^{n+1}C_{i}^{(n+1)}\times \left( \int _{0}^{T_{tl}}f(t_{i})dt_{i}\right) \\&=\sum \limits _{i=1}^{n+1}C_{i}^{(n+1)}\times \left( 1-e^{-\lambda _{i}\cdot T_{tl}}\right) ,\quad n\le (m_{s}-2) \end{aligned} \end{aligned}$$
(7)

where \(m_{s}\) is the number of nodes in \(C_{m}(v_{s})\).

Here the total delay of successful deliveries, not the whole delay of paths, is the major concern. Since the paths on which there are no messages delivered, or unsuccessfully delivered if any, are of little importance, it makes no sense in computing their delay. If the delivery ratio of message M within \(T_{tl}\) is given, the average delivery ratio, end-to-end delay and hop count in this case are

$$\begin{aligned} \begin{aligned}&P_{s,d}^{1}=\sum \limits _{n=0}^{m_{s}-2}p_{d}^{n}\\&W_{s,d}^{1}=\sum \limits _{n=0}^{m_{s}-2}w_{n}\times p_{d}^{n}\\&H_{s,d}^{1}=\sum \limits _{n=0}^{m_{s}-2}n\times p_{d}^{n} \end{aligned} \end{aligned}$$
(8)

Case B: \(C_{m}(v_{s})\ne C_{m}(v_{d})\)

In this case, for message M, from the perspective of \(v_{d}\), M should be delivered to \(C_{m}(v_{d})\) firstly, then, disseminated as \(C_{m}(v_{s}) = C_{m}(v_{d})\). Therefore, the worst dissemination is that except \(v_{s}\) and \(v_{d}\), all the other nodes in the network become the relay of M. Suppose that there are n relay nodes on the forwarding path, the end-to-end delay from \(v_{s}\) to \(v_{d}\) can be represented as \(w_{n}=\sum \nolimits _{i=1}^{n+1}t_{i}\), and the delivery ratio can be formulated as

$$\begin{aligned} \begin{aligned} p_{d}^{n}(w_{n}<T_{tl})=\sum \limits _{i=1}^{n+1}C_{i}^{(n+1)}\times \left( 1-e^{-\lambda _{i}\times T_{tl}}\right) ,\quad n\le (m_{all}-2) \end{aligned} \end{aligned}$$
(9)

where \(m_{all}\) is the total number of nodes in network. The average delivery ratio, end-to-end delay, and hop count in this case are

$$\begin{aligned} \begin{aligned}&P_{s,d}^{2}=\sum \nolimits _{n=0}^{m_{all}-2}p_{d}^{n}\\&W_{s,d}^{2}=\sum \nolimits _{n=0}^{m_{all}-2}w_{n}\times p_{d}^{n}\\&H_{s,d}^{2}=\sum \nolimits _{n=0}^{m_{all}-2}n\times p_{d}^{n} \end{aligned} \end{aligned}$$
(10)

Let N be the number of targeted receivers, C denote the number of communities. Because of the random distribution of receivers, the average number of the receivers that are in \(C_{m}(v_{s})\) is \(N_{1}=\frac{N}{c}\), while the average number of the receives that are in other communities is \(N_{2}=N\times (1-\frac{1}{c})\). Considering the two cases comprehensively, for message M, the average delivery ratio, end-to-end delay, and hop count can be formulated as

$$\begin{aligned} \begin{aligned}&P(M)=\frac{\left( \sum \nolimits _{i=1}^{N_{1}}P_{s,i}^{1}+ \sum \nolimits _{j=1}^{N_{2}}P_{s,j}^{2}\right) }{N}\\&W(M)=\frac{\left( \sum \nolimits _{i=1}^{N_{1}}W_{s,i}^{1}+ \sum \nolimits _{j=1}^{N_{2}}W_{s,j}^{2}\right) }{N}\\&H(M)=\frac{\left( \sum \nolimits _{i=1}^{N_{1}}H_{s,i}^{1}+ \sum \nolimits _{j=1}^{N_{2}}H_{s,j}^{2}\right) }{N} \end{aligned} \end{aligned}$$
(11)

5 Simulation and results analysis

The performance of ENCAR is evaluated on Opportunistic Network Environment (ONE) [36], which is designed specially for DTNs. The importance of mobility model in the simulation for routing algorithms is apparent [3739]. Social based mobility models have community structure and nodes have highly correlated movement patterns [31]. The problem is that no mobility models in ONE can match the node mobility in social network scenarios. It is necessary to make an abstraction of real social network scenarios for the simulation. Therefore, we also present a community based random way point mobility model called CB_RWP to evaluate the performance of the proposed algorithm. CB_RWP can simulate clustering phenomenon in social networks and matches the node mobility in social network scenarios. The performance of ENCAR is evaluated on CB_RWP and is compared with that of Epidemic [40], EBMR [16], STBR [15], DTBR [15], and SDM [13]. The default parameters of EBMR are set as [16], while the default parameters for SDM are defined in [13]. The performance metrics are as follows:

  1. (1)

    Message delivery ratio

    $$\begin{aligned} P_{d} = \frac{N_{d}}{N_{c}} \end{aligned}$$

    where \(N_{d}\) is the number of delivered messages, \(N_{c}\) is the sum of created messages in a run of simulation.

  2. (2)

    Routing overhead

    $$\begin{aligned} O_{d} = \frac{N_{r} - N_{d}}{N_{d}} \end{aligned}$$
    (12)

    where \(N_{r}\) is the sum of relayed times for all created messages in a run of simulation.

  3. (3)

    Average end-to-end delay

    $$\begin{aligned} W_{d} = \frac{\sum \nolimits _{\alpha = 1}^{N_{d}} W_{\alpha }}{N_{d}} \end{aligned}$$

    where \(\sum \nolimits _{\alpha = 1}^{N_{d}} W_{\alpha }\) represents the sum of end-to-end delay for all delivered messages.

  4. (4)

    Average hop count

    $$\begin{aligned} H_{d} = \frac{\sum \nolimits _{\alpha = 1}^{N_{d}}H_{\alpha }}{N_{d}} \end{aligned}$$

    where \(\sum \nolimits _{\alpha = 1}^{N_{d}} H_{\alpha }\) is the sum of hop count for all delivered messages.

The default simulation parameters are set as is listed in Table 3.

Table 3 Default simulation parameters

5.1 CB_RWP: Community based random way point mobility model

CB_RWP can simulate clustering phenomenon in social networks. The working mechanism of CB_RWP is illustrated in Fig. 4. There are 4 communities: A, B, C and D, which represent the districts or villages in a region. To describe the mobility of CB_RWP, we choose node \(v_{A1}\), which comes from community A, as an example. When \(v_{A1}\) moves, its motion can be divided into two cases:

  • if the current position of \(v_{A1}\) is in the range of community A, it will stay in A with probability \(P_{s}\) or departs from A to other communities with probability \(1-P_{s}\);

  • if \(v_{A1}\) is not in community A, it will return to A with probability \(P_{s}\) or stay in other communities with probability \(1-P_{s}\).

\(P_s\) is called staying probability, and (\(1-P_{s}\)) is named as departing probability. In social networks, \(P_{s}\) is always larger than \(1-P_{s}\). Thus, nodes coming from the same community contact more frequently than those coming from different communities. As is known, in RWP, node randomly chooses its next waypoint in the simulation area. Thus, whether a node can stay in its current community or not is not only determined by \(P_s\), but also is related to the distance between its current waypoint and the newly chosen waypoint. Without loss of generality, we take node \(v_{A1}\) in community A as an example. Let the minimum distance between node \(v_{A1}\) and the boundry of A be \(r_{min}\), and the maximum distance be \(r_{max}\), as is shown in Fig. 4. If the distance between node \(v_{A1}\) and its next waypoint is D, we have the following cases: 1) if \(0\le D \le r_{min}\), node \(v_{A1}\) will definitely stay in community A no matter how \(P_s\) is valued; 2) if \(r_{min}< D \le r_{max}\), whether node \(v_{A1}\) will stay in A or depart from A is determined by \(P_s\); and 3) if \(D > r_{max}\), node \(v_{A1}\) will certainly depart from A even if \(P_s\) is set to be 1. By this mechanism, CB_RWP simulates clustering phenomenon. In reality, a community consists of the individuals who have common interests or dwell in the same region. However, in DTNs, a contact between two nodes depends on the encounter chance. Similar to [10], a community mentioned here is a clustering of individuals with both common interest and geo-location where individuals mostly contact each other. Because CB_RWP is based on geo-location of nodes, its communities can easily be detected. According to its geo-location, each node can find its own communities. In addition, in social networks, communities are not isolated from others, there always exist some overlapped areas. As is illustrated in Fig. 4, the central overlapped part represents the public place, such as supermarkets or central parks, where nodes will have more opportunities to communicate with nodes which lie in the other communities, so communication traffic here is always heavier than in other places.

Fig. 4
figure 4

An illustration of CB_RWP

ENCAR utilizes the node mobility (mentioned in Sect. 2.2) where the inter-contact time between a pair of nodes follows an exponential distribution. CB_RWP is proposed to evaluate the performance of ENCAR, so CB_RWP should match the node mobility. We theoretically prove that CB_RWP possesses this feature, and further validate the proof by \(\chi ^{2}\) hypothesis tests. The results of the validation show that under the significance level \(\alpha = 0.95\), the average acceptance ratio is higher than \(90\%\); under the significance level \(\alpha = 0.90\), the average acceptance ratio is higher than \(80\%\). A formal description of our proof and validation is given in “Appendix”.

In general, STBR and DTBR choose end-to-end delay as the metric to set up multicasting tree. However, the performance of these multicasting algorithms are both evaluated on CB_RWP, which presents the mobility of individuals in social networks. To compare the performance of these algorithms objectively, we also introduce the node mobility in social network scenarios to these algorithms. For example, we change the metric for STBR and DTBR. In our simulations, STBR and DTBR choose the contact probability towarding destinations as the metric to set up multicasting tree, which consists of the forwarding paths that posses maximum contact probability from source (or relay nodes in DTBR) to destinations.

5.2 Impact of staying probability \(P_{s}\) on routing performance

It is obvious that the larger \(P_{s}\) is, the less freedom nodes have. In our simulation, \(P_{s}\) is increased from 0.1 to 0.9 with 0.2 as the increment. Figure 5 shows the impact of \(P_{s}\) on routing performance. It can be observed that with the increase of \(P_{s}\), message delivery ratio decreases almost with all the algorithms, while the average end-to-end delay increases. This is the influence of \(P_{s}\). With the increase of \(P_{s}\), nodes prefer staying in their communities to departing from them. As a result, the contact opportunity among nodes decreases naturally, and finally leads to the ascent of end-to-end delay. However, the duration of simulation is so long that every node still has many opportunities to enter other communities when \(P_{s}\) is set to 0.9. As a matter of course, this is the reason why the variation of performance in these algorithms are unconspicuous with the increase of \(P_{s}\). From the simulation results, it can be seen that EBMR performs poorly. The reason may be that the forwarding scheme of EBMR does not match the characteristics of social network scenarios, and the initialization of parameters is the same for all nodes. Among these algorithms, routing overhead of Epidemic is the highest because of the contact and infect forwarding strategy.

Fig. 5
figure 5

Impact of \(P_{s}\) on routing performance. a Staying probability versus message delivery ratio. b Staying probability versus routing overhead. c Staying probability versus average end-to-end delay. d Staying probability versus average hop count

The performance of STBR is close to that of DTBR, but the routing overhead of DTBR is much smaller, so the simulation result does not accord with the analysis in [15]. From the description of performance metrics, it can be known that the routing overhead is defined only according to the expect-received multicasting messages,regardless of the messages that are used to collect global information. Hence, this situation is also reasonable, and the routing overhead of STBR or DTBR can be much smaller than that of Epidemic. Similarly, in ENCAR, a node exchanges PI with its PNs (The description of PNs and PI are given in Sect. 3.1) to construct egocentric network, which may result in the increase of routing overhead. Generally, the exchanging of PI among nodes is an implementation problem, which is similar to the exchanging of summary vectors in Epidemic. In ENCAR, every node constantly maintained its PI and when two nodes encounter, they just exchange PI directly. Besides, in sparse DTNs, every node only has a few neighbors, and the maximum overhead of exchanging PI is smaller than 0.8% in comparison with the overhead of relaying all the messages, which is negligible. What is more, the routing overhead given in Eq. (12) defined in ONE is just related to delivered messages. Therefore, this part of overhead is not considered in our simulation. From the forwarding mechanism of SDM, we know that SDM tries to guarantee message delivery ratio by setting the expected delivery ratio, while ENCAR aims to achieve a good performance with not only message delivery ratio but also other performance metrics. As a result, SDM may perform better than ENCAR on message delivery ratio, but ENCAR may have a better performance on other evaluation metrics. This is verified by the simulation results. From Fig. 5, it can be observed that the performance of ENCAR is similar to that of SDM, SDM outperforms in message delivery ratio and end-to-end delay, while ENCAR achieves lower routing overhead.

5.3 Impact of community overlapped ratio \(R_{o}\) on routing performance

In social networks, communities are widely distributed and most of them are not isolated from others. Overlapped part always exists between some communities. As is illustrated in Fig. 4, the central part represents the overlapped area, in which nodes contact with each other more frequently than in other parts. Thus, traffic is heavier in overlapped place. The overlapped ratio is defined as \(R_{o}=\frac{S_{o}}{S}\), where \(S_{o}\) is the area of overlapped part as aforementioned, and S is the area of the simulation scenario. \(R_{o}\) varies from 0.0 to 1.0 in our simulation. Clearly, any community is isolated from others if \(R_{o}\) is set to 0.0, while all the communities mix with each other when \(R_{o}\) is set to 1.0. As is described above, nodes in the overlapped part will have a higher probability to communicate with nodes in the other communities, which corresponds to the condition that the larger the \(S_o\), the more the nodes in the overlapped part. As a result, it will definitely increase the traffic through the whole network. As is shown in Fig. 4, the default overlapped ratio is 0.04 in CB_RWP. In our simulation, \(R_o\) is increased from \(0\%\) to \(100\%\) nonlinearly. In our view, the rise of \(R_{o}\) has double effects on the performance of these algorithms. On one hand, with the increase of \(R_o\), communities tend to mix together. It means that the whole network tends to be one community. Obviously, in such case, contact opportunities among the nodes coming from different communities is greater than in the cases where \(R_{o}\) is low, which is helpful to deliver messages. But on the other hand, as the buffer size of every node is limited, buffer resource of some nodes will be exhausted more easily when messages are disseminated more frequently, and we can image that some messages will be abandoned. Thus, the increase in \(R_o\) may be an adverse factor for message delivery ratio. Due to the offset between positive effect and negative effect, variation of \(R_{o}\) may have a little influence on the performance of these algorithms. This is supported by Fig. 6, in which we can see that the performance of every algorithm does not change a lot with the increase of \(R_o\).

Fig. 6
figure 6

Impact of \(R_{o}\) on routing performance. a Overlapped ratio versus message delivery ratio. b Overlapped ratio versus routing overhead. c Overlapped ratio versus average end-to-end delay. d Overlapped ratio versus average hop count

5.4 Impact of event interval \(T_{e}\) on routing performance

Load of a network is closely related to event interval. If \(T_{e}\) is short, lots of messages will be generated, and the network load will be heavy. In our simulation, we increase \(T_{e}\) from 10s to 90s with 20s as the increment. The impact of \(T_{e}\) on routing performance is illustrated in Fig. 7. It can be observed that ENCAR performs stably. This indicates that ENCAR is effective for the variation of \(T_{e}\). Among these algorithms, the performance of EBMR is the poorest. The message delivery ratio of EBMR is always lower than 0.3, no matter how \(T_{e}\) varies. The reason is that EBMR is evaluated on CB_RWP, which is a social community based mobility model, but the utility function of EBMR does not adopt social network analysis, and is updated for all nodes with the same initial parameters. It does not match the characteristics of social network scenarios. As aforementioned, for searching forwarding path in STBR and DTBR, we modify the metric to contact probability, which is calculated based on the contact ratio between two nodes. In social networks, contact ratio between a pair of nodes can reflect relationship of two nodes accurately. In our view, this is the reason why STBR and DTBR can perform well.

Fig. 7
figure 7

Impact of \(T_{e}\) on routing performance. a Event interval versus message delivery ratio. b Event interval versus routing overhead. c Event interval versus average end-to-end delay. d Event interval versus average hop count

5.5 Impact of buffer size \(S_{b}\) on routing performance

Buffer size restricts the maximum number of carried messages in a node. Buffer of a node will overflow quickly if \(S_{b}\) is very small. As a result, congestion will occur and some messages will be discarded. With the increase of buffer size, a node can carry more messages, and the probability for a message to be discarded will decrease. In our simulation, \(S_b\) is increased from 1M to 9M with 2M as the increment. Figure 8 shows the impact of \(S_{b}\) on routing performance, we can see that the performance of ENCAR hardly changes with the variation of buffer size. Hence, we can assert that ENCAR has a strong ability to forward messages and deal with congestion. In addition, the performance of other algorithms also tend to be stable when the buffer size is larger than 3M, a situation which indicates that CB_RWP is effective for message dissemination. Thus, the rationality of CB_RWP is also supported by this result.

Fig. 8
figure 8

Impact of \(S_{b}\) on routing performance. a Buffer size versus message delivery ratio. b Buffer size versus routing overhead. c Buffer size versus average end-to-end delay. d Buffer size versus average hop count

5.6 Impact of node density on routing performance

Both the variation of node number and network area will affect node density. The ascent of node density can be regarded as an advantage for message dissemination. With the increase of node density, contact opportunity among nodes will also rise, and messages can be delivered more easily. However, from the perspective of buffer resource, the increase of node density will be a disadvantage. Since DTNs are buffer resource limited, the buffer size of every node is finite. With the increase of contact among nodes, messages are disseminated more frequently, and as a result, buffer resource of some popular nodes will be exhausted more quickly. Finally, some messages will be discarded. This will lead to the decrease of message delivery ratio. Moreover, from the definition of routing overhead, it can be known that routing overhead will be higher when message-relayed time increases. Hence, increase of nodes also has a negative impact on routing overhead. In our simulation, we increase the node density from \(2.5/(1000\times 1000\,\mathrm{{m}}^2)\) to \(12.5/(1000\times 1000\,\mathrm{{m}}^2)\) with 2.5 as the increment. From Fig. 9, we observe that with the increase of node density, the routing overhead of these algorithms increases. It matches the theoretical analysis. However, the anomaly is that the average end-to-end delay of EBMR and Epidemic descend with the ascent of node density. In our view, the reason is that with the increase of node density, the network load is so heavy for EBMR and Epidemic that only the messages geographically adjacent to their destinations are delivered successfully. For example, at time t, a new message is generated by \(v_{i}\), \(v_{j}\) is a destination of the message. Almost at the same time, \(v_{i}\) encounters \(v_{j}\). Then, the new generated message is delivered to \(v_{j}\) quickly. It means that the new generated message is geographically adjacent to its destination. Due to the short waiting time of these delivered messages, their average end-to-end delay is also short.

Fig. 9
figure 9

Impact of node density on routing performance. a Node density versus message delivery ratio. b Node density versus routing overhead. c Node density versus average end-to-end delay. d Node density versus average hop count

5.7 Impact of time to live \(T_{tl}\) on routing performance

In our simulations, every message is constrained by \(T_{tl}\). A message will be discarded if its \(T_{tl}\) expires. In multicasting, though some copies of a certain message may exist in the network, they have the same \(T_{tl}\), which is subtracted by relay nodes. Messages will live a long time if their \(T_{tl}\) is large, as a result, the buffer in message carrier will be exhausted quickly and network load will become heavy. In our simulation, \(T_{tl}\) is increased from 15m to 240m nonlinearly. The impact of \(T_{tl}\) on routing performance is shown in Fig. 10. It can be seen that the performance of ENCAR, SDM and EBMR tend to be stable with the increase of \(T_{tl}\). Every algorithm, except Epidemic, performs poorly when \(T_{tl}\) is set to 15m. It is due to the fact that these algorithms are not based on the contact and infect forwarding scheme. Compared with Epidemic, messages are disseminated more slowly. Thus, \(T_{tl}\) is too short for these algorithms to deliver messages successfully, when it is set to 15 m. A lot of messages are discarded halfway.

Fig. 10
figure 10

Impact of \(T_{tl}\) on routing performance. a Time to live versus message delivery ratio. b Time to live versus routing overhead. c Time to live versus average end-to-end delay. d Time to live versus average hop count

6 Conclusion

In some social network scenarios, for examples, disaster response networks and military pocket switch networks, where portable devices are carried by disaster rescuers and soldiers, relaying messages consumes a signicant amount of energy for nodes. However, mobile nodes in these DTNs typically have small buffer storage and limited battery reserves, and replacing/recharging the batteries of drained nodes is usually infeasible or expensive. For routing algorithms, it is important to reduce routing overhead in nodes which have limited energy and small buffer storage. Generally, low routing overhead can decrease the number of relaying messages and the size of unnecessary buffer occupation, and improve the efficiencies of energy and buffer of nodes.

In this paper, we investigate the multicasting issue for the application of DTNs in sparse social network scenarios, and propose an egocentric network focused community aware multicast routing (ENCAR). ENCAR is a hierarchical routing algorithm, which considers clustering phenomenon in social networks and differently deals with the cases whether the source and destination of a message come from the same community or not. To optimize the selection of relay nodes for multicasting messages, ENCAR develops the utility based forwarding strategy, and introduces the node mobility in social network scenarios where the inter-contact time between any pair of nodes follows an exponential distribution for the estimation of contact probability between two nodes. The establishment of utility function is a challenge for utility based forwarding strategy. Social network centrality analysis and contact probability are utilized to set up the utility function for ENCAR. To simulate the performance of ENCAR, we also propose a community based random way point mobility model called CB_RWP, whose correctness is theoretically proven and further effectiveness is validated by \(\chi ^{2}\) test, in our work. The performance of ENCAR is mathematically analyzed, and evaluated on the ONE simulator. Compared to Epidemic, EBMR, SDM, STBR, and DTBR, the performance of ENCAR is close to that of SDM, but better than other algorithms. ENCAR achieves lower routing overhead, while SDM outperforms in message delivery ratio.