# A *K* self-adaptive SDN controller placement for wide area networks


## Abstract

As a novel architecture, software-defined networking (SDN) is viewed as the key technology of future networking. The core idea of SDN is to decouple the control plane and the data plane, enabling centralized, flexible, and programmable network control. Although local area networks like data center networks have benefited from SDN, it is still a problem to deploy SDN in wide area networks (WANs) or large-scale networks. Existing works show that multiple controllers are required in WANs with each covering one small SDN domain. However, the problems of SDN domain partition and controller placement should be further addressed. Therefore, we propose the spectral clustering based partition and placement algorithms, by which we can partition a large network into several small SDN domains efficiently and effectively. In our algorithms, the matrix perturbation theory and eigengap are used to discover the stability of SDN domains and decide the optimal number of SDN domains automatically. To evaluate our algorithms, we develop a new experimental framework with the Internet2 topology and other available WAN topologies. The results show the effectiveness of our algorithm for the SDN domain partition and controller placement problems.

## Keywords

Software-defined networking (SDN) · Controller placement · *K* self-adaptive method

## CLC number

TP393

## 1 Introduction

Historically, control plane functions in traditional networks have been tightly coupled to the data plane. The software-defined networking (SDN) concept (Kirkpatrick, 2013) has caused a paradigm shift in communication networks, which allows the separation of control and data planes, i.e., moving complex functions from devices in a network to sophisticated dedicated controller instances. The most popular example of SDN is OpenFlow (McKeown *et al.*, 2008), where an OpenFlow controller defines rules for switches on how to handle packets. Thus, the controller placement problems are becoming increasingly important.

For example, where should controllers *C*_{1} and *C*_{2} be placed in a WAN, and which controller, *C*_{1} or *C*_{2}, should be selected by the OpenFlow switch *S*_{1}? These are still open questions and have attracted much attention recently.

A large-scale network is usually partitioned into several small ones due to numerous reasons, e.g., privacy, scalability, incremental deployment, and security (Xie *et al.*, 2012; Lin *et al.*, 2013). For SDN partitioning, a large network is likely to be divided into multiple SDN domains. Each SDN domain runs one controller, such as Floodlight (http://www.projectfloodlight.org/floodlight/). An SDN domain can be a sub-network in a data center (DC), an enterprise network, or an autonomous system (AS). In this study, we consider the ‘best’ controller placement that minimizes propagation delays and improves reliability in a WAN partitioned into multiple AS domains.

The controller placement problem was first studied by Heller *et al.* (2012). The authors examined the impacts of placements on average latency and worst-case latency on real topologies. However, they treated the WAN as a whole rather than as multiple SDN domains and ignored the reliability of each controller. While propagation latency is certainly a significant design metric, we argue that reliability and load balancing are also essential for operational SDNs. Heller *et al.* (2012) assumed that nodes are always assigned to their nearest controller using latency as the metric. In average-latency placement, the number of nodes per controller is imbalanced, ranging from 3 to 13 when the number of controllers is 4 (Fig. 2). The more nodes a controller has to control, the heavier the load on that controller will be. From Fig. 2, we can see the imbalance between controller 2 and controller 3. With regard to controller failure tolerance, Hock *et al.* (2013) optimized the placement of controllers, called Pareto-based optimal controller placement (POCO). However, their method causes inter-controller broadcast storms and needs time to reassign nodes. Heller *et al.* (2012) and Hock *et al.* (2013) assumed that the mapping between a switch and a controller is configured dynamically, as in ElastiCon (Dixit *et al.*, 2013). Dynamic allocation can improve the scalability and reliability of an SDN deployed in a LAN, but it is not suitable for a WAN. Usually, the propagation latency is larger than the queuing delay in the network, and the dynamic mapping between a switch and a remote controller will significantly affect the response time of the WAN. Moreover, switch migrations are complex tasks with considerable overhead.

Motivated by these analyses, the SDN domain partition problem for a WAN has been studied (Xiao *et al.*, 2014). We first use spectral clustering to partition the WAN into several SDN domains, each with its own controller, similar to the domain name system (DNS). A single controller can be enough to manage a small network, and the backup controller can reduce the impact of failure of a single controller. There are at least four reasons why we adopt the divide-and-conquer philosophy: (1) It facilitates load balancing and ensures reliability in the SDN infrastructure; (2) The partition of SDN domains can help reduce inter-controller broadcast storm, especially in large-scale WANs; (3) There are no latencies in reassigning nodes to their new controller because of static allocation; (4) It fits the layered model of a WAN and is easy for maintenance and expansion. In contrast, an over-complicated controller plane is hard to achieve and maintain.

Although the placement algorithm (Xiao *et al.*, 2014) may obtain the SDN domain partition results, the number of SDN domains (*K*) needs to be set manually. In this study, we focus on finding a *K* self-adaptive SDN controller placement for WAN and exploiting the structure of eigenvectors to determine automatically the number of SDN domains.

Compared with our previous work (Xiao *et al.*, 2014), the major contributions of this study are as follows:

1. We propose a *K* self-adaptive SDN controller placement for a WAN based on the matrix perturbation theory.
2. We propose an alternative approach that relies on the structure of eigenvectors to estimate the optimal number of SDN domains.

Experimental results show that our methods can solve the SDN controller placement problem and determine the number of SDN domains automatically.

## 2 Related work

Currently, there are mainly two categories of controllers in the SDN control plane: single controller and distributed controllers.

### 2.1 Single controller

Examples of the single controller include Floodlight (http://www.projectfloodlight.org/floodlight/), Maestro (Cai *et al.*, 2010), SNAC (http://groups.geni.net/geni/raw-attachment/wiki/GEC9Demo-Summary/), Trema (http://trema.github.io/trema/), etc. Floodlight is an enterprise-class, Apache-licensed, Java-based OpenFlow controller, supported by a community of developers including a number of engineers from Big Switch Networks. NOX is a typical example of controller realization, aiming to simplify the management of switches in enterprise networks; its constituent components, control granularity, switch abstraction, and basic operation have been discussed for NOX-based networks. Beacon (Erickson, 2013) is a fast, cross-platform, modular, Java-based OpenFlow controller that supports both event-based and threaded operations. Shalimov *et al.* (2013) showed that Beacon performs well among the evaluated controllers. Cai *et al.* (2010) proposed Maestro, which keeps a simple programming model for programmers and exploits parallelism throughout, with additional throughput optimization techniques. These physically centralized control planes can be adapted for DCs but are not suitable for wide multi-technology multi-domain networks.

### 2.2 Distributed controllers

Recently, the concept of a physically distributed SDN control plane has been proposed, including DISCO (Phemius *et al.*, 2014), Onix (Koponen *et al.*, 2010), HyperFlow (Tootoonchian and Ganjali, 2010), DIFANE (Yu *et al.*, 2010), Devolved (Tam *et al.*, 2011), and ElastiCon (Dixit *et al.*, 2013). Kreutz *et al.* (2015) found that most distributed controllers offer weak consistency semantics; i.e., data updates on distinct nodes will eventually be propagated to all controller nodes. This implies that there is a period of time in which distinct nodes may read different values (old or new) for the same property. Moreover, a controller takes more time to communicate with other controllers and switches in WANs with long propagation delays, degrading system performance significantly.

Two key problems in SDNs with distributed controllers are: (1) how to obtain a global view of the entire network at each controller so as to maintain a consistent network state, and (2) how to deploy an optimal number of controllers such that the best performance can be achieved.

To address the first problem, Yin *et al.* (2012) have proposed the inter-SDN (SDNi) domain protocol, which acts as an interface mechanism to coordinate the behaviors of SDN controllers in the SDN domains. However, SDNi still lacks a semantic network model and an ontology-based model to ensure the extensibility of its transport mechanisms and syntax. Lin *et al.* (2013) have proposed the east-west bridge solution to enable different controllers from different vendors to work together. They have deployed this solution with two use cases to four SDN networks, such as Internet2 in the USA and CERNET in China. In this study, we focus on the controller placement problem in WANs. We assume that the first problem has been solved perfectly, and the controller working in each SDN domain can exchange information.

The second problem is about controller placement. Heller *et al.* (2012) over-simplified the problem by modeling it as a facility or a warehouse location problem, in which only the latency of transmission from nodes to their controllers was considered and the WAN topology was treated as a whole rather than as multiple SDN domains. These lead to heavy load or failure at some controllers near the switches with intensive traffic. In view of the characteristics of the traditional WAN, a divide-and-conquer philosophy is desired for the deployment of SDNs in WANs. A large WAN is always partitioned into several small SDN domains to ensure stability, privacy, management, security, and so on. Therefore, it is necessary to develop a method to address these challenges for the SDN controller placement problem in a WAN. In this study, we focus on using a *K* self-adaptive SDN controller placement to partition a WAN topology into several small SDN domains, as well as on placing controllers to achieve low latency and high reliability in each SDN domain.

## 3 Problem description and system model

In this section, we briefly introduce the definition of SDN domain partition for a WAN and discuss the optimization placement metrics we intend to study.

### 3.1 Problem description

A WAN is a network that covers a broad area, spanning many regions or countries. In SDN, the controller acts as an information collector and operator for its managed switches. In this regard, the response time between a switch and its controller significantly affects the performance of the SDN. The response time of the controller is determined by the propagation delay and the controller's load. For example, as shown in Fig. 2, the propagation delay between Houston and Nashville is about 5.01 ms, and that between Houston and El Paso is about 5.44 ms. In average-latency placement, Heller *et al.* (2012) considered only the propagation delay, so the switches in Houston were assigned to the third controller deployed in Nashville rather than the second controller deployed in El Paso. From Fig. 2, we can see the imbalance between the second and third controllers. When the third controller is overloaded, the queuing delay in the network exceeds the propagation latency and keeps rising. Thus, simply assigning switches to their closest controller, as Heller *et al.* (2012) did, may lead to controller overload and instability.

1. Balanced partition

Load balancing and reliability are two important indicators of controller performance. Tootoonchian *et al.* (2012) focused on controller performance and found the limitations of a controller's service ability. With enough delay and overload of the controller, real-time tasks become infeasible, while others may slow down unacceptably. By partitioning the WAN into several small balanced SDN domains, the service ability of a controller with fewer and balanced nodes will be improved greatly, and the inter-controller broadcast storm will be reduced sharply, which greatly reduces the queuing delay.

2. Propagation latency

After considering the reliability of the partition, network latency is certainly a significant design metric in a long-propagation-delay WAN. Network latency includes four parts: propagation latency, processing latency, queuing latency, and transmission latency. For WANs, the propagation latency is much longer than the other latencies, whose effect is small enough to be ignored. Regardless of the exact form, in the case of a WAN, the propagation delay affects the controller's ability to respond to network events. Following Heller *et al.* (2012), we also narrow our focus to propagation latency and select it as a significant design metric. We assume that propagation latency is the response time of the controllers, and the 'best' placement must ensure that the latency of each SDN domain is minimized.

We need to find a placement solution to balance the load and reduce the latency. In the next subsection, the quantitative analysis of our placement with a global optimization goal will be proposed.

### 3.2 System model

We model the network as a graph, *G*(*S, E*). The node set *S* represents the nodes in the network topology, i.e., the OpenFlow switches deployed to the different cities, and the edge set *E* represents the network links between the cities. We partition *G* into *K* subgraphs, namely, SDN domains *N*_{ i } (*i* = 1, 2,…,*K*).

**Definition 1** If we partition *G* into *K* subgraphs, namely, *N*_{ i } (*i* = 1, 2, …, *K*), then *N*_{ i } can be defined as *N*_{ i }(*S*_{ i }, *E*_{ i }). Clustering the nodes in *S* is equivalent to partitioning the set of vertices *S* into mutually disjoint subsets *S*_{1}, *S*_{2}, …, *S*_{ K } according to some similarity measure, where *K* is the number of SDN domains and *S*_{ i } denotes the *i*th SDN domain. The nodes in *S* are ordered according to the cluster they are in.

We want to find a partition of the SDN domains such that the edges between different clusters have a very low weight (meaning that OpenFlow switches in different clusters are dissimilar) and the edges within a cluster have a high weight (meaning that OpenFlow switches within the same cluster are similar). Furthermore, the controller must be placed at the clustering center to ensure the maximum performance of the sub-network. Obviously, we want many edges within clusters and few edges between clusters. In addition to the minimum cut requirement, we require that the partition be as balanced as possible. This is a typical data clustering problem using a graph model. Inspired by previous work on spectral clustering (Shi and Malik, 2000; Wauthier *et al.*, 2012; Mall *et al.*, 2013; Liu *et al.*, 2014), we propose our methods to solve the SDN partition problem, which can provide balanced partitions and balance the load of each controller.

The weight of an edge, *w*_{ ij }, is a function of the similarity between switches *s*_{ i } and *s*_{ j }. The weighted adjacency matrix of the graph is **W** = (*w*_{ ij })_{i,j=1,2,…,n}. Inspired by the '*N*_{cut}' proposed by Shi and Malik (2000), we can obtain balanced SDN domains that minimize the similarity between sets and maximize the similarity within a set, satisfying a normalized-cut-style partition objective function.

This objective function favors balanced SDN domains and minimizes the number of domain edges, which results in balanced switches and links in each SDN domain.

Our placement model follows the average-latency metric (Heller *et al.*, 2012). In the model, *C* is a given placement solution and dist(*s*, *c*_{ i }) represents the shortest path from node *s* ∈ *N*_{ i } to node *c*_{ i } ∈ *C*.

The key idea is to first identify the partitions with balanced cuts, and then assign the controller location to the center of each partition, which has the shortest paths to all switches in the same SDN partition. We can then use this function to find the 'best' placement solution *C* from the set of all possible placements, i.e., the one with the minimum objective, which balances the load and reduces the latency.
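A minimal sketch of this cost function follows; the per-domain averaging is an assumption about the exact aggregation, and `all_pairs_shortest` stands in for whatever shortest-path routine the evaluation uses.

```python
INF = float("inf")

def all_pairs_shortest(delay):
    """Floyd-Warshall over a link-delay matrix (delay[i][j] = INF if no link)."""
    n = len(delay)
    d = [row[:] for row in delay]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def placement_cost(dist, domains, controllers):
    """Sum over SDN domains of the average switch-to-controller delay,
    where dist[s][c_i] is the shortest path from switch s to controller c_i."""
    return sum(
        sum(dist[s][c] for s in nodes) / len(nodes)
        for nodes, c in zip(domains, controllers)
    )
```

For a 4-node line topology with unit link delays, splitting it into domains {0, 1} and {2, 3} with controllers at nodes 0 and 2 gives an average delay of 0.5 ms per domain.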

In the following section, we introduce an approximation algorithm to optimize the problem by using spectral clustering.

## 4 Controller placement algorithm

Our approach for SDN controller placement is based on concepts from spectral graph theory. The core idea is to use matrix theory and linear algebra to study the properties of the similarity matrix **W** and the Laplacian matrix **L**. The related theory and the idea of using eigenvectors of the Laplacian to find graph partitions can be traced to Shi and Malik (2000), Wauthier *et al.* (2012), Mall *et al.* (2013), and Liu *et al.* (2014). In this section, we first introduce our methods for building the similarity matrix **W** and the Laplacian matrix **L**. This is the first and most important step of the spectral clustering algorithm. Then the *K* self-adaptive method is proposed to decide the optimal number of SDN domains automatically, which helps achieve the partition objective. Lastly, we describe the whole placement algorithm based on spectral theory. To achieve the placement objective, we use the *k*-means method to cluster the nodes and select the center of each domain as the controller location.

### 4.1 Similarity function

In recent years, spectral clustering has become one of the most popular modern clustering algorithms, and it has been applied in machine learning, text summarization, social networks, etc. The success of such algorithms depends heavily on the choice of the similarity matrix * W*. From the analysis of the propagation delay of WAN topology, we tend to select the propagation delay as the weight of the similarity matrix.

Given a graph *G*, the switches *s*_{1}, *s*_{2}, …, *s*_{ n } are deployed at the nodes of the WAN topology, and their similarities *w*_{ ij } can be measured according to a similarity function (which is symmetric and non-negative) of the propagation delay, where *s*_{ i }(lat_{ i }, lon_{ i }) and *s*_{ j }(lat_{ j }, lon_{ j }) represent the latitude and longitude of points *s*_{ i } and *s*_{ j }, respectively, *α* = ∣lat_{ i } − lat_{ j }∣, *β* = ∣lon_{ i } − lon_{ j }∣, *V*_{ c } is the speed of light propagation in optical fibers, and the radius of the Earth is 6378.137 km. We denote the corresponding similarity matrix by **W** = (*w*_{ ij })_{i,j=1,2,…,n}, which can be used to evaluate the propagation latencies between the nodes.
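The similarity computation can be sketched as follows. The Earth radius follows the text; the great-circle formula, the fiber speed *V*_{c} ≈ 2 × 10^{5} km/s, and the Gaussian kernel (with width `sigma`) are assumptions, since the exact kernel form is not reproduced here.

```python
import math

R_EARTH_KM = 6378.137          # Earth radius used in the text
V_FIBER_KM_PER_MS = 200.0      # ~2/3 of c in optical fiber (assumption)

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (latitude, longitude) points, in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)   # from alpha = |lat_i - lat_j|
    dlmb = math.radians(lon2 - lon1)   # from beta  = |lon_i - lon_j|
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * R_EARTH_KM * math.asin(math.sqrt(a))

def prop_delay_ms(si, sj):
    """Propagation delay between switches s_i = (lat, lon) and s_j = (lat, lon)."""
    return great_circle_km(si[0], si[1], sj[0], sj[1]) / V_FIBER_KM_PER_MS

def similarity(si, sj, sigma=5.0):
    """Symmetric, non-negative similarity w_ij: a Gaussian kernel on the
    propagation delay (kernel choice and sigma are illustrative assumptions)."""
    d = prop_delay_ms(si, sj)
    return math.exp(-d * d / (2 * sigma * sigma))
```

For the Houston–El Paso pair, this yields a delay close to the 5.44 ms quoted in Section 3.1.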

The Laplacian matrix **L** = [*L*_{ ij }] for the SDN domain partition is then defined from **W**. In SDN partitioning, the spectral decomposition of **L** can be used to approximately minimize SDN_{cut}, which tries to achieve SDN domains balanced in size.
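The entries of **L** can be written out explicitly; a standard normalized form from the spectral clustering literature cited here (Ng *et al.*, 2001), assumed rather than quoted, is:

```latex
L_{ij} = \frac{w_{ij}}{\sqrt{d_i\, d_j}}, \qquad d_i = \sum_{j=1}^{n} w_{ij},
\quad\text{i.e.,}\quad \mathbf{L} = \mathbf{D}^{-1/2}\,\mathbf{W}\,\mathbf{D}^{-1/2},
```

with the degree matrix \(\mathbf{D} = \mathrm{diag}(d_1, \ldots, d_n)\). Under this form, the leading eigenvectors of **L** approximate the indicator vectors of balanced domains.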

### 4.2 *K* self-adaptive method

Although spectral clustering has many advantages and impressive performances, one of the common shortcomings is that the cluster number must be decided in advance. Some scholars have proposed different adaptive spectral clustering algorithms (Zelnik-Manor and Perona, 2004; Wang *et al.*, 2007). From the overview of their analyses, every data point can be regarded as an attribute sequence made up of all its attribute values. In this way, the similarity between any two points can be measured by the balanced closeness degree of the attribute sequences. Since the calculation of the balanced closeness degree does not need extra parameters, the impact of the parameters is eliminated. However, the methods in Zelnik-Manor and Perona (2004) and Wang *et al.* (2007) have higher cost and time complexities. In this section, we propose an approach that relies on the structure of the eigenvectors to automatically determine the optimal number of SDN domains. Based on the matrix perturbation theory (Bach and Jordan, 2003; Tian *et al.*, 2007; von Luxburg, 2007; Rebagliati and Verri, 2011) and *k*-way partition (Ng *et al.*, 2001), the difference between the *k*th and (*k* + 1)th eigenvalues is called ‘eigengap’, which can be used directly to perform clustering.

Consider the similarity matrix **W** ∈ ℝ^{n×n}. Let *λ*_{1} ≥ *λ*_{2} ≥ … ≥ *λ*_{ k } ≥ … ≥ *λ*_{ n } be its eigenvalues, and **x**_{1}, **x**_{2}, …, **x**_{ k }, …, **x**_{ n } the associated eigenvectors. For simplicity, we call *λ*_{1} ≥ *λ*_{2} ≥ … ≥ *λ*_{ k } the first *k* largest eigenvalues of **W** and **x**_{1}, **x**_{2}, …, **x**_{ k } the first *k* largest eigenvectors of **W**. Matrix **W** can be decomposed as \(\mathbf{W} = \mathbf{X} \boldsymbol{\Lambda} \mathbf{X}^{\rm T}\), where **Λ** = diag(*λ*_{1}, *λ*_{2}, …, *λ*_{ n }) is a diagonal matrix with the non-negative eigenvalues in descending order along the diagonal, that is, *λ*_{1} ≥ *λ*_{2} ≥ … ≥ *λ*_{ n } ≥ 0, and **X** = (**x**_{1}, **x**_{2}, …, **x**_{ n }) is the matrix formed by stacking the eigenvectors of **W** in columns.

Let **M** be a matrix in the subspace spanned by the first *r* columns of **X**, i.e., **M** = (**x**_{1}, **x**_{2}, …, **x**_{ r }). The vectors **M**_{ i } (*i* = 1, 2, …, *n*) are defined as the rows of the truncated matrix **M**. We construct a matrix **P** from **M** by renormalizing each of **M**'s rows to unit length, i.e., *P*_{ ij } = *M*_{ ij }/*σ*_{ i } with \(\sigma_i = (\sum\nolimits_j M_{ij}^2)^{1/2}\). Under the above conditions, we can obtain the following result (Tian *et al.*, 2007):

**Theorem 1** Let *λ*_{1} ≥ *λ*_{2} ≥ … ≥ *λ*_{ n } be the eigenvalues of matrix **W**, and **x**_{1}, **x**_{2}, …, **x**_{ k } be the first *k* eigenvectors of **W**. Let **M** = (**x**_{1}, **x**_{2}, …, **x**_{ k }). Form the matrix **P** from **M** by renormalizing each of **M**'s rows to unit length, and let \(P = [p_1^{\rm{T}},p_2^{\rm{T}}, \ldots ,p_n^{\rm{T}}]\), where **p**_{ i } is the *i*th row vector of **P**. Then **P** can help obtain the clusterings.

After obtaining the eigenvalues *λ* of **W**, we can calculate the eigengap vector \(\delta = (\delta_1, \delta_2, \ldots, \delta_{n-1})\) with \(\delta_k = \lambda_k - \lambda_{k+1}\). Derived from the matrix perturbation theory mentioned above, we then determine *K* by analyzing the eigengap vector: \(K = \arg\max_k \delta_k\).

Based on this rule, the number of SDN domains can be determined by the associated eigengap values. Given a network topology, we can infer automatically the suitable number of SDN domains by exploiting the structure of the eigenvectors.
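The eigengap rule above is compact enough to sketch directly (using NumPy; applying it to the raw symmetric similarity matrix **W** is an assumption about the exact matrix used):

```python
import numpy as np

def choose_k(W):
    """Pick the number of SDN domains as the k maximizing the eigengap
    lambda_k - lambda_{k+1} of the symmetric similarity matrix W."""
    lam = np.sort(np.linalg.eigvalsh(W))[::-1]   # eigenvalues, descending
    gaps = lam[:-1] - lam[1:]                    # eigengap vector delta
    return int(np.argmax(gaps)) + 1              # 1-based k
```

On a similarity matrix with three clearly separated clusters, the top three eigenvalues are large and the fourth drops sharply, so the rule recovers *K* = 3 without any manual setting.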

### 4.3 Spectral clustering placement algorithm

Now we state our self-adaptive spectral clustering algorithm for the SDN controller placement problem in a WAN. The whole algorithm is outlined below. First, the similarity matrix **W** is constructed by Eq. (6) (line 3). Then we use the eigengap to discover the clustering stability and decide the 'best' partition number *K* automatically (lines 4–6). After obtaining the optimal number of SDN domains (*K*), we calculate the Laplacian matrix **L** and the first *k* eigenvectors of **L** (lines 7–8). Next, we construct a new sub-vector **V** corresponding to the first *k* eigenvectors (lines 9–12). Finally, we use the *k*-means algorithm to cluster the points into each partition and obtain the center of each partition (lines 13–14). The *k*-means algorithm can achieve a good placement metric (Eq. (4)). Our self-adaptive spectral clustering algorithm does not need to pre-specify the number of SDN domains; it obtains the optimal number automatically by calculating the eigengap, as the following experiments show.
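The pipeline can be sketched end-to-end as follows (NumPy). The normalized Laplacian, the deterministic farthest-point seeding for *k*-means, and picking the node with the highest intra-domain similarity as controller are assumptions standing in for the exact lines of the pseudocode.

```python
import numpy as np

def spectral_placement(W, K, n_iter=50):
    """Partition nodes into K SDN domains and pick one controller per domain.

    Steps: normalized Laplacian -> first K eigenvectors -> row-normalize ->
    k-means on the rows -> controller = most central node of each domain.
    """
    d = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = W * np.outer(inv_sqrt, inv_sqrt)          # D^{-1/2} W D^{-1/2}
    lam, X = np.linalg.eigh(L)
    V = X[:, np.argsort(lam)[::-1][:K]]           # first K largest eigenvectors
    P = V / np.maximum(np.linalg.norm(V, axis=1, keepdims=True), 1e-12)

    # Plain k-means on the rows of P, seeded by farthest-point traversal
    centers = [P[0]]
    for _ in range(1, K):
        dists = np.min([np.linalg.norm(P - c, axis=1) for c in centers], axis=0)
        centers.append(P[int(np.argmax(dists))])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((P[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = P[labels == k].mean(axis=0)

    # Controller per domain: node with the highest intra-domain similarity
    controllers = []
    for k in range(K):
        idx = np.where(labels == k)[0]
        sub = W[np.ix_(idx, idx)]
        controllers.append(int(idx[np.argmax(sub.sum(axis=1))]))
    return labels, controllers
```

On a similarity matrix with two well-separated groups of switches, the rows of `P` collapse to two distinct points, so the *k*-means step recovers the two SDN domains and places one controller inside each.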

In our algorithm, computing the eigenvalues of the similarity matrix takes *O*(*n*^{3}) operations, where *n* is the number of nodes in the topology. Computing the first *k* eigenvectors of the Laplacian matrix also takes *O*(*n*^{3}) operations, and the *k*-means step takes *O*(*n*) operations. Thus, the total cost of our algorithm is *O*(*n*^{3}). Such a cost would become impractical if the number of SDN domains *K* had to be found by repeated experiments. Fortunately, our *K* self-adaptive algorithm needs to be run only once. By contrast, a common spectral clustering algorithm must repeat the experiment to obtain the optimized *K*, and each run takes *O*(*n*^{3}) operations. Our algorithm is therefore more efficient.

We implemented a Matlab-based framework to compute the spectral clustering placement results. Fig. 3 shows an SDN domain partition plan based on the spectral clustering algorithm. We can see that the OS3E topology is partitioned into four SDN domains equally when *K* = 4. Among these SDN domains, the controllers will be placed in the nodes that are labeled as stars. As expected, our spectral clustering algorithm meets the requirements of the metrics mentioned in Section 3. From Fig. 3, we can see that the four SDN domains have almost the same size, and that the controller location is close to each clustering center, which meet the balanced partition and average propagation latency metrics.

In the following section, we compare the performance of our placement with other placements mentioned in Heller *et al.* (2012), and design a set of advanced testing scenarios to verify it.

## 5 Experiments

To evaluate the controllers, we ran the Beacon controller, a multi-threaded Java-based controller, as the WAN SDN controller with the recommended settings. We relied on the latest available sources of Beacon version 1.04 (April 2014). We chose Beacon because it performs better than the other controllers (Shah *et al.*, 2013; Shalimov *et al.*, 2013).

We ran cbench instances on multiple nodes of the cluster to emulate the switches. As shown in Fig. 4, each cbench instance in our experiments emulated a single OpenFlow switch, and all of these instances sent OpenFlow packet-in messages to a single controller. The cbench instances were connected to the controller with 100 Mb/s interconnects.

To emulate a real network for WAN propagation latencies, we used most nodes of the cluster for running cbench instances. The number of cbench nodes was varied for different experiments, which depended on the metric being calculated. Each cbench emulated a single OpenFlow switch sending packet-in messages to the controller at uniform rates with different delay times. The delay times were calculated by the propagation latencies between two nodes in the WAN topology mentioned above.

We compared our spectral clustering placement with the average-latency and worst-case-latency placements of Heller *et al.* (2012). The results are shown in Fig. 5.

Fig. 5 shows the three placements for *K* = 1 and *K* = 4. The higher density of nodes in the northeast of the US relative to the west leads to a different optimal set of locations for different metrics. The spectral clustering placement is most similar to the average-latency placement and completely different from the worst-case-latency placement. For example, both spectral clustering and average-latency placement put the single controller in Chicago when *K* = 1, which balances the high density of east coast cities against the low density of cities in the west; the two methods produce the same result by different means. However, to minimize the worst-case latency for *K* = 1, the controller should go in Kansas City instead, which is near the center of the topology. As expected, the spectral clustering placement is also most similar to the average-latency placement when *K* = 4. By using the mini-max clustering principle, spectral clustering placement can combine latency with performance. The worst-case-latency placement, defined by the maximum node-to-controller propagation delay, proves to be the least effective of the three. Thus, we consider only spectral clustering and average-latency placement in subsequent sections.

Although the placement algorithm of our previous work (Xiao *et al.*, 2014) can obtain the SDN domain results, it needs the number of SDN domains *K* to be set manually. We apply the approach of Section 4 to discover the number of SDN domains by analyzing the eigenvectors, which leads to a self-adaptive spectral clustering placement. To evaluate the performance of the approach, we applied it to the OS3E topology. Fig. 6 shows the optimal number of SDN domains *K*, indicated by the point with the highest eigengap. From Fig. 6, we can see that the eigengap is maximized when *K* = 4. This agrees with the experimental results obtained by setting *K* manually, some of which are shown in Table 1. From Table 1, it can be seen that the nodes per controller are best balanced when *K* = 4. Thus, this approach can determine the optimal number of clusters automatically for spectral clustering placement.

**Table 1** The numbers of nodes of the SDN domains with different *K*'s

| *K* | Domain 1 | Domain 2 | Domain 3 | Domain 4 | Domain 5 | Domain 6 | Domain 7 |
|---|---|---|---|---|---|---|---|
| 3 | 14 | 12 | 8 | | | | |
| 4 | 7 | 9 | 10 | 8 | | | |
| 5 | 7 | 8 | 3 | 8 | 8 | | |
| 6 | 7 | 7 | 4 | 7 | 1 | 8 | |
| 7 | 7 | 5 | 2 | 2 | 3 | 7 | 8 |

To test the effectiveness of our solution, we presented a comparative performance analysis of spectral clustering and average-latency placements. We designed a set of advanced testing scenarios and conducted experiments under many different settings and metrics, which allow us to get a deeper insight into the WAN controller performance issues. All experiments were performed with Beacon and cbench. We ran each experiment five times and took the average number as the result.

### 5.1 Latency

An important ability of the OpenFlow controller is to process incoming packet-in messages as fast as possible, which we call latency. To measure the controller latencies of the placements, cbench instances were run in latency mode, in which each instance generated a packet-in message, waited for a response from the controller before sending the next packet-in message, and counted the total number of responses per second. We kept each cbench instance emulating a single switch, and made many cbench instances send packet-in messages to their controllers with different numbers of connected hosts. Depending on the metric being calculated, the number of cbench instances was varied for different experiments.

For the latency experiments, each test consisted of 500 loops with each lasting 100 ms. The first loop and the last loop were considered as controller warm-up and cool-down, respectively, whose results were discarded. Each test used 100 to 100 000 unique media access control (MAC) addresses (representing emulated end hosts). We kept one worker thread and progressively increased the host density.

### 5.2 Throughput

One of the main objectives for a good controller placement is to minimize the latencies between nodes and controllers in SDN. However, considering only latencies is not sufficient. A placement should also satisfy performance and reliability constraints. In this experiment, we evaluated the effect of controllers in the two placements on the throughput performance, which is the ability to handle a large amount of control traffic. All cbench instances were kept in throughput mode, under which cbench continuously sends packet-in messages to Beacon over a period of time. Our focus in this subsection is to study the average throughput of each controller with different numbers of connected hosts.

The number of hosts in the SDN domain had little influence on the performance of most of the controllers under test. Controller 1 of the spectral clustering placement decreased its throughput from 4.3 million to 4.0 million flows per second at 10^6 hosts. In the average-latency placement, however, the performance of controller 3 dropped significantly as more hosts were connected, and its controller 2 had the lowest throughput among all controllers. This is caused by a specific property of the average-latency placement, namely the imbalance of its SDN domain partitioning.
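The imbalance these results point to can be made explicit with a small helper. This is our own illustration, not part of cbench or the placement algorithms; the function name and the max/min ratio as an imbalance indicator are assumptions.

```python
def throughput_summary(samples):
    """samples maps a controller id to a list of flows-per-second
    readings collected in cbench throughput mode.

    Returns the per-controller mean throughput and the max/min ratio
    across controllers; a ratio far above 1 indicates an unbalanced
    SDN domain partitioning of the kind described above.
    """
    means = {cid: sum(v) / len(v) for cid, v in samples.items()}
    return means, max(means.values()) / min(means.values())
```

For a balanced partition such as the spectral clustering placement, the ratio stays close to 1 even as the host count grows.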

From Fig. 10, it can be seen that spectral clustering placement shows better performance than average-latency placement because of its SDN domain partitioning. We also see an unstable trend in throughput with an increasing number of hosts for average-latency placement.

The performance of an SDN controller is defined by two characteristics: latency and throughput (Shalimov *et al.*, 2013). The goal of SDN controller placement is to obtain the minimum latency and the maximum throughput for each controller. Based on this, we find that spectral clustering placement is more effective than the others.

To investigate the influence of *K* on the network’s performance, we tested the average throughput under the WIDE traces (http://mawi.wide.ad.jp/mawi) with different values of *K*. As shown in Fig. 11, the placement performed significantly well and the average throughput changed more gently with a growing number of hosts when *K* = 4, which agrees with the conclusions drawn from Table 1. From Fig. 11, we can also see that the average throughputs of the other placements dropped rapidly because of their imbalanced nodes.

### 5.3 Reliability

Reliability is the ability of the controller to work normally over a long period under an average workload. To evaluate reliability, we measured the number of failures over a long time period under a given heavy workload. In this experiment, we kept a constant number of eight worker threads for each controller, but increased the total number of packet-in messages from the cbench instances running on each node. In our test case, we used 1 000 000 unique MAC addresses per switch for the stress tests, and each switch sent OpenFlow packet-in messages at rates varying from 1000 to 10 000 requests per second. All tests ran for 24 h, and the number of errors was recorded during each test. By error, we mean either a failure to receive a reply from the controller or an input/output (I/O) error from the Beacon buffer.
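The error definition above can be captured in a short tallying helper. This is a sketch of our bookkeeping only, not part of Beacon or cbench; the event representation and names are assumptions.

```python
# The two failure types counted in the reliability tests above:
# a missing controller reply, or an I/O error from the Beacon buffer.
FAILURE_KINDS = {"no_reply", "io_error"}

def count_errors(events):
    """events: iterable of (timestamp, kind) tuples recorded during a
    24 h stress run.  Returns the number of events that match the
    failure definition used in the reliability experiments."""
    return sum(1 for _, kind in events if kind in FAILURE_KINDS)
```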

The experiments showed that most of the controllers successfully coped with the test load, except the third controller of the average-latency placement, which dropped 53 241 567 messages and closed 179 connections. These failures were caused by serving too many nodes, which makes the average-latency placement unstable. We also found that a controller became unstable when it served more than 11 nodes in our tests. Compared with deployment in a LAN, the reliability of a controller deployed in a WAN declined greatly.

To verify the applicability and effectiveness of spectral clustering placement, we expanded our analysis to more topologies in the Internet Topology Zoo (Knight *et al.*, 2011). The Internet Topology Zoo covers a diverse range of geographic areas, network sizes, and topologies. The graphs in the Zoo do not conform to any single model, and can be used to verify the applicability of our approach. In most cases, we can easily obtain the balanced cut by using spectral clustering placement. We also find that the correct number of clusters is important for spectral clustering placement. When the network has more than 100 nodes, prior knowledge of the number of clusters is required.
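The step of determining the number of clusters automatically can be sketched with the standard eigengap heuristic over the normalized graph Laplacian. This is a minimal numpy illustration, not the full algorithm in this paper (which also relies on matrix perturbation analysis); the function name and the `k_max` cap are our own assumptions.

```python
import numpy as np

def choose_k(adjacency, k_max=10):
    """Pick the number of SDN domains K via the largest eigengap of the
    symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.

    The K smallest eigenvalues of L lie near zero when the graph has K
    well-separated clusters, so the largest gap in the sorted spectrum
    marks the number of clusters."""
    A = np.asarray(adjacency, dtype=float)
    d = A.sum(axis=1)                                  # node degrees
    with np.errstate(divide="ignore"):
        d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(L))           # ascending spectrum
    gaps = np.diff(eigvals[: k_max + 1])               # gaps between neighbors
    return int(np.argmax(gaps)) + 1                    # K = position of largest gap
```

On a graph made of two disconnected triangles, for instance, the spectrum is {0, 0, 1.5, 1.5, 1.5, 1.5} and the largest gap sits after the second eigenvalue, giving K = 2.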

## 6 Conclusions

In this paper, we have proposed a *K* self-adaptive SDN controller placement for WAN. Our approach is based on partitioning a large network into several small SDN domains by using the spectral clustering placement algorithm. To maximize the reliability of the controller and to minimize the latency of WAN, we have presented the metrics for spectral clustering placement. We have suggested exploiting the structure of the eigenvectors to determine automatically the number of SDN domains. As a result, a self-adaptive spectral clustering algorithm based on the matrix perturbation theory has been proposed. After presenting a test framework with Beacon and cbench, the ideas and mechanisms were illustrated by using the Internet2 OS3E topology. We conducted experiments under many different settings and metrics. Experimental results showed that self-adaptive placement is good at solving the SDN controller placement problem and determining the number of SDN domains automatically.

However, understanding the overall SDN controller placement remains an open research problem. The placement is likely a complex function of the topology, the metric, and the value of *K*. The approach presented in this paper is just a first step towards SDN domain partitioning. In future work, we expect to expand our analysis to other network latencies.

## References

- Bach, F.R., Jordan, M.I., 2003. Learning spectral clustering. Technical Report No. UCB/CSD-03-1249, University of California at Berkeley, USA.
- Cai, Z., Cox, A.L., Ng, T.S.E., 2010. Maestro: a system for scalable OpenFlow control. Technical Report TR10-08, Rice University, USA.
- Dixit, A., Hao, F., Mukherjee, S., *et al.*, 2013. Towards an elastic distributed SDN controller. *ACM SIGCOMM Comput. Commun. Rev.*, **43**(4):7–12. http://dx.doi.org/10.1145/2491185.2491193
- Erickson, D., 2013. The Beacon OpenFlow controller. *Proc. 2nd ACM SIGCOMM Workshop on Hot Topics in Software Defined Networking*, p.13–18. http://dx.doi.org/10.1145/2491185.2491189
- Gude, N., Koponen, T., Pettit, J., *et al.*, 2008. NOX: towards an operating system for networks. *ACM SIGCOMM Comput. Commun. Rev.*, **38**(3):105–110. http://dx.doi.org/10.1145/1384609.1384625
- Heller, B., Sherwood, R., McKeown, N., 2012. The controller placement problem. *Proc. 1st Workshop on Hot Topics in Software Defined Networks*, p.7–12. http://dx.doi.org/10.1145/2342441.2342444
- Hock, D., Hartmann, M., Gebert, S., *et al.*, 2013. Pareto-optimal resilient controller placement in SDN-based core networks. *Proc. 25th Int. Teletraffic Congress*, p.1–9. http://dx.doi.org/10.1109/ITC.2013.6662939
- Kirkpatrick, K., 2013. Software-defined networking. *Commun. ACM*, **56**(9):16–19. http://dx.doi.org/10.1145/2500468.2500473
- Knight, S., Nguyen, H.X., Falkner, N., *et al.*, 2011. The Internet Topology Zoo. *IEEE J. Sel. Areas Commun.*, **29**(9):1765–1775. http://dx.doi.org/10.1109/JSAC.2011.111002
- Koponen, T., Casado, M., Gude, N., *et al.*, 2010. Onix: a distributed control platform for large-scale production networks. *Proc. OSDI*, p.1–14.
- Kreutz, D., Ramos, F.M.V., Veríssimo, P.E., *et al.*, 2015. Software-defined networking: a comprehensive survey. *Proc. IEEE*, **103**(1):14–76. http://dx.doi.org/10.1109/JPROC.2014.2371999
- Lin, P., Bi, J., Wang, Y., 2013. East-west bridge for SDN network peering. *Proc. 2nd CCF Int. Conf. of China*, p.170–181. http://dx.doi.org/10.1007/978-3-642-53959-6_16
- Liu, N., Lu, Y., Tang, X.J., *et al.*, 2014. Study on automatically determining the optimal number of clusters present in spectral co-clustering documents and words. *J. Chin. Comput. Syst.*, **35**(3):610–614 (in Chinese).
- Mall, R., Langone, R., Suykens, J.A.K., 2013. Self-tuned kernel spectral clustering for large scale networks. *Proc. IEEE Int. Conf. on Big Data*, p.385–393. http://dx.doi.org/10.1109/BigData.2013.6691599
- McKeown, N., Anderson, T., Balakrishnan, H., *et al.*, 2008. OpenFlow: enabling innovation in campus networks. *ACM SIGCOMM Comput. Commun. Rev.*, **38**(2):69–74. http://dx.doi.org/10.1145/1355734.1355746
- Ng, A.Y., Jordan, M.I., Weiss, Y., 2001. On spectral clustering: analysis and an algorithm. In: Dietterich, T.G., Becker, S., Ghahramani, Z. (Eds.), Advances in Neural Information Processing Systems 14, p.849–856.
- Phemius, K., Bouet, M., Leguay, J., 2014. DISCO: distributed multi-domain SDN controllers. *Proc. IEEE Network Operations and Management Symp.*, p.1–4. http://dx.doi.org/10.1109/NOMS.2014.6838330
- Rebagliati, N., Verri, A., 2011. Spectral clustering with more than K eigenvectors. *Neurocomputing*, **74**(9):1391–1401. http://dx.doi.org/10.1016/j.neucom.2010.12.008
- Shah, S.A., Faiz, J., Farooq, M., *et al.*, 2013. An architectural evaluation of SDN controllers. *Proc. IEEE Int. Conf. on Communications*, p.3504–3508. http://dx.doi.org/10.1109/ICC.2013.6655093
- Shalimov, A., Zuikov, D., Zimarina, D., *et al.*, 2013. Advanced study of SDN/OpenFlow controllers. *Proc. 9th Central & Eastern European Software Engineering Conf. in Russia*, Article 1. http://dx.doi.org/10.1145/2556610.2556621
- Shi, J., Malik, J., 2000. Normalized cuts and image segmentation. *IEEE Trans. Patt. Anal. Mach. Intell.*, **22**(8):888–905. http://dx.doi.org/10.1109/34.868688
- Tam, A.S.W., Xi, K., Chao, H.J., 2011. Use of devolved controllers in data center networks. *Proc. IEEE Conf. on Computer Communications Workshops*, p.596–601. http://dx.doi.org/10.1109/INFCOMW.2011.5928883
- Tian, Z., Li, X., Ju, Y., 2007. Spectral clustering based on matrix perturbation theory. *Sci. China Ser. F*, **50**(1):63–81. http://dx.doi.org/10.1007/s11432-007-0007-8
- Tootoonchian, A., Ganjali, Y., 2010. HyperFlow: a distributed control plane for OpenFlow. *Proc. Int. Network Management Conf. on Research on Enterprise Networking*, p.1–6.
- Tootoonchian, A., Gorbunov, S., Ganjali, Y., *et al.*, 2012. On controller performance in software-defined networks. *Proc. 2nd USENIX Conf. on Hot Topics in Management of Internet, Cloud, and Enterprise Networks and Services*, p.1–6.
- von Luxburg, U., 2007. A tutorial on spectral clustering. *Stat. Comput.*, **17**(4):395–416. http://dx.doi.org/10.1007/s11222-007-9033-z
- Wang, L., Bo, L.F., Jiao, L.C., 2007. Density-sensitive spectral clustering. *Acta Electron. Sin.*, **35**(8):1577–1581 (in Chinese).
- Wauthier, F.L., Jojic, N., Jordan, M.I., 2012. Active spectral clustering via iterative uncertainty reduction. *Proc. 18th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining*, p.1339–1347. http://dx.doi.org/10.1145/2339530.2339737
- Xiao, P., Qu, W., Li, Z., 2014. The SDN controller placement problem for WAN. *Proc. IEEE/CIC Int. Conf. on Communications in China*, p.220–224. http://dx.doi.org/10.1109/ICCChina.2014.7008275
- Xie, H., Tsou, T., Lopez, D., *et al.*, 2012. Software-defined networking efforts debuted at IETF 84. Available from http://www.internetsociety.org/articles/softwaredefined-networking-efforts-debuted-ietf-84
- Yin, H., Xie, H., Tsou, T., *et al.*, 2012. SDNi: a message exchange protocol for software defined networks (SDNS) across multiple domains. Available from https://tools.ietf.org/html/draft-yin-sdn-sdni-00
- Yu, M., Rexford, J., Freedman, M.J., *et al.*, 2010. Scalable flow-based networking with DIFANE. *ACM SIGCOMM Comput. Commun. Rev.*, **40**(4):351–362. http://dx.doi.org/10.1145/1851275.1851224
- Zelnik-Manor, L., Perona, P., 2004. Self-tuning spectral clustering. In: Saul, L.K., Weiss, Y., Bottou, L. (Eds.), Advances in Neural Information Processing Systems 17, p.1601–1608.