1 Introduction

Recently, the Industry 4.0 has emerged as a new paradigm able to radically transform the organizations’ production and business in a myriad of sectors beside the smart manufacturing one [1, 2]. The core of the paradigm that justifies also its generality and viability on different markets is the concept of Cyber-physical Systems (CBSs), i.e. the strict integration between physical elements and computational data enabled by the recent advances on the Internet of Things (IoT) [1]. Hence, the ability to collect, aggregate and analyze sensor data is crucial for the growth of the Industry 4.0 model. At the same time, today’s IoT is a chaotic environment characterized by heterogeneous hardware devices, network protocol stacks and data formats. The current fragmentation can significantly increase the deployment costs, since collected data can remain largely inaccessible in an integrated way unless investing significant manual effort [2]. At the same time, interoperability can represent an opportunity for next-generation IoT applications: the McKinsey report in [3] quantifies in 40% the additional market value that might be provided by achieving full interoperabilìty among IoT ecosystems.

Among the several approaches proposed so far in order to address IoT interoperability problems, the Web of Things (WoT) has gained considerable attention, thanks to the popularity and well-known unifying nature of the Web [4, 5]. Differently from other stack-oriented solutions (e.g. 6LoWPAN), WoT-based approaches propose to achieve system interoperability at the application layer, abstracting from the sensing and communication technologies: in a first approximation, Things are represented as Web resources, and all the interactions toward and between Things are mapped over Representational State Transfer (REST) services [5]. At the same time, given the lack of a reference architecture, several different WoT frameworks have been proposed in the literature (e.g. [6,7,8,9]), introducing further fragmentation and the consequential need of devising ad-hoc solutions for the system integration. Breaking the deadlock, the World Wide Web Consortium (W3C) has recently proposed a reference architecture of the WoT [10] that formally describes the interfaces allowing IoT devices and services to communicate with each other, regardless of their underlying implementation. In the W3C WoT vision, everything can be considered a Thing and to this purpose, each Thing is associated to a Thing Descriptor (TD) providing general metadata as well as the interactions, data model, and security mechanisms of a Thing [10]. In addition, a TD can be serialized and semantically annotated via the JSON-LD language, hence representing a uniform model to enable Machine-to-Machine (M2M) communication toward a Thing and enabling several semantic features, like for instance the Thing Discovery (TD). The generality of the WoT architecture makes it suitable for all those scenarios characterized by the need of aggregating data from multiple, heterogeneous sources, like the Industry 4.0. However, due also to its recent appearance, few implementations and test-bed of the W3C WoT have been described so far in the literature [11, 12].

In this paper, we attempt to fill such gap, by describing the design and implementation of a WoT testbed, consisting of a monitoring system of a generic production site that must retrieve and process sensor data from heterogeneous devices using different wireless access technologies (i.e. Wi-Fi, 802.15.4/Zigbee, BLE). The overall goal is to devise mash-up applications able to orchestrate the sensing operations over the target scenario regardless of the network protocols and hardware, hence decoupling the rationale of the monitoring process (e.g. minimal scenario coverage) from its implementation (i.e. the technology used to query the sensor). More specifically, we introduce three main contributions in this study:

  • First, we describe how the scenario can be modeled within the WoT W3C framework. We associate one Thing to each sensing device, and one Thing to the sensor network, by defining the metadata of each. Moreover, we discuss how the components of the WoT W3C architecture have been made concrete in our application.

  • Second, we describe the design and implementation of mash-up applications aimed to orchestrate the sensing operations on the target scenario. We considered four different sensing policies, aimed to balance the coverage of the scenario with the network performance (e.g. delay, packet delivery ratio and energy). All the policies are in charge of dynamically selecting the sensors to query at each instant in order to maximize the policy-specific metric: to this purpose, given the dinamicity of the environment, we employ the Reinforcement Learning (RL) framework [17] to optimally balance the exploration-exploitation tasks.

  • Third, we report a subset of the experimental results from the WoT testbed. We investigate the performance of the sensing mash-up applications with respect to the policy goal (e.g. delay), and the convergence over time. Moreover, we show the benefit introduced by the WoT architecture in terms of adaptive design, i.e. the possibility to dynamically switch the sensing policies over time without re-configuring the communication infrastructure, and the overhead introduced by the WoT components.

The rest of the paper is structured as follows. Section 2 reviews the WoT W3C architecture, and its recent applications. Section 3 introduces the test-bed, and the modeling of the network components within the WoT W3C architecture. Section 4 introduces the mash-up policies. Section 5 presents a subset of the experimental results. Conclusions and future works follow in Sect. 6.

2 Related Works

Since 2007, when the concept of WoT appeared in the literature, several research studies have explored how to interconnect IoT devices through standard Web technologies. This has also lead to a proliferation of WoT frameworks and architectures, which are quite different in terms of WoT-based interaction patterns supported and functional goals addressed. For instance, the authors of [6] review more than twenty WoT frameworks on the basis of twelve elements which are taken as key components of the WoT. Although the REST paradigm is considered the reference solution to implement WoT-oriented services (e.g. [7]), alternatives to the HTTP protocol have been considered: for instance, CoAP-based architectures are proposed, among others, in [8] and [9]. A generic model supporting interoperability and mash-up operations from different hubs is proposed in [4]: here, the authors warn about the proliferation of WoT tools, and advocate for the need of standard solutions.

Fig. 1.
figure 1

The abstract architecture of W3C WoT (source: [10]). The image shows the internal blocks and the multiple interactions patterns among the Things.

To this purpose, the W3C WoT group started its activities on 2015 with the goal of defining a reference WoT set of standards, enabling interoperability among different IoT systems. In this paper, we refer to the W3C WoT draft presented in [10]. In brief, the W3C WoT architecture is composed of four main blocks:

  • Thing: this is an entity that can be semantically represented. Using the W3C words: “A Thing is the abstraction of a physical or virtual entity... This entity can be a device, a logical component of a device, a local hardware component, or even a logical entity such as a location” [10].

  • WoT Thing Description (TD): this represents the metadata of the Thing, including its interactions, data models, communication protocols and security mechanisms. By default, the TD is serialized with the JSON-LD language and following the Properties, Actions, and Events paradigm.

  • WoT Binding Templates: this is the metadata describing the communication strategies that the Thing is able to implement. For instance, a possible strategy could be the following: Machine-to-Machine (M2M) over the MQTT protocol, with TLS security mechanism enabled.

  • WoT Scripting API: this is a WoT interface allowing scripts to perform main operations on a Thing, like adding properties, reading properties, or retrieving its TD.

All the blocks above are implemented within a software runtime named Servient, which can indifferently act as a Server or as a Client. In the first case, the Servient is said to host and expose Things, i.e. it takes the TD as input and creates a software object serving the requests like accessing the exposed properties, actions and events. In the second case, the Servient is said to consume Things, i.e. it creates a runtime resource model that allows accessing the properties, actions and events exposed by the server Thing on a remote device.

Figure 1 depicts the abstract W3C architecture for the WoT, including the blocks within each Thing and their possible interactions. In particular, the W3C working group identified a short list of interaction patterns that are general enough to cover most of the existing IoT deployments, regardless of the application domain. The simplest one is the Client-Thing interaction, i.e. a Web application that invokes actions on a remote Thing, after having consumed it.

Due to the recent appearance of the W3C WoT standard, few real-world applications and testbed have been proposed so far in the literature. A demo showing the possibility to query a W3C WoT sensor device from a mobile phone is sketched in [11]. In [12], an interesting application of the W3C WoT architecture to the automotive industry is described; more specifically, the authors illustrate how to describe the car signals data with a semantic ontology, and how to make them available to external applications through the W3C WoT interaction patterns. Security risks and vulnerabilities presented by WoT metadata are discussed in [13]. Versioning mechanisms for the TD metadata are proposed in [14]. Finally, in [15], we proposed the WoT Store, a W3C WoT-compliant framework that enables the semantic Thing discovery and the seamless distribution and execution of WoT applications. The WoT Store constitutes the natural execution environment for the mash-up applications considered in this study: we plan to explore such feature in a future work.

Fig. 2.
figure 2

The IoT/WoT monitoring system deployed in this study.

3 The W3C WoT Testbed: Architecture and Components

The goal of this study is to investigate the suitability - both in terms of ease of deployment and of performance- of the W3C WoT architecture for Industry 4.0 applications. To this purpose, we consider a generic IoT monitoring system of a production site, characterized by the presence of heterogeneous sensors using different communication technologies. The overall architecture of the testbed, depicted in Fig. 2, is structured on three tiers:

  • Edge layer. This layer is composed of three Wireless Sensor Networks (WSNs), operating over the same environment: an IEEE 802.15.4 WSN network, a IEEE 802.11 Wi-Fi WSN network and a BLE device. The 802.15.4 network includes four devices (Arduino Xbee boards), with one Coordinator and three Leaf nodes equipped with sensing units (ThinkerKit temperature sensor). The Wi-Fi network includes three devices (two NodeMCU and one Arduino WiFly board), all provided with a direct link toward the Access Point (AP) and with a DHT11 temperature/humidity sensor. Finally, the BLE WSN consists of one ESP32 board, provided with a DHT11 sensor.

  • Fog layer. The 802.15.4 coordinator, the BLE and the Wi-Fi devices are connected to the corresponding Fog node, via USB cable links (for the BLE and the 802.15.4 Coordinator) or Wi-Fi links (for the IEEE 802.11 devices). Each fog node is constituted by a Raspberry PI3B+ board and it is in charge of exposing the corresponding Web avatar (i.e. the Web Thing) for each managed device and WSN.

  • Processing layer. This layer implements the logic of the monitoring system. It is constituted by a Linux server running the mash-up applications further defined in Sect. 4, and connected to the Fog nodes via Wi-Fi links. More specifically, the layer is in charge of: (i) orchestrating the sensing operations, by properly selecting the devices to query at each time slot according to the policies of Sect. 4; (ii) storing the collected data within a time-series database; (iii) processing and analyzing the data in order to implement the Digital Twin model of the monitored site.

In this study, for space reasons, we omit the data analytics process, and also the creation of the Digital Twin model, leaving it to future works. Instead, we detail the data retrieval operations, and specifically the way we implemented the WoT W3C components of the architecture reported in Fig. 1, i.e.:

  • Edge devices implement low-level communication and sensing operations in the embedded firmware. The implementation as well as the list of operations and the data format used by each device is technology dependent. This layer is part of the IoT, while it is not covered by the WoT architecture.

  • Fog nodes run a W3C WoT Servient, by using the JavaScript (JS) framework available at [16]. Each Fog node exposes two types of Web Things, i.e.: multiple (i) Thing Devices, describing the properties, events and actions of physically managed edge devices, and one (ii) Thing Network, describing the overall performance of the virtual WSN composed by the list of connected Thing Devices. Moreover, we consider three possible protocol bindings for each Thing, i.e. interaction modes with the Things, based on the HTTP (default choice), the CoAP or the MQTT protocols. The System APIs are implemented in Javascript, and further structured into two layers, i.e.: (i) a Device Query level, that is in charge of issuing request-response communication with the Edge device, based on the wireless technology and the protocol stack supported by this latter (e.g. UDP socket for the WiFi devices, Serial socket for the Zigbee Coordinator, BLE connected mode for the BLE device), (ii) an Inter-Process Communication (IPC) level, that makes the sensor data available to the upper Scripting APIs via IPC facilities (in our case, implemented in the ZeroMQ libraryFootnote 1).

  • Finally, the Processing node interacts with each Fog node/Servient in order to consume Things, e.g. by periodically invoking the getData action from the Things selected according to the actual mash-up policy.

Table 1. Example of Properties, Actions, and Events described in a Thing Description of a Device Thing.

Table 1 shows some of the properties, actions, and events described in the Thing Description (TD) for a Device Thing. The TD of a Network Thing includes only properties that are referred to the average network performance (i.e. the delay, the packet delivery ratio and the throughput) and actions that can be invoked from the entire network, like for instance the getAllData(). Similarly, the snippet below shows a code fragment of the mash-up application, specifically the way we query a sensor device in order to read its temperature value. We can notice that - through the WoT architecture- the mash-up application is agnostic on the wireless access technology in use, and retrieves data from heterogeneous sensors by means of a common API regardless of the WSN implementation. The rationale of the sensing applications is presented in the Listing 1.1.

figure a

4 The W3C WoT Testbed: The Mash-up Sensing Policies

We implemented multiple mash-up sensing policies, and we tested the capability of switching among them in a seamless way in Sect. 5. To this purpose, let D be the set of available devices, and \(W(d_i)\), \(\forall d_i \in D\), be the function describing the WSN type. In our testbed, \(W: D \rightarrow \{WiFi, BLE, Zigbee\}\). We assume the time to be divided into discrete time-slot, i.e. \(T=\{t_0, t_1, ....\}\), corresponding to sensing events when the mash-up application is issuing getData command toward a selected subset of the available devices. Let \(t_{interval}\) be the temporal interval between two measurements, i.e. the time difference between \(t_{i+1}\) and \(t_{i}\), assumed constant. Moreover, let \(\kappa : D \times T \rightarrow \{0,1\}\) the function indicating whether device \(d_i\) is active, i.e. it is used at time slot \(t_j\) (in this case, \(\kappa (d_i,t_j)=1\), otherwise \(\kappa (d_i,t_j)=0\)). All sensing policies share a common rationale, i.e.: they keep the area covered higher than a predefined threshold, while maximizing a performance index I. In our case, the area coverage is expressed in terms of number of active devices (M) at each time-slot. More formally, all policies address the optimization problem formally defined below:

$$\begin{aligned} \begin{aligned} \text {Goal}:\,&\text {Maximize}\,\,I \\ \text {Constraint}:&\sum _{d_i \in D}{\kappa (d_i,t_j)=M, \forall t_j \in T} \end{aligned} \end{aligned}$$
(1)

The performance I can vary according to sensing policy in use. We implemented and tested four different metrics:

  • Static Energy-aware policy (\(P_0\)). The mash-up application selects the M active devices at each time-slot according to a pure round-robin scheme, in order to discharge them with the same rate.

  • Dynamic Delay-aware policy (\(P_1\)). The mash-up application takes into account the average delay required to issue a getData command and to receive the corresponding reply message. The M devices with the lowest Round Trip Time (RTT) are selected at each time slot.

  • Dynamic PDR-aware policy (\(P_2\)). The mash-up application takes into account the communication reliability of each sensor expressed in terms of average Packet Delivery Ratio (PDR), i.e. the ratio of received replies over the total number of getData requests sent toward each \(d_i\). Specifically, the M devices with the highest PDR values are selected at each time slot.

  • Dynamic Delay-PDR-aware policy (\(P_3\)). The mash-up application takes into account both the delay and the PDR, as better explained in the following.

Excluding \(P_0\), all the other policies compute the M sensors to query at each time-slot based on the current traffic loads and network conditions. For this reason, we employ a dynamic, learning-based scheme based on the Reinforcement Learning (RL) frameworkFootnote 2. In brief, this latter refers to a class of machine learning algorithms where an agent learns over time the optimal sequence of actions needed to perform a task, by dynamically interacting with the environment and by receiving a numeric reward at each interaction. More formally, the RL framework can be represented as a Markov Discrete Process (MDP) \(<S,A,R,TR>\) where: S is the set of States, A is the set of Actions, \(R:\{S,A\} \rightarrow \mathrm{I\!R}\) is the Reward function, expressing a numeric reward received by the agent when executing action \(a_j\in A\) in state \(s_i\in S\), and \(TR:\{S,A\} \rightarrow S\) is the transition function, expressing the next state \(s_j\) after performing action \(a_j\) from state \(s_i\) (a deterministic environment is assumed). The goal of the RL agent is hence to determine the optimal policy function \(\tau : S \rightarrow A\) that indicates the optimal action to execute at each state, so that the long-term reward is maximized. In our modeling, we omit the state function S, while the list of action A coincides with the list of devices D. The immediate reward \(R(d_i)\) is computed when issuing a getData command on sensor \(d_i\), according to the policy in use:

  • \(P_1\): this is the RTT for each getData command. Only successful requests (i.e. reply messages are received) are considered.

  • \(P_2\): this is a positive value (+1) if the getData is successful, 0 otherwise.

  • \(P_3\): similarly to \(P_1\), however a penalty equal to \(t_{timeout}\) is applied in case no reply is sent back after a timeout.

Each time a getData is issued on \(d_i\), and the immediate reward \(R(d_i)\) is computed, we also update the Q-value entry at time slot t for \(d_i\) as follows:

$$\begin{aligned} Q_t(d_i)=Q_{t-1}(d_i) + \alpha \cdot (R(d_i)-Q_{t-1}(d_i)) \end{aligned}$$
(2)

where \(\alpha \) is a learning rate, set equal to 0.7 in our experiments. Balancing the exploration and exploitation issue is a crucial issue in dynamic environments [17]. For this reason, we consider an \(\epsilon \)-greedy exploration scheme, i.e.: each time a getData is executed, the policy selects with probability \(1-\epsilon \) the sensor with the k-th highest Q-value, and it performs a random selection over D otherwise (avoiding duplicates). We repeat the \(\epsilon \)-greedy selection M times at each time slot, since all policies need to guarantee an M-coverage of the scenario (in other words, the k above varies between 0 and \(M-1\)). The \(\epsilon \) parameter is progressively discounted at each time slot, i.e. \(\epsilon _t=\epsilon _{t-1} \cdot \psi \), with \(0< \psi < 1\), in order to reduce the exploration over time. At the same time, the \(\epsilon \) parameter cannot decrease below a minimal threshold (\(\epsilon _{min}\)), i.e. a default exploration rate is kept anyway in order to detect any possible change in the scenario, and to adapt the system policy accordingly. We set \(\epsilon \) = 0.8, \(\psi \) = 0.97, \(\epsilon _{min}\) = 0.1 in our testbed.

Fig. 3.
figure 3

The average per-device RTT and PDR is shown in Fig. 3(a) and (b), respectively. The per-device RTT for the CoAP protocol is shown in Fig. 3(c).

5 The W3C WoT Testbed: Experimental Results

In this Section, we report a subset of experimental results collected through the WoT testbed described above. The experimental analysis is divided in three stages: (i) first, we characterize the overall performance of different WSNs and sensors; (ii) second, we evaluate the four different mash-up policies of Sect. 4; (ii) finally, we demonstrate the possibility of dynamic mash-up policy replacement and quantify the overhead introduced by the W3C WoT architecture. Fig. 3(a), (b) and (c) refer to the first analysis. Specifically, Fig. 3(a) and (b) show respectively the average RTT and PDR for each device and WSN type, when the HTTP protocol is used to interact with each Web Thing. It is easy to notice that the Wi-Fi devices are producing the lowest RTT values. The PDR original results demonstrated that the Wi-Fi WSN is also the most reliable technology. However, in order to differentiate the mash-up policies, we introduced a probabilistic packet filter on the Wi-Fi Serviant, discarding the sensor data messages with a loss rate equal to 70% to emulate a congested access point. As a result, comparing Fig. 3(a) and (b), we can notice that the sets of M=3 nodes maximizing the RTT or the PDR depends on the selected performance index. Finally, Fig. 3(c) shows the per-device RTT when the CoAP protocol is used for data gathering. Only minimal differences can be noticed compared to the HTTP case (Fig. 3(a)).

In Figs. 4(a) and 5(a), we evaluate the performance of different mash-up policies. Figure 4(a) shows the RTT values of \(P_0, P_1, P_2, P_3\) algorithms over time-slots; as expected, \(P_1\) produces the lowest delay since it takes into account the per-packet RTT as immediate reward. Also, we can appreciate the learning phases of \(P_1\): the RTT is high during the exploration phase and it is progressively reduced when increasing the amount of exploitation. After time-slot 1000, the RL algorithm has discovered the optimal set of sensors, however it keeps performing random actions for continuous, minimal exploration. This justifies the jagged shape of the plot. In Fig. 4(b) we depict the per-device ratio of utilization over time for the policy \(P_1\). While during exploration all the devices are equally used, after time-slot 1000 the mash-up policy is mostly exploiting the three Wi-Fi devices since -in accordance with Fig. 3(a)- they are associated to the lowest RTT values. Figure 4(c) compares the policies in terms of PDR. Here, the optimal policy is \(P_2\); form Fig. 5(a) we can notice that, after the exploration phase, the three Zigbee devices are maximally used, hence conversely to Fig. 4(b) but again in accordance with Fig. 3(b). We tested the dynamic policy replacement in Fig. 5(b); i.e. from time-slot 1 to 3000, policy \(P_1\) is used (delay minimization), then \(P_2\) from 3001 to 6000 (PDR maximization), finally we switch to \(P_3\) (delay-PDR trade-off) from instant 6001. We remark that the policy replacement is simply implemented as the shut-down of a JS process and the execution of a new one, thanks to the abstraction provided by the W3C WoT architecture; no hardware or software re-configuration of the WSNs is required. Finally, we evaluate in Fig. 5(c) the overhead introduced by the W3C WoT deployment, and specifically by the WoT servient: to this aim, we compute the RTT required to perform a sensor request directly at the System API level. We can notice that most of the overhead is due to the channel access and the processing at the firmware level, while the overhead introduced by the Servient and by the additional communication with the Web Thing is negligible.

Fig. 4.
figure 4

The RTT an PDR values for the four mash-up policies are shown in Fig. 4(a) and (c). The device utilization ratio for the \(P_1\) policy is shown in Fig. 4(b).

Fig. 5.
figure 5

The device utilization ratio for the \(P_2\) policy is shown in Fig. 5(a). The RTT and PDR values when replacing the active policy at runtime in shown in Fig. 5(b). The RTT when enabling/disabling the WoT approach is shown in Fig. 5(c).

6 Conclusions and Future Works

In this paper, we have described the deployment and evaluation of a W3C-based WoT system for generic site monitoring applications. In order to stress the interoperability issue, we have realized a testbed composed of three heterogeneous WSNs and mash-up applications orchestrating the sensing operations over the monitored area. We have discussed how the different WoT components have been instantiated on the target scenario, and we have demonstrated the capability of decoupling the sensing policy from the low-level networking operations, thanks to the Thing meta-data description. Future works include: the digital twin model implementation, the evaluation of additional mash-up policies, the integration with the WoT Store [15] framework.