1 Introduction

The term Internet of Things (IoT) was first used years ago by industry researchers, but has only recently emerged into the mainstream public eye [1]. The concept describes the capacity of network-connected equipment and devices to sense and collect different types of data and then share that information across the Internet, so that it can be processed and used in several applications. The term industrial Internet is commonly used interchangeably with IoT [2], which is not necessarily correct: it refers primarily to commercial applications of IoT technology in the manufacturing field, whereas the IoT covers a much wider range of applications and is therefore not limited to industrial ones. It is believed that in the coming decades the IoT will have a major impact on how society evolves. The IoT is growing fast and gaining attention from a wide range of industries, and projections show that it will be one of the most important areas of future technology [1].

The data collected from a wide variety of IoT devices, such as surveillance video cameras, sensors, vehicles, home appliances and medical devices, will promote the appearance of numerous applications and new services for citizens, companies and public administrations. Indeed, the IoT concept can be used in a large number of disparate domains, such as industrial and home automation, civil protection, elderly assistance, e-health, smart grids, smart and precision agriculture, and traffic management.

One of the fields where the IoT concept can be used to respond to the challenges launched by many governments worldwide is the intelligent management of cities, i.e., the implementation of the Smart City concept [3]. This concept does not yet have a single universal definition and, in some cases, the term is used inappropriately. The concept behind the Smart City designation is increasingly popular, though in many cases it is referred to by different names and in different circumstances, since a range of conceptual variants has been generated by replacing “Smart” with alternative adjectives. Nevertheless, the word “Smart” nowadays captures the innovative and transformative changes driven by new technologies, without forgetting the social factors, which are also important [4]. Thus, in an attempt to define it, a Smart City can be said to be a large organic system that has a close relationship with all its core subsystems, none of which operates in isolation [4]. Put more simply, it is a city that uses advanced technologies to face the main problems of urban life, such as traffic, parking, lighting, pollution, garbage collection, city crowding, poverty, and surveillance and maintenance of public areas, among others [5, 6].

It is difficult to find a definition of closed-circuit television (CCTV) in the technical and scientific literature, as many writers assume in their texts that everybody knows what they are referring to [7]. Nonetheless, in a simplified way, a CCTV system can be considered as a closed system that gathers video images in a single place. It differs from video broadcast systems because the signal is not openly transmitted. Thus, it can be used to maintain a close observation of a person or group. Indeed, one of the main usage domains of CCTV systems is in the surveillance of areas that may need monitoring, such as banks, stores and other areas where security is needed. These systems are used to monitor behaviors, activities, or other information concerning the specific location, in order to possibly prevent disasters and to protect goods and people. Despite the advantages listed above, these systems require a large amount of storage space to keep the recorded videos, as well as people watching them at all times.

Security is a growing concern of modern societies. As time goes by, cities are investing in intelligent surveillance systems, which contribute to the reinforcement and deployment of the Smart City concept. With smarter cities come smarter responsibilities, video surveillance being one of them [8]. Nowadays, with the steady increase of the population and of people moving around the world, it is not physically possible for law enforcement agents to watch every event and follow suspects for long periods of time. Consequently, agencies have begun relying on surveillance systems and technologies to help them monitor people’s activity and keep places safer.

This paper presents a low-cost smart surveillance platform designed to create a ubiquitous environment and to adapt to the client’s needs, providing them with the best experience possible. The architecture was designed to keep costs as low as possible and to satisfy the different needs of each user, by allowing them to choose which type of surveillance cameras to employ and where to place them. The solution developed is suitable for citizens, companies and public organizations (e.g., a city council). The service provided by the platform uses intelligent data recording: it registers only the moments where motion is detected, which reduces the storage space needed and speeds up the search for occurrences. In addition, it is able to notify the client during an occurrence.

The proposed integrated platform is a low-cost, scalable and customizable surveillance system intended to provide a secure environment and improve people’s perception of security. The rest of the paper is organized as follows. Section 2 presents some studies related to surveillance systems. The general architecture of the proposed solution is described in Sect. 3. In Sect. 4, the implementation of a functional surveillance system is presented. Section 5 describes the assessment tests of the solution. Finally, in Sect. 6, the conclusions are drawn and some ideas for future work are presented.

2 Background

The field of Smart Cities is growing fast and becoming increasingly popular as a subject, particularly with regard to security. For this reason, several important studies on security in Smart Cities are available in the technical and scientific literature.

The work presented by Duarte Duque et al. [9] states that with assistance from state of the art algorithms to segment, track and classify moving objects it is possible to turn the video surveillance system into an observer. These video surveillance systems are able to detect and predict abnormal behaviors using real-time unsupervised learning. The large deployment of these surveillance cameras could easily create an intelligent surveillance system, thus avoiding the need to have people analyzing surveillance videos, even when there are no occurrences.

The intelligent video surveillance system presented in [10], which is based on an image subtraction method that allows the object to be identified, demonstrates that it is possible to detect motion in a live video feed.

In [11], a way of tracking moving objects is presented, as well as how to create a filter that selects only the objects of interest. This paper demonstrates that surveillance systems with motion detection could also be used to search for something specific.

The work presented in [12] states that with a low-cost, low-power microcomputer (Raspberry Pi) and a low-cost camera it is possible to create a surveillance device capable of streaming the captured video in real-time to any browser. This paper further describes that it is still possible to reduce the required storage space for these types of solutions with the help of motion detection algorithms.

As stated in [13], a Smart City is supposed to be a safe place, where video surveillance plays an important role. However, keeping operators in a control room 24/7 is not the best option. Instead of having operators watching the surveillance videos permanently, the system, which comprises innumerable cameras spread everywhere, should be able to detect and track suspicious objects autonomously and in real-time.

The studies presented above make a significant contribution to the field of smart video surveillance, as the concepts reported in these papers are very useful and can be used as a base reference. The video surveillance solution described in this paper is low-cost, scalable and customizable to the client’s needs.

3 System Architecture

The architecture of a surveillance system for smart cities should be modular, scalable, ubiquitous and, most importantly, low-cost. The system modularity is accomplished if the user has a panoply of different equipment at their disposal that can be chosen in order to implement a given solution. Thus, they can choose a device with different characteristics, according to their needs. For example, if the user only wants live streaming functionalities and does not care about image quality, they can choose the device with the lowest price that includes live streaming capabilities. On the other hand, if the user wants to have video analysis functionalities and good image quality, they should choose a good performance device in spite of the price.

Scalability is important to maintain the flexibility of the system and to keep costs to a minimum. To that end, all the smart camera devices should be wireless, i.e., the devices communicate with the internet only wirelessly. Thus, there is no need to add physical infrastructure (e.g., cables) when a new device is set up. Besides that, the user can always add more modules to their system according to their needs, without having to change the existing infrastructure.

Ubiquity is a very important characteristic of smart surveillance systems. To that end, the system can only be deployed where there are internet connections of some sort (e.g., wireless, mobile data or cable internet). Lastly, the overall system acquisition and maintenance costs should be kept low. This can be accomplished by a careful selection of common IoT hardware that meets the specifications and the price constraints.

The proposed system architecture, shown in Fig. 1, comprises several entities, namely, the server entity, the surveillance entity and the user’s entity. The server entity, represented in Fig. 1 by the Online Platform, is responsible for managing all the data received from the devices’ modules and for providing an online platform that allows the authenticated users to access the data anytime, anywhere.

Fig. 1. Architecture of the proposed solution.

The surveillance entity, represented by the “Smart Cities” and the “Client’s House” blocks in Fig. 1, includes the devices responsible for recording and detecting motion. This entity’s devices are connected to the server entity via the internet.

The Authenticated User block of Fig. 1 corresponds to the user’s entity. This entity represents all the authenticated users, who can access the devices in the surveillance entity and check live-stream feeds, take pictures, check motion logs and set alarms. If a person (or an organization) wants to become a user, they only need to access the online platform and create an account. After that, the new authenticated user can set up all their devices and use all the services provided by the integrated system.

The proposed architecture is based on the edge computing concept, since all the data are managed and processed on both edges of the system, balancing the computing power needed on the IoT devices used. Since the relevant data are produced on the devices’ modules, at the edge of the network, it is more efficient in terms of communication bandwidth to process the data at the end devices than in the cloud servers [14]. If the data processing were performed in the cloud servers, a larger bandwidth would be required, because the video would have to be sent to the cloud before it could be analyzed.
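To make the bandwidth argument concrete, the following back-of-the-envelope sketch compares the daily upload volume of a camera that sends all its video to the cloud for analysis with one that analyzes the video locally and uploads only the motion clips. The bitrate and the daily motion time are hypothetical values chosen for illustration; they are not measurements from the prototype.

```python
def daily_upload_megabytes(bitrate_mbps: float, seconds_uploaded: float) -> float:
    """Return the volume, in megabytes, of video uploaded at a given bitrate."""
    return bitrate_mbps * seconds_uploaded / 8  # 8 bits per byte


# Hypothetical figures, for illustration only.
BITRATE_MBPS = 2.0                 # assumed video bitrate
SECONDS_PER_DAY = 24 * 60 * 60
MOTION_SECONDS_PER_DAY = 30 * 60   # assumed 30 minutes of motion per day

# Cloud-side analysis: every second of video must be uploaded.
cloud_processing = daily_upload_megabytes(BITRATE_MBPS, SECONDS_PER_DAY)
# Edge-side analysis: only the clips that contain motion are uploaded.
edge_processing = daily_upload_megabytes(BITRATE_MBPS, MOTION_SECONDS_PER_DAY)

print(f"cloud-side analysis: {cloud_processing:.0f} MB/day")
print(f"edge-side analysis:  {edge_processing:.0f} MB/day")
print(f"reduction factor:    {cloud_processing / edge_processing:.0f}x")
```

Under these assumed values the reduction factor is simply the ratio between the total time and the time with motion, so the saving grows as the monitored scene becomes quieter.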

4 Implementation

This section presents and describes the development and implementation of a functional prototype in order to demonstrate the technical and economic feasibility of the proposed smart surveillance system. Figure 2 presents the diagram of the implemented system. As can be seen, the development encompassed three different camera modules (with different characteristics and performances) and the online platform.

Fig. 2. Diagram of the camera modules and online platform.

The Raspberry Pi 3 [15] module, which belongs to the surveillance entity referred to in Sect. 3, is the most advanced of the three developed modules. It can detect motion, record videos, take pictures, and it can also stream live videos. Among the three developed modules, this is the one that has the highest cost; on the other hand, it has more features and provides the best image quality.

The motion detection feature of this module requires the computing power of the Raspberry Pi to continuously process and analyze the video captured by the RaspiCam [16] attached to the Raspberry Pi 3. This module can be seen as a smart surveillance camera. The motion detection algorithm developed for this module uses the OpenCV library [17]. When movement is detected by the smart surveillance camera, it immediately starts recording and continues until the movement ceases. Then, the smart device uploads the recorded video to the online platform module. The live streaming and picture taking features are always available on the smart surveillance cameras. To activate them, the user only needs to make the request, after which the live stream or the picture becomes immediately available. All the captured data (videos and pictures) are sent to and stored in the online platform module.
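The record-while-moving behavior described above can be sketched as follows. The paper does not detail the detection algorithm, so this sketch assumes simple frame differencing (the image subtraction idea of [10]) and uses plain Python lists as grayscale frames instead of OpenCV images, so that it stays self-contained. The two thresholds and all function names are hypothetical.

```python
from typing import Iterable, List, Tuple

Frame = List[List[int]]  # grayscale image, pixel values 0..255

PIXEL_THRESHOLD = 25   # hypothetical: minimum intensity change for a pixel to count
AREA_THRESHOLD = 0.02  # hypothetical: fraction of changed pixels that counts as motion


def has_motion(previous: Frame, current: Frame) -> bool:
    """Return True when enough pixels changed between two consecutive frames."""
    changed = 0
    total = 0
    for prev_row, cur_row in zip(previous, current):
        for prev_px, cur_px in zip(prev_row, cur_row):
            total += 1
            if abs(cur_px - prev_px) > PIXEL_THRESHOLD:
                changed += 1
    return total > 0 and changed / total >= AREA_THRESHOLD


def motion_segments(frames: Iterable[Frame]) -> List[Tuple[int, int]]:
    """Return the (start, end) frame indices, inclusive, of each recorded clip.

    Recording starts at the first frame where motion is detected and stops
    when the motion ceases.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    previous = None
    last_index = -1
    for index, frame in enumerate(frames):
        last_index = index
        moving = previous is not None and has_motion(previous, frame)
        if moving and start is None:
            start = index                        # motion begins: start recording
        elif not moving and start is not None:
            segments.append((start, index - 1))  # motion ceased: close the clip
            start = None
        previous = frame
    if start is not None:
        segments.append((start, last_index))     # stream ended while recording
    return segments
```

Each returned segment corresponds to one clip that the device would then upload. With OpenCV, the per-pixel loop would normally be replaced by vectorized image operations, which is what makes continuous analysis feasible on the Raspberry Pi 3.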

The Raspberry Pi Zero [18] and ESP-32 [19] modules are also part of the surveillance entity of Sect. 3. These two modules are built from devices with less computing power than the one used in the module described above. Because of that, these modules do not have the motion detection feature via local video analysis, though they are cheaper alternatives for live-streaming video. The Raspberry Pi Zero module has more computing power than the ESP-32 module, so it provides higher image quality and also allows the user to take pictures from the online platform. These devices are important for the modularity and scalability of the system because they offer several options to the users, allowing them to choose the right device for their needs.
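The modularity argument amounts to picking the cheapest module that offers the features a user needs. The following is a minimal sketch of that choice. The feature sets follow the module descriptions in the text, while the relative cost ranking of the ESP-32 and Raspberry Pi Zero modules is an assumption (the text only states that the Raspberry Pi 3 module is the most expensive); the feature labels and the function name are hypothetical.

```python
from typing import AbstractSet, Optional

# Feature sets follow the module descriptions in the text.
# cost_rank is an assumption: only the Raspberry Pi 3 module is stated
# to be the most expensive; the order of the other two is hypothetical.
MODULES = [
    {"name": "ESP-32", "cost_rank": 1,
     "features": frozenset({"live_stream"})},
    {"name": "Raspberry Pi Zero", "cost_rank": 2,
     "features": frozenset({"live_stream", "picture"})},
    {"name": "Raspberry Pi 3", "cost_rank": 3,
     "features": frozenset({"live_stream", "picture",
                            "motion_detection", "video_recording"})},
]


def cheapest_module(required: AbstractSet[str]) -> Optional[str]:
    """Return the name of the cheapest module offering every required feature."""
    candidates = [m for m in MODULES if required <= m["features"]]
    if not candidates:
        return None  # no module in the catalogue satisfies the request
    return min(candidates, key=lambda m: m["cost_rank"])["name"]


print(cheapest_module({"live_stream"}))
print(cheapest_module({"live_stream", "motion_detection"}))
```

A user who only wants live streaming is pointed to the cheapest device, while any request that includes motion detection resolves to the Raspberry Pi 3 module.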

The online platform of Fig. 2 corresponds to the server entity described in Sect. 3. In the developed system, the online platform is responsible for centralizing all the data sent by the smart surveillance modules, allowing the user to check live video streams, take pictures, and check the movement log of all their devices anytime, anywhere. Through this module, it is also possible to set alarms on each device. When an alarm is triggered (by movement detection), the system alerts the user via email and sends the video in which the movement was detected. The communication between the smart surveillance modules and the online platform is accomplished through an application programming interface (API), which allows devices to be registered and information to be exchanged in real time.

The devices of the camera modules responsible for managing the smart surveillance features are the Raspberry Pi 3, Raspberry Pi Zero and ESP-32. In the Raspberry Pi 3 module, the OpenCV library was used to implement the motion detection feature, while Python 3.0 [20] was used to develop the scripts (using threads) responsible for recording, storing and sending the video to the cloud. The Flask framework [21] was also used to create a web server with endpoints that enable the video streaming and picture taking functionalities. In the Raspberry Pi Zero camera module, Python 3.0 and the Flask framework were used for the same functionalities, but video analysis is not available due to the lack of computing power (the Raspberry Pi 3 has a 1.2 GHz quad-core processor, whereas the Raspberry Pi Zero has a 1.0 GHz single-core processor) and memory (1 GB on the Raspberry Pi 3 vs. 512 MB on the Raspberry Pi Zero). The ESP-32 was programmed in C and, once again due to a lack of computing power, it was only possible to live stream video with this module.
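One common way to serve live video from a Flask endpoint is a multipart HTTP response in which each part is a JPEG-encoded frame (often called MJPEG). The paper does not state which streaming format the prototype uses, so the following is only a sketch under that assumption. It shows the framing generator alone, which is plain Python; the boundary name and the function name are hypothetical.

```python
from typing import Iterable, Iterator

BOUNDARY = b"frame"  # hypothetical boundary name


def mjpeg_parts(jpeg_frames: Iterable[bytes]) -> Iterator[bytes]:
    """Wrap each JPEG-encoded frame as one part of a multipart stream."""
    for jpeg in jpeg_frames:
        header = (
            b"--" + BOUNDARY + b"\r\n"
            + b"Content-Type: image/jpeg\r\n"
            + b"Content-Length: " + str(len(jpeg)).encode("ascii") + b"\r\n\r\n"
        )
        # One part per frame; the browser replaces the image as parts arrive.
        yield header + jpeg + b"\r\n"
```

In a Flask application, a generator of this kind would typically be returned from the streaming endpoint as `Response(mjpeg_parts(frames), mimetype="multipart/x-mixed-replace; boundary=frame")`, with `frames` produced by the camera capture thread. A picture-taking endpoint can reuse the same capture code and return a single JPEG frame.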

The online platform consists of a website, an API for the communications with the devices, a database and the ownCloud [22] service. The website and API were developed using the Laravel [23] framework (using PHP and JavaScript). The website comprises different areas, namely, the landing page, login, registration, alarms and device management. The API consists of routes that enable the devices to connect to the platform. To verify the user, the route /api/login receives the user data (email and password) and returns an authentication token (if the user is registered on the platform). This token is used to ensure the user’s authenticity. The route /api/device/add is used to register a device using the user token and to add the device information (e.g., MAC address, IP address, state, type of device) to the database. The route /api/motion/add, which uses the user token, adds a motion log and a uniform resource locator (URL) to the video to the database. The route /api/picture/add, which also uses the user token, adds the time of the picture and the ownCloud shared link to the database.
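From the device side, calls to these routes could be prepared as in the sketch below. Only the route paths come from the text. The host name, the JSON field names and the use of a bearer `Authorization` header to carry the token are assumptions made for illustration, since the paper does not specify the request format.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://platform.example"  # hypothetical host


def build_request(route: str, payload: dict,
                  token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one API route."""
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    if token is not None:
        # Assumption: the token travels as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + route,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def login_request(email: str, password: str) -> urllib.request.Request:
    """Request for /api/login; the response is expected to carry the token."""
    return build_request("/api/login", {"email": email, "password": password})


def add_device_request(token: str, mac: str, ip: str, state: str,
                       device_type: str) -> urllib.request.Request:
    """Request for /api/device/add (field names are hypothetical)."""
    payload = {"mac": mac, "ip": ip, "state": state, "type": device_type}
    return build_request("/api/device/add", payload, token)


def add_motion_request(token: str, video_url: str) -> urllib.request.Request:
    """Request for /api/motion/add (field name is hypothetical)."""
    return build_request("/api/motion/add", {"video_url": video_url}, token)
```

Each request would be sent with `urllib.request.urlopen(request)`. Building the request separately from sending it keeps the sketch testable without a running platform.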

The ownCloud service is responsible for managing the pictures and videos of each user’s cameras. When the user creates an account on the online platform, a user account is also automatically created on the ownCloud platform. When motion is detected, or a picture is taken, the device sends the resulting file to the ownCloud servers, as illustrated in Fig. 3 for the video case. After the file is stored, ownCloud returns a shared link to the device, which is published on the website using the route /api/motion/add for a video or /api/picture/add for a captured image. In addition, when motion is detected, an email is sent to the user to warn them.
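The order of operations on the device after a clip has been recorded can be summarized in a few lines. In this sketch the two helpers are hypothetical placeholders passed in as functions, so that no particular ownCloud or platform call is assumed. The email notification is left out because the text does not specify which component sends it.

```python
from typing import Callable


def handle_motion_clip(
    video_path: str,
    upload_and_share: Callable[[str], str],
    publish_motion: Callable[[str], None],
) -> str:
    """Upload a recorded clip, then publish its shared link; return the link.

    upload_and_share: hypothetical helper that stores the file on ownCloud
        and returns the shared link.
    publish_motion: hypothetical helper that calls /api/motion/add with the link.
    """
    shared_link = upload_and_share(video_path)  # step 1: store video, get link
    publish_motion(shared_link)                 # step 2: register the motion log
    return shared_link
```

The picture case follows the same two steps, with /api/picture/add as the publishing route.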

Fig. 3. Motion recording diagram.

5 Assessment Tests of the Solution

Several tests were conducted to assess the performance of all the electronic parts (hardware and software) and of the cloud platform. The methodology used for testing the electronic parts was the following. First, each surveillance camera was tested independently of all the others. The video stream from each one had to be accessible on the cloud platform, so that the quality of the video stream of each surveillance camera could be analyzed. After the video streams of all the surveillance cameras were validated, the cameras able to take pictures were tested. To do so, the platform requested each surveillance camera to capture a frame from the stream, and it was then checked whether the frame was stored in the cloud. After confirming the expected behavior of each surveillance camera, the communication with the cloud and the capture feature were both evaluated. To test the motion detection, the surveillance camera was left in a classroom pointed towards the door. Each time someone entered or left the classroom, the Raspberry Pi 3 module recorded the video and sent it to the cloud platform, and the respective warning email was sent to the user.

For testing the cloud platform (ownCloud and website), some simple load tests were carried out. With a script in Python, a scenario was simulated where several cameras detected motion at the same time interval and uploaded the respective video file to ownCloud, triggering the warning message in the website and sending an email. After confirming that every file was uploaded successfully, and all emails were sent without any significant delay, the load test was completed, and the platform fulfilled the initial specifications.
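A load test of this kind can be sketched with a thread pool in which each worker plays the role of one camera. The paper does not include its script, so the structure below is an assumption; `fake_upload` is a hypothetical stand-in that only imitates network latency, and in a real test it would be replaced by the actual upload to ownCloud.

```python
import concurrent.futures
import time
from typing import Callable, List


def simulate_cameras(camera_count: int,
                     upload: Callable[[int], bool]) -> List[bool]:
    """Call upload(camera_id) for all cameras concurrently.

    Returns one success flag per camera, in camera order.
    """
    workers = max(1, camera_count)  # the pool needs at least one worker
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upload, range(camera_count)))


def fake_upload(camera_id: int) -> bool:
    """Hypothetical stand-in for a real upload; sleeps to imitate latency."""
    time.sleep(0.01)
    return True


results = simulate_cameras(10, fake_upload)
print(f"{sum(results)}/{len(results)} uploads succeeded")
```

Collecting one flag per camera makes it straightforward to confirm that every file was uploaded, which is the pass criterion described above.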

6 Conclusion and Future Work

This paper introduces a low-cost smart surveillance system that integrates video analysis for real-time motion detection. The topology, development and assessment tests of the proposed system are also presented. The system architecture allows the users to interact with and control all their cameras through the online platform anytime, anywhere. To enhance the user’s perception of security, which is reflected in their well-being, they can configure the system to always inform them of the status of their cameras, i.e., whether the cameras are working properly. Besides that, when motion or an abnormal situation is detected by the autonomous surveillance system, it immediately sends an alarm message to the user with the relevant information about the occurrence.

The proposed system uses smart devices to capture, process and send the surveillance videos to the online platform. To improve system flexibility, three smart devices with different specifications were implemented and tested. These devices, which are also presented and described in this paper, were developed with available low-cost technology. To attain the desired performance, each smart device uses different surveillance cameras and hardware. The first smart device comprises an ArduCAM (OV2640) and an ESP32, the second a RaspiCam and a Raspberry Pi Zero W, and the third a RaspiCam and a Raspberry Pi 3 Model B. Although the smart devices have different cameras and hardware, their operation is similar. All the data collected by them are sent to the cloud servers. The performance assessment conducted on these smart devices demonstrates the feasibility of the solution.

Finally, it is important to mention that the video captured by the implemented smart devices has a relatively low frame rate, i.e., a low number of frames per second (FPS). Thus, in the near future, solutions different from the ones adopted in this paper will be studied in order to improve the frame rate of the video captured by the smart devices. One major feature that will be developed and implemented is facial recognition, so that the smart devices do not trigger movement warnings for authorized users (familiar faces).