
1 Introduction

Thanks to new technology, vehicles, like humans and animals, are able to communicate; we call this vehicular communication. Exchanging and sharing ideas can be regarded as data communication, and communication is clear and succinct when enough data is shared that the expected actions can be taken. In vehicular communication, a variety of data can be shared, such as sensor data (LIDAR, for instance), image data, and other signals such as voice. Communicating these data can serve various purposes, such as vehicle or user authentication, security, and safety. Speech signals can be verified and addressed by users directly. Other data, however, such as sensor readings and images (from front and rear cameras), must be analyzed so that important information can be shared within the network under study. Image data is more crucial and can provide more information than single-valued (mostly numeric) data. Let us recall the following statement:

“A picture is worth a thousand words.”

Agreeing that a picture can tell a complete story and is reusable, this paper presents an idea of how a set of images can be analyzed for a particular event/accident through image analysis and machine learning. The paper aims to provide a proof of concept in vehicular communication for purposes such as vehicle/user authentication, safety, and security.

1.1 Overview

How useful it would be if cars were manufactured with cameras. In the US, the Department of Transportation has shown interest in requiring both front and backup (rear) cameras [1]. Conventionally, cameras are used to avoid possible collisions. Assuming that cars have cameras, communication can involve substantial data exchange among the vehicles in a particular vehicular communication network. Beyond data exchange, data storage can make the communication more robust from an analysis/research point of view. As mentioned earlier, image data can give a user a good driving experience, since a vehicle can collect images along its route/path covering events (accidents) and other traffic issues. Note that traffic issues can be relayed through vehicular network(s) so that a detour can be planned for security and safety reasons. Importantly, vehicle/user authentication can be analyzed and proved with the help of the data collected by vehicles and shared through the network.

Assuming that Vehicular Ad hoc Networks (VANETs) are in place, this paper focuses on a proof-of-concept prototype, based on image analysis, that can serve several purposes, such as vehicle/user authentication, safety, and security, in addition to quality driving. However, researchers in the vehicular communication domain should not confuse this with security and safety as discussed in the literature, which is beyond the scope of the paper. Nor is the paper about data communication that mainly deals with cryptographic issues for authenticating vehicles/users. Low-level data communication (text-based messages, for instance) is outside the scope of the current work; this work is about high-level image data communication. In short, image data can provide richer and more complete information than text- and/or key-based data sharing within the network. In Fig. 1, let us assume that vehicles can communicate with any data they wish to share; for this paper, let us limit the sharing to a set of images. In a particular network, image data can be shared so that every vehicle has all the information needed for further analysis. Dealing with images (hundreds of images) is not trivial, since some images provide no information, i.e., a portion of them can be fraudulent, while others can be redundant. These issues raise the question of how hundreds of images can be analyzed so that the right information is shared. In this paper, a proof-of-concept framework is discussed that helps open a new trend in the domain.

Fig. 1.

An illustration of how vehicles communicate with each other. In this illustration, all vehicles are equipped with front-end and rear-end cameras. Assuming an accessible network for their communication, vehicles are expected to communicate when required: at an event/accident, for instance.

1.2 Organization of the Paper

The remainder of the paper is organized as follows. In Sect. 2, the use of image data is discussed, with a particular focus on the vehicular communication domain. Section 3 provides a high-level concept of image stitching. Section 4 focuses on the problem, i.e., how panoramic images can be constructed from a set of images shared by several different vehicles in a particular vehicular network. Section 5 concludes the paper.

Fig. 2.

An illustration of an event/accident, where pictures/images are taken by nearby vehicles. These images are shared through the (vehicular) network so that image analysis can be performed at the same time. See Fig. 1 for how the vehicles communicate.

Fig. 3.

A set of images collected from sharing vehicles. Note that the images are not ordered, since they come from different vehicles at different times.

Fig. 4.

Three pairs of images with their matching pairs (yellow), where red circles and green plus signs represent the locations where overlap occurs between them. (Color figure online)

2 A Picture is Worth a Thousand Words

“A picture is worth a thousand words” is an English-language idiom referring to the notion that a complex idea can be conveyed with a single picture, which conveys its meaning or essence more effectively than a description does. A 1913 newspaper advertisement for the Piqua Auto Supply House of Piqua, Ohio, used the phrase “One Look Is Worth A Thousand Words,” which helps one appreciate the importance of a picture (Piqua Leader-Dispatch, page 2, August 15, 1913). This holds in almost all applications, from healthcare (artificial intelligence and machine learning tools/techniques for medical imaging) to combating crime (biometrics), to name a few. With this background, in vehicular communication, it could be a better idea to take advantage of the images that can be shared in the network.

3 Image Stitching

In general, image stitching is the process of combining multiple images based on their common (overlapping) fields of view to produce a high-resolution image. It has an extremely rich state-of-the-art literature [3]. Image stitching has been widely used in several modern applications, such as image stabilization, medical imaging, image super-resolution, video stitching (frame-by-frame image stitching), and object formation/development via insertion. In this work, images taken by the vehicles (front- and rear-end cameras) are shared in a particular vehicular communication network to build a panoramic image, so that the complete scene can be read for further analysis/investigation (Fig. 2).

To develop a prototype, even though a variety of tools/techniques exists, the following process was followed.

(a) Get the stable Harris points from the images (potentially used for image stitching);

(b) Based on the similarity score, find point correspondences between a pair of images using the SIFT descriptors [5, 6] (the VLFeat SIFT toolbox was used, URL: http://www.vlfeat.org/overview/sift.html); and

(c) Compute the homography from the matched points using RANSAC (taken from “RANSAC algorithm with example of finding homography,” Edward Wiggin, MATLAB Central, 2011).

In this study, the most stable Harris points are estimated using the scale-space maximization of the Laplacian of Gaussian (LoG) formulation [4, 7, 8]. Moreover, the method is not limited to SIFT key points; SURF can be applied as reported in the original work [2].
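To make step (c) concrete, the RANSAC idea can be illustrated with a minimal, dependency-free sketch. The actual prototype uses the VLFeat SIFT toolbox and a MATLAB RANSAC routine; for simplicity, this sketch estimates a pure 2D translation rather than a full homography (which would need at least four correspondences and a least-squares solve). The core logic is the same: repeatedly sample a minimal set of matches, fit a model, and keep the model with the most inliers.

```python
import random

def ransac_translation(matches, n_iter=200, tol=2.0, seed=0):
    """Estimate a dominant 2D translation between matched keypoints.

    `matches` is a list of ((x1, y1), (x2, y2)) correspondences, some of
    which may be outliers (wrong matches). A translation model needs only
    one correspondence per sample, which keeps this sketch minimal; a
    homography would need four.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.choice(matches)   # minimal sample: 1 match
        dx, dy = x2 - x1, y2 - y1                  # candidate translation
        inliers = [m for m in matches
                   if abs((m[1][0] - m[0][0]) - dx) < tol
                   and abs((m[1][1] - m[0][1]) - dy) < tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (dx, dy), inliers
    return best_model, best_inliers
```

With, say, eight correct matches displaced by (10, 5) and two spurious ones, the spurious matches gather almost no inliers and are rejected, mirroring how RANSAC discards bad SIFT correspondences before the homography is computed.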

In Fig. 3, four different images that are considered for image stitching (based on the significant overlapping pixels between the pairs) are shown. In Fig. 4, three pairs are shown with their matching pairs, and the result is provided in Fig. 5. Detailed information about how this has been studied in the vehicular communication domain is discussed in Sect. 4.

Fig. 5.

Expected panoramic image from a set of images.

4 Panoramic Image: A Proof of Concept

To better understand the work, let us start with a scenario:

A vehicle captures an event/accident. It shares images with neighboring vehicles/users. The images can be multiple, not just limited to one. The same process holds true for the other vehicles that are at the event.

In a vehicular communication network, a roadside unit (RSU) is expected; it helps verify data, which requires the RSU's higher computational power and larger storage space. These units are connected (i.e., in a wired or wireless network), and any data can be shared/communicated in the network. In case there is no RSU, one of the vehicles/users can be used for processing the data.

With the use of image analysis techniques, based on similarity scores (through image matching), we are able to put the images in a sequence even though they are shared at different times and in a different order. For example, in Fig. 3, we have four different images that are shared by four different vehicles. It is important to note that, at the time the vehicles share the images, we cannot really order them by when they were shared. In Fig. 4, the image sequence can be recovered based on the similarity (matching) scores from all possible pairs of images. For n images (shared in that particular vehicular network), there are \(\frac{n \times (n-1)}{2}\) pairs of images for which to compute a similarity score. For Fig. 3, we need to compute similarity scores for the following pairs: \(\lbrace (1,2), (1,3), (1,4), (2,3), (2,4), (3,4) \rbrace \). It is possible to identify the common pixels that occur between 2D images taken of the same event. After image matching, the resulting similarity between the images helps us determine whether the 2D images were taken of exactly the same event. This means that the machine is able to learn which images have higher similarity scores; such images are accepted/used for building the panoramic image. In Fig. 4, three pairs with the image matching process are shown. Figure 4 shows the way we locate local features, compute the similarity between the possible pairs of images, and check whether each pair is useful for building the panoramic image. In this example, one can see the matching pairs that follow local key points (from each sample) to see how similar they are. Once the pairs are verified, i.e., confirmed to originate from exactly the same event, the panorama-building process starts by stitching them. The image similarity score (via matching, see Fig. 4) helps stitch them together into a panoramic image.
As mentioned before, panoramic images can be used for further analysis of the event (even in the future, since images can be stored and reused). To build a complete panoramic image, it is always good to have a sufficiently large set of images (shared in the network): the higher the number of images, the better the quality of the panoramic image. In this prototype, a set of three images was sufficient to build panoramic images while, at the same time, false or fraudulent images could be avoided.
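The pairwise-scoring and ordering step above can be sketched as follows. Here `similarity(a, b)` is a placeholder for the SIFT-based matching score of Sect. 3, and the greedy chaining rule (grow the sequence from the strongest pair by attaching the best remaining match to either end) is an assumption made for illustration, not the paper's exact procedure.

```python
from itertools import combinations

def order_by_similarity(image_ids, similarity):
    """Recover a plausible sequence of images from pairwise scores.

    All n*(n-1)/2 unordered pairs are scored once, then the chain is
    grown greedily from the strongest pair by attaching the remaining
    image that best matches either end of the chain.
    """
    scores = {frozenset(p): similarity(*p) for p in combinations(image_ids, 2)}
    chain = list(max(scores, key=scores.get))        # strongest pair first
    remaining = set(image_ids) - set(chain)
    while remaining:
        end, img = max(
            ((e, r) for e in (chain[0], chain[-1]) for r in remaining),
            key=lambda p: scores[frozenset(p)])      # best attachment
        if end == chain[0]:
            chain.insert(0, img)
        else:
            chain.append(img)
        remaining.remove(img)
    return chain
```

For the four images of Fig. 3, this scores exactly the six pairs listed above; images arriving in an arbitrary order, e.g. 3, 1, 4, 2, come out as the spatial sequence 1–2–3–4 (or its reverse) whenever adjacent views overlap more strongly than distant ones.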

Moreover, possible fraudulent images, potentially shared by malicious vehicles/users, should be eliminated from the sequence. In this paper, fraudulent images are also taken care of, based on the similarity score. Another way to avoid fraudulent images is to use the GPS locations of the vehicles/users: if the GPS location of a vehicle/user differs from that of the real event, its images can be discarded without further image analysis.
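The GPS pre-filter can be sketched with the standard haversine great-circle distance; the 0.5 km radius used here is an assumed tuning parameter, not a value from the paper.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))   # mean Earth radius ~6371 km

def plausible_witness(vehicle_fix, event_fix, radius_km=0.5):
    """Discard an image before any matching if its sender's GPS fix is
    too far from the reported event location to have seen the event."""
    return haversine_km(*vehicle_fix, *event_fix) <= radius_km
```

Because this check needs no image processing at all, it is a cheap first gate: only images that pass it go on to the similarity-based verification.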

As discussed above, an event-related message/image can be verified through the generated panoramic image. Once the image is verified as authentic, the vehicles on the road will use the verified information in making decisions about their driving. The goal of our scheme is to prevent attacks by malicious vehicles, such as message/image fabrication attacks, and to provide a safe and enjoyable experience to drivers by delivering only authenticated information.

It is important to note that the whole process

(a) not only describes an event (or authenticates the event) fully,

(b) but also authenticates the user.

From the set of rejected images, one can infer that they were shared by malicious users or are not part of the event. What if we still receive images from the event after the completion of the panoramic image? In that case, the machine will learn from those incoming images, which are communicated/shared for better development of the panoramic image. Once we have the panoramic image from a set of images communicated at time t, new images at time \(t + i\), where \(i=\{1, 2, \dots , N\},\) will be verified to see whether they can help enrich the quality of the panoramic image. Rejection is also required, and it is much easier at this stage, since the machine can take the complete panoramic image as the reference for comparison, based on the image matching criterion discussed above.
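This late-arrival test can be sketched with a toy feature model: each image is reduced to a set of hashable descriptors, and the fraction of the new image's features that register against the panorama stands in for the matching score of Sect. 3 (both the feature model and the 0.3 acceptance threshold are illustrative assumptions).

```python
def try_enrich(panorama, new_image, accept=0.3):
    """Verify a late-arriving image against the finished panorama.

    `panorama` and `new_image` are sets of feature descriptors. If too
    few of the new image's features register against the panorama, the
    image is rejected (likely fraudulent or unrelated); otherwise its
    features are merged to enrich the panorama.
    """
    overlap = len(panorama & new_image) / len(new_image)
    if overlap < accept:
        return panorama, False   # reject: no reliable registration
    return panorama | new_image, True
```

Rejection here is a single comparison against the panorama rather than against all previously shared images, which is what makes it cheaper than the original pairwise matching stage.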

5 Conclusion

In this paper, the use of a set of images, shared by vehicles in a particular vehicular network, for building a panoramic image has been reported. The primary idea is to understand how one can read the complete story of what has happened at an event/accident. The focus of the study is not just to check whether images can help vehicular communication with authentication, safety, and security, but also to improve the quality of the driving experience. Not to be confused, the work is not about how images are shared or how they improve the VANET. In a word, the paper is a proof-of-concept work, where image analysis and machine learning can help build a panoramic image in vehicular communication for several different purposes.