1 Introduction

The user base of social networks is growing wider and more active in producing content about real-world events, almost in real time [1]. Social sensors, i.e., users contributing their individual ‘data’ [3], publish large volumes of data streams (images, videos and texts) over social networks (also called social clouds [2]). Social-sensor data streams related to public events, especially multimedia content, may contain critical information that describes a situation from various aspects, e.g., what is happening, where it is happening, who is involved and what the effects on the surroundings are. Monitoring events or scenarios over social-sensor data streams helps concerned officials analyse an unfolding situation, for example in crisis management, urban management and scene analysis. Hence, utilising these social-sensor data streams can significantly facilitate the task of scene reconstruction and aid in comprehending evolving situations [3].

Scene reconstruction is the process of generating a 3D model of a scene from multiple 2D photographs of the scene [19]. The extensive availability of social sensors (e.g., Twitter feeds) helps in gathering an indirect pictorial view of an event. Various studies focus on visual and spatio-temporal scene reconstruction in social media [3, 5]. One of the major challenges in the current scene reconstruction process is the efficient and real-time delivery of sensors’ (e.g., CCTV, accelerometer) data to end users, e.g., urban management, in a way that meets their requirements (time, location, content relatedness, quality, price, coverage, etc.) [5]. Most current work relies on image processing. However, the traditional image processing approach depends on the performance of hardware and software, which is both costly and time consuming [20]. To overcome this challenge, this research employs the theory of Service Oriented Architecture (SOA) instead of image processing, defining a social-sensor cloud service model based on the metadata of social media images and related posted information.

Social-sensor data streams are generated from multiple sources and in multiple formats. SOA abstracts social-sensor data streams into small independent functions, namely services. This results in uniform and ubiquitous delivery of social-sensor data as a service, making it easy to access and reuse in multiple applications over different platforms, and reduces the complexity of social-sensor data collection. The functionality of social-sensor data (e.g., spatio-temporal, textual and context information of an image) is abstracted as a service, and the qualitative features (e.g., price, coverage) are abstracted as non-functional properties of the service. Access to social-sensor data streams and their use in scene reconstruction are simplified by the service model. Other benefits include higher availability, better scalability, dynamic deployment and greater testability.

Usually a single service cannot satisfy a user’s requirements. The challenge is to design an efficient method for selecting social-sensor cloud services that are in the same information context, i.e., cover the same event or segment of an area at the time required by the user, and also meet the user’s quality demands. Most of the existing techniques developed for standard Web service discovery are not directly applicable to sensor services [4]. Due to the large number of images on the Web and their time-location dependency, sensor services need to be organised in a way that allows efficient search based on their spatio-temporal properties, e.g., time or location.

This paper proposes a novel social-sensor cloud service model and a social-sensor service selection algorithm for collecting images for scene reconstruction based on the spatio-temporal, textual and QoS parameters of a service. To the best of our knowledge, existing approaches to using social media data are mainly data centric. Current approaches are built upon data mining and information retrieval techniques without considering the qualitative aspects of images. The proposed approach conceptualizes the spatio-temporal and textual aspects of social-sensor data streams as the functional attributes of social-sensor cloud services, and the qualitative aspects as their non-functional attributes. The proposed framework is a four-stage algorithm capable of context-aware selection of social-sensor cloud services using their functional and non-functional properties. Functional properties include spatio-temporal parameters, spatio-textual context, etc., and non-functional requirements include image quality, price, resolution, etc. The four stages of the algorithm are (1) service indexing, (2) selection w.r.t. spatio-temporal features, (3) filtering w.r.t. textual correlation and (4) coverage assessment and QoS-aware selection.

The novelty of this research lies in (1) abstracting social media image metadata and related posted data, i.e., social-sensor data streams, as social-sensor cloud services, and (2) supporting efficient and real-time access to high-quality, related images for scene reconstruction without image processing. The rest of the paper is structured as follows: Sect. 2 reviews the related background work. Section 3 describes the motivating scenario. Section 4 formally defines the model for a social-sensor cloud service along with its functional and quality attributes. Section 5 details the proposed selection approach. Section 6 describes the experiments and evaluation of the approach. Section 7 concludes the work.

2 Related Work

Our social-sensor cloud service selection approach draws background work from two main areas: sensing-as-a-service and service selection [1,2,3,4,5,6,7,8,9].

Social Sensing and Sensing-as-a-Service. This is a large-scale sensing paradigm based on the power of IoT devices, including smart phones, smart vehicles and wearable devices [3, 5]. It allows the increasing number of mobile phone users to share local knowledge (e.g., local information, event coverage and traffic conditions) acquired by their sensor-enhanced devices, and this information can be further aggregated in the cloud for large-scale sensing [10]. A broad range of applications is thus enabled, including traffic planning [3], environment monitoring [13], mobile social recommendation [17], public safety [18], and so on. Spatio-temporal social media analysis for abnormal event detection is discussed in [6]. Another study proposes an approach to multi-scale event detection using social media data, which takes into account the different temporal and spatial scales of events in the data [7]. However, most of these approaches are data centric, built upon data mining and analysis techniques, which requires a considerable amount of expertise and time. Moreover, the transition from traditional cloud systems to an SOA-based sensor-cloud raises the need to consider the spatio-temporal aspects of sensor data with better performance and faster access to new services. Thus, using SOA and social sensors for scene analysis is preferable to applying image processing over batches of images or traditional cloud computing to build the scene.

Service Selection. Service selection is one of the major research problems in service-oriented computing [4, 9, 11, 12]. Service selection and composition have been applied in a number of domains, including scene analysis and visual surveillance [12]. The service composition problem can be categorized into two areas. The first focuses on functional composability among component services. The second aims at optimal service composition based on non-functional (QoS) properties. In [11], service composition is discussed from a media service perspective. [4] and [9] propose composition approaches for Sensor-Cloud and crowd-sourced services based on dynamic features such as spatio-temporal aspects. Both papers present algorithms to support the proposed approaches, along with analytical and simulation results validating their feasibility. However, social-sensor cloud service selection using functional and non-functional attributes is yet to be explored.

3 Motivating Scenario

A typical scenario of scene reconstruction for car accidents is used to illustrate the challenges in scene analysis. Given a segment A of a road, suppose an accident happens at time \(t_{0}\), as shown in Fig. 1b and c; Fig. 1a depicts the scenario before the accident happens. The fan shapes are 2D representations of the social-sensor cloud services’ coverage.

Fig. 1.
figure 1

Accident timeline

It is assumed that scene reconstruction of the accident is required by urban management to determine the cause(s) or aftermath of the accident and prevent further incidents of a similar kind. The wide deployment and availability of smartphone users and their connectivity with social networks and services means that commuters using social media might provide extra visual coverage by sharing images or posts. For instance, in the South Melbourne Bus AccidentFootnote 1, multiple posts with hundreds of images of the event were reported on various social networks. In such cases, the commuters can be regarded as social sensors sharing their image data over social-sensor clouds, i.e., social networks. Using social media images’ metadata and related posted data as services, i.e., social-sensor cloud services, can help fulfil the user’s need for maximum coverage. The idea is to leverage freely available information over social network clouds to help investigators analyse the accident scene.

The aim is to develop a new framework for social-sensor cloud service selection. The algorithm is based on spatio-temporal information, textual features, spatio-textual correlation and quality-of-service parameters. As shown in Fig. 4, the proposed solution is a multi-stage selection algorithm that selects social-sensor cloud services based on a user’s query. Let us assume that the user’s query q is defined as \((R,d,t_{s},t_{e},Q_{U})\). R is represented as a tuple \((P<x,y>,l,w)\) that indicates the region of interest, where \(P<x,y>\) is a geospatial co-ordinate, i.e., a decimal longitude-latitude position (e.g., -37.8089435, 144.9651172), and l (e.g., 5 m) and w (e.g., 2 m) are the length and width distances from P to the edge of the region of interest. \(\mathrm{t}_\mathrm{s}\) (e.g., 2:29:23 pm AEST, Wednesday, 14 June 2017) and \(\mathrm{t}_\mathrm{e}\) (e.g., 2:59:23 pm AEST, Wednesday, 14 June 2017) give the start and end times of the scene. d is a phrase describing the event (e.g., ‘Melbourne Central, Accident’). \(Q_{U}\) is a set of non-functional attributes (e.g., P, i.e., the price of the service, is not more than $0.5). Therefore, given the available services, the proposed framework selects those that, in the given time frame, are spatially located in the user-defined region, are textually related to the user’s description, and meet the user’s QoS requirements.
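For illustration only, the query tuple above could be represented as a simple structure; the field names and ISO time format below are our assumptions, not part of the paper’s formal model:

```python
from dataclasses import dataclass, field

@dataclass
class Region:
    # R = (P<x,y>, l, w): geospatial anchor P plus length/width extents
    p: tuple   # decimal coordinate pair, e.g. (-37.8089435, 144.9651172)
    l: float   # length distance from P to the region edge (metres)
    w: float   # width distance from P to the region edge (metres)

@dataclass
class Query:
    r: Region  # region of interest
    d: str     # phrase describing the event
    t_s: str   # start time of the scene
    t_e: str   # end time of the scene
    q_u: dict = field(default_factory=dict)  # Q_U: non-functional constraints

q = Query(Region((-37.8089435, 144.9651172), 5.0, 2.0),
          "Melbourne Central, Accident",
          "2017-06-14T14:29:23+10:00", "2017-06-14T14:59:23+10:00",
          {"max_price": 0.5})
```

The framework then evaluates every candidate service against this single query object.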

The functional attributes of a social-sensor cloud service Serv include:

  • Time T of the service at which the image is taken

  • Set of special mentions and keywords M providing additional information regarding an image or a service

  • Service location L\(\langle \)x,y\(\rangle \), i.e., the longitude and latitude position of the service

The non-functional attributes of a social-sensor cloud service Serv include:

  • Textual correlation TxtCo, the textual similarity between the tags/keywords, i.e., Serv.M, of an atomic service Serv and the query q’s description q.d.

  • Coverage Cov of the total area covered in the user required region R

First, for the location of the query q, all indexed services available in the area of interest defined by the region R, across time \(\mathrm{t}_\mathrm{s}\) to \(\mathrm{t}_\mathrm{e}\), are selected. The region R is expanded if the selection does not meet the query demands. It is assumed that R encloses S, a set of services relevant to query q. Textual correlation, i.e., the similarity between d and M, is considered next between the query and the services in the region R. For example, for every special mention M (e.g., Melbourne Central station) of a service Serv and the description d (e.g., Melbourne center) of the query q, their textual correlation \(relation_{t}\) is calculated as the similarity ratio between the service \(Serv_{i}\) and the query q. The similarity is measured between 0.0 (the lowest) and 1.0 (the highest) and denoted as \(\theta \). This gives a subset of services that are spatio-temporally and textually correlated with the query. Next, the coverage of all selected services is assessed. The best available services are selected that are both spatially located in the user-defined region and textually related to the user’s description. The selection is finalized once the selected services achieve the maximum coverage. The selected services can then assist in reconstructing the required scene.

4 Model for Social Sensor Cloud Service

In this section, we define several concepts to model a social-sensor cloud service. The aim is to locate and select the social-sensor cloud services that are in the same spatial and visual context based on the functional properties of the service. The selected services can assist in building a visual summary of a required scene in a given space and time.

4.1 Model for an Atomic Social Sensor Cloud Service

Here we discuss the key concepts used to model an atomic social-sensor cloud service. We define the model of a crowd-sourced social-sensor cloud service in terms of the spatio-temporal features of a crowd-sourced social sensor.

Definition 1:

A scene S is defined as an observation of a real-world happening. This observation is a collection of connected images in the same spatial and temporal dimensions.

Definition 2:

A visual summary VisSum is defined by a set of 2D images that are highly relevant to the scene S. VisSum gives the viewer an accurate impression of what a particular scene S looks like. Any two images are considered highly relevant if they share at least one common feature.

Next, we define the model of a social-sensor cloud service in terms of the spatio-temporal features of a social sensor.

Definition 3:

A crowd-sourced social sensor SocSen is a user of a social media platform. A sensor posts content on social media, i.e., a social-sensor cloud. It is assumed that the data shared by a social sensor contains visual information, a textual reference, time and location.

Definition 4:

A social-sensor cloud SocSenCl is a social media platform hosting data from social sensors. It is defined by:

  • Social Sensor Cloud ID SocSenCl_id, i.e., a unique cloud id

  • Sensor Set SenSet = {SocSen_id\(_\mathrm{i}\), 1 \(\le \) i \(\le \) m} represents a finite set of sensors SocSen that collect and host sensor data in the respective cloud. It is assumed that each cloud hosts data from at least one sensor.

Definition 5:

Atomic Social Sensor-Cloud service Serv is defined by

  • Serv_id is a unique service id of the service provider SocSen.

  • SocSenCl_id is an ID of the cloud where the service is available.

  • F is a set of functional properties of the service Serv. For each Serv, F = {T, M, L, dir,VisD, \(\alpha \)}.

  • nF is a set of non-functional properties of the service Serv. For each Serv, nF = {TxtCo}.

4.2 Functional Model of an Atomic Social Sensor Cloud Service

Functional requirements capture the intended behaviour of the service and form the baseline functionality required of an atomic service. The following are the minimal functional attributes associated with an atomic service:

  • T is the time at which the image is taken.

  • M is the set of special mentions and keywords providing additional information about the image.

  • L \(\langle \)x,y\(\rangle \) is the service location, where \(\langle \)x,y\(\rangle \) is the longitude-latitude position of the service.

  • VisD is the visible distance, i.e., the maximum distance covered by the service.

  • dir is the orientation angle of the service.

  • \(\alpha \) is the angular extent of the scene covered by the service.

Thus, the functional model of each service is represented by the service coverage model \(\mathrm{Serv}_\mathrm{c}\), as shown in Fig. 2.
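As a sketch, the atomic service of Definition 5 with the functional set F = {T, M, L, dir, VisD, \(\alpha \)} might be encoded as follows; the field names and the example values are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Serv:
    serv_id: str              # unique service id of the provider SocSen
    soc_sen_cl_id: str        # SocSenCl_id: hosting social-sensor cloud
    # functional properties F = {T, M, L, dir, VisD, alpha}
    t: str                    # T: time the image was taken
    m: List[str]              # M: special mentions and keywords
    loc: Tuple[float, float]  # L<x,y>: longitude-latitude position
    direction: float          # dir: orientation angle (degrees)
    vis_d: float              # VisD: maximum visible distance (metres)
    alpha: float              # angular extent of the covered scene (degrees)
    # non-functional properties nF = {TxtCo}, computed per query
    txt_co: float = 0.0

serv = Serv("s42", "flickr", "2017-06-14T14:35:00+10:00",
            ["Melbourne Central", "accident"], (144.9651, -37.8089),
            90.0, 30.0, 60.0)
```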

Fig. 2.
figure 2

\(\mathrm{Serv}_\mathrm{c}\) model

Fig. 3.
figure 3

Query region and coverage model q.R

4.3 Quality Model of an Atomic Social Sensor Cloud Service

Discovering and selecting the best available services satisfying the user’s requirements is an important challenge. The first step is to define a QoS model, i.e., a set of QoS aggregation rules. However, the user’s QoS demands can differ from the system’s QoS criteria for optimal and effective selection. For this purpose, QoS models for both the user and social-sensor cloud services are introduced. The proposed system-defined QoS criteria of an atomic service include:

  • \(Q_{serv}\) is a tuple \(\langle \) \(\mathrm{Q}_{1}\), \(\mathrm{Q}_{2}\)... \(\mathrm{Q}_\mathrm{n}\) \(\rangle \), where each \(\mathrm{Q}_\mathrm{i}\) denotes a Quality of service (QoS) of Serv. The QoS criteria include:

    • TxtCo is the textual similarity between the tags/keywords of an atomic service Serv, i.e., Serv.M, and the query q’s description q.d. The WordNet-based measure LIN [15] is used to calculate this similarity. It measures the semantic relatedness of concepts based on the ratio of the amount of information common to q.d and Serv.M to the amount of information needed to describe them individually, i.e., the information contents IC(q.d) and IC(Serv.M). The measure is determined by [15]:

      $$\begin{aligned} related_{LIN}(q.d, Serv.M) = \frac{2IC(lcs(q.d, Serv.M))}{IC(q.d) + IC(Serv.M)} \end{aligned}$$
      (1)

      where \(IC(description) = -log(Probability(description))\), and

      \(lcs(q.d, Serv.M)\), i.e., the least common subsumer, is the quantity of information common to the two descriptions. It is determined by the information content of the lowest concept in the hierarchy that subsumes both q.d and Serv.M [15].

    • Cov is the total area of the patches covered in the user-required region R (Fig. 3). Coverage can be illustrated by:

      $$\begin{aligned} Cov \longleftarrow \{\sum _{i=1}^{n} Serv_{i} \in S' \mid \sum _{i=1}^{n} Serv_{i} <\cdot R, t_{0} \le t \le t_{1} \} \end{aligned}$$
      (2)

      where \(Serv_{i} <\cdot \) R means \(Serv_{i}\) covers some part of the region R, and \(S'\) is the set of services spatio-temporally and textually related to the query. Since it is uncertain that the user-desired time gives the best available results, we limit the temporal range between \(\mathrm{t}_{0}\) and \(\mathrm{t}_{1}\).

Moreover, for effective and efficient selection as per user demands, user-defined QoS parameters are also required. For this purpose, some baseline QoS attributes for social-sensor cloud services are introduced:

  • \(Q_{U}\) is a tuple \(\langle \) \(\mathrm{Q}_\mathrm{U1}\), \(\mathrm{Q}_\mathrm{U2}\)... \(\mathrm{Q}_\mathrm{Un}\) \(\rangle \), where each \(\mathrm{Q}_\mathrm{Ui}\) denotes a Quality of service (QoS) requirement of user. The QoS criteria include:

    • P is the price of the service, i.e., whether the service requires any sort of financial incentive for the service provider.

    • Res is the minimum image resolution required of the services.

    • ColQ is the images’ colour quality, i.e., greyscale or colour.

5 QoS-Aware Social Sensor Cloud Service Indexing and Selecting Approach

We propose a framework to index, filter and select the best available social-sensor cloud services according to a user’s query. The query q is defined as \(q = (R,d,t_{s},t_{e},Q_{U})\), giving the region of interest, description, time window and quality parameters of the required service(s). The entries are:

  • \(R = \{P<x,y>,l,w\}\) (Fig. 3), where P is a geospatial co-ordinate, i.e., a decimal longitude-latitude position, and l and w are the length and width distances from P to the edge of the region of interest.

  • \(\mathrm{t}_\mathrm{s}\) is the start time of the query.

  • \(\mathrm{t}_\mathrm{e}\) is the end time of the query.

  • d is a phrase describing the query, e.g., Melbourne Central.

  • \(\mathrm{Q}_\mathrm{U}\) is a set of non-functional attributes, e.g., coverage, resolution, pricing.

Figure 4 shows the proposed selection framework for social-sensor cloud services. The aim of our approach is to efficiently locate the available services that match the user’s requirements by constructing and indexing the information and location context of each service together with its functional and non-functional properties. To manage and enable fast discovery of the social-sensor cloud services:

  • First we index all the available services. Considering the spatio-temporal nature of the services, we index both location and time of service using R-tree.

  • Then the search space is reduced by selecting S, the set of all spatio-temporally close services within BR, the user-defined region of interest defined as a spatio-temporal cube.

  • Further, we calculate the TxtCo of each service in S with q.d. Based on TxtCo, we select the set \(S'\) of services textually related to the query.

  • Next, we assess the coverage, i.e., \(\mathrm{Serv}_\mathrm{C}\) of all the services in \(S'\) and compute the spatial coverage of region.

  • Finally, \(\mathrm{Q}_\mathrm{U}\) is used to select the best available service(s).

If the desired coverage is not achieved, the search space BR is increased dynamically until the maximum coverage and the QoS parameters are achieved. The system-defined QoS attributes are determined in two ways:

  • Before selection, the values are given based on previous executions of services or user’s feedback.

  • During selection, the values are given by monitoring services and query QoS attributes and dynamically evaluating the attributes.

Fig. 4.
figure 4

Social-sensor cloud service selection framework

The selection approach is implemented in the following stages:

5.1 Service Indexing and Spatio-Temporal Filtering

Indexing and spatio-temporal filtering enable the fast discovery of services (Algorithm 1). We index services by their spatio-temporal features using a 3D R-tree [4]. A 3D R-tree [21] is a tree data structure used as a spatio-temporal index to handle time- and location-based queries; time is treated as the third dimension. The leaf nodes of the 3D R-tree represent services, organized using minimum bounding regions (MBRs) [21] that enclose each service’s spatio-temporal region. It is assumed that every available service is associated with a two-dimensional geo-tagged location and a time. For the effective area of query q, we define a cuboid region BR using the user-defined rectangular event area R and the start and end times of the query, i.e., \(t_{s}\) and \(t_{e}\). The region BR encloses a set of services relevant to q; services outside this region are assumed to have little probability of being relevant to the query. Figure 5 illustrates the query region R and the bounded region BR across time \(\mathrm{t}_\mathrm{s}\) to \(\mathrm{t}_\mathrm{e}\).

Fig. 5.
figure 5

Illustration of q.R and BR

Fig. 6.
figure 6

Illustration of coverage

The 3D R-tree efficiently answers typical range queries, e.g., “select all services bounded by the rectangle R between times \(t_{s}\) and \(t_{e}\)”. This filters out all services outside the bounded region of interest BR.
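A production implementation would rely on an actual 3D R-tree index; the range query it answers can be mimicked with a brute-force MBR overlap test. The following is a sketch for illustration only, not the indexed data structure itself:

```python
def overlaps(mbr, br):
    # Both arguments are ((xmin, ymin, tmin), (xmax, ymax, tmax)).
    # Overlap in all three dimensions (longitude, latitude, time)
    # means the service may be relevant to the query.
    (alo, ahi), (blo, bhi) = mbr, br
    return all(alo[i] <= bhi[i] and blo[i] <= ahi[i] for i in range(3))

def range_query(services, br):
    # services: {serv_id: mbr}; returns ids falling inside the
    # bounded region BR, i.e. the candidate set S.
    return [sid for sid, mbr in services.items() if overlaps(mbr, br)]

services = {
    "s1": ((144.964, -37.810, 100.0), (144.966, -37.808, 120.0)),
    "s2": ((144.980, -37.800, 100.0), (144.982, -37.798, 120.0)),
}
br = ((144.963, -37.811, 90.0), (144.967, -37.807, 130.0))
print(range_query(services, br))  # s2 falls outside the spatial extent
```

An R-tree answers the same query in logarithmic rather than linear time by pruning whole subtrees whose MBRs do not intersect BR.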

5.2 Textual Co-relation Between Service and Query

To improve the efficiency of the proposed approach, textual correlation is considered next. A service may lie spatially within the query region yet have no textual relation to the query q. In such cases, textual correlation in terms of spatio-textual similarity is used for service filtering.

Equation (1) measures the relatedness of two descriptions. The relatedness score ranges between 0.0 (the lowest) and 1.0 (the highest). For the implementation, a Java-based library, WS4J (WordNet Similarity for Java), is used; its use is described in the on-line documentationFootnote 2. We use \( \theta ' \) to denote \(related_{LIN}(q.d, Serv.M)\); a higher value of \(\theta '\) indicates higher textual correlation. On the basis of TxtCo, the set of services \({S'}\) is selected. Algorithm 2 shows the textual correlation filtering.
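To make Eq. (1) concrete, here is a minimal pure-Python sketch of the LIN measure over a toy concept hierarchy with made-up corpus probabilities; in the paper, WordNet and WS4J supply the real hierarchy and information content:

```python
import math

# Toy concept hierarchy and probabilities (illustrative values only).
# IC(c) = -log(Probability(c)), as in Eq. (1).
parent = {"crash": "accident", "collision": "accident",
          "accident": "event", "concert": "event", "event": None}
prob = {"event": 1.0, "accident": 0.1, "crash": 0.02,
        "collision": 0.03, "concert": 0.2}

def ic(c):
    return -math.log(prob[c])

def ancestors(c):
    chain = []
    while c is not None:
        chain.append(c)
        c = parent[c]
    return chain

def lcs(a, b):
    # lowest concept in the hierarchy that subsumes both a and b
    anc_a = ancestors(a)
    for c in ancestors(b):
        if c in anc_a:
            return c
    return None

def related_lin(a, b):
    common = lcs(a, b)
    if common is None or ic(a) + ic(b) == 0:
        return 0.0
    return 2 * ic(common) / (ic(a) + ic(b))

print(round(related_lin("crash", "collision"), 3))  # prints 0.621
```

Note that two concepts whose only shared subsumer is the root (IC = 0), such as “crash” and “concert” here, score 0.0, which matches the intuition that they share no information content.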

figure a
figure b

5.3 Coverage Assessment Using Serv\(_\mathrm{C}\)

\(\mathrm{Serv}_\mathrm{C}\) is a 2D representation of the service coverage, illustrated as the grey region in Fig. 3. If the user requires the coverage of R between times \(t_{s}\) and \(t_{e}\), all the services Serv in \({S'}\) overlapping the bounded region BR are selected. The relationship between \(\mathrm{Serv}_\mathrm{C}\) and R can be illustrated as:

$$\begin{aligned} Coverage \longleftarrow \{ Serv_\mathrm{i} \in S' \mid Serv_\mathrm{i} \cap BR, t_{0} \le t \le t_{1}\} \end{aligned}$$

Thus, all services whose \(\mathrm{Serv}_\mathrm{C}\) overlaps the region BR are selected (Algorithm 3). Since it is uncertain that the user-desired time gives the best available results, we limit the temporal range between \(\mathrm{t}_{0}\) and \(\mathrm{t}_{1}\). A pictorial illustration of coverage is shown in Fig. 6.

figure c

Coverage can be calculated by:

$$\begin{aligned} Cov = \int _{t=s}^{e}AREA(R)\,dt - \left( \int _{t=s}^{e}Area(\cup _{i=1}^{n} Serv_{i})\, dt - \int _{t=s}^{e}Area(\cap _{i=1}^{n} Serv_{i})\,dt\right) \end{aligned}$$
(3)

where:

$$\begin{aligned} AREA(R) = l * w \end{aligned}$$
(4)
$$\begin{aligned} Area(\cup _{i=1}^{n} Serv_{i}) = \sum _{i=1}^{n} (\cup _{i=1}^{n} (0.5*Serv_{i}.VisD*Serv_{i}.\alpha )) \end{aligned}$$
(5)
$$\begin{aligned} Area(\cap _{i=1}^{n} Serv_{i}) = \sum _{i=1}^{n} (\cap _{i=1}^{n} (0.5*Serv_{i}.VisD*Serv_{i}.\alpha )) \end{aligned}$$
(6)

If the selection does not reach the maximum achievable coverage, i.e., Cov is not met, the region R is adjusted and increased to \({R'}\). Spatio-temporal selection and filtering w.r.t. textual correlation are then repeated until the maximum coverage is achieved. \({R'}\) is obtained by increasing the length l and width w of the region R. The minimum unit of increase is 7 m, i.e., the average least measurable increment in decimal latitude and longitude values. If required, l and w are further incremented in multiples of 7 m.
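One simple way to approximate the Cov check of Eq. (3) is to sample the region on a grid. In the sketch below, each service footprint is simplified to a disc of radius VisD rather than the sector model \(\mathrm{Serv}_\mathrm{C}\), and the step size is an arbitrary assumption:

```python
import math

def coverage_ratio(region, services, step=1.0):
    # region: (x0, y0, l, w) in metres; services: [(x, y, vis_d), ...].
    # Returns the fraction of grid sample points covered by at least
    # one service footprint (disc approximation of Serv_C).
    x0, y0, l, w = region
    covered = total = 0
    for j in range(int(w / step)):
        for i in range(int(l / step)):
            x, y = x0 + (i + 0.5) * step, y0 + (j + 0.5) * step
            total += 1
            if any(math.hypot(x - sx, y - sy) <= r for sx, sy, r in services):
                covered += 1
    return covered / total if total else 0.0
```

If the ratio falls below the target (above 80% in the experiments of Sect. 6), R would be enlarged in 7 m increments and the spatio-temporal and textual selection repeated over the larger region.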

5.4 QoS-Aware Service Selection

In the final stage of service selection (Algorithm 4), the user-defined quality parameters are considered to select the best-suited services. The threshold values of these parameters are set by the user at the time of query generation.
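A minimal sketch of this final QoS filter follows; the attribute names and the threshold semantics (price as an upper bound, resolution as a lower bound, colour quality as an exact or “any” match) are our assumptions:

```python
def qos_select(services, qos):
    # services: dicts with 'price' ($), 'resolution' (pixels), 'colour';
    # qos: user thresholds Q_U from the query, cf. the example in Sect. 3.
    def ok(s):
        return (s["price"] <= qos.get("max_price", float("inf"))
                and s["resolution"] >= qos.get("min_resolution", 0)
                and qos.get("colour", "any") in ("any", s["colour"]))
    return [s for s in services if ok(s)]

candidates = [
    {"id": "s1", "price": 0.0, "resolution": 1920 * 1080, "colour": "colour"},
    {"id": "s2", "price": 1.0, "resolution": 640 * 480, "colour": "grey"},
]
best = qos_select(candidates, {"max_price": 0.5,
                               "min_resolution": 1600 * 1200})
```

Here only s1 survives: s2 exceeds the price bound and falls short of the resolution bound.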

6 Experiments and Results

A set of experiments is conducted to evaluate, analyse and investigate the contribution of our proposed framework in comparison to image processing.

figure d

6.1 Experimental Setup

To the best of our knowledge, there is no real spatio-temporal service test case available to evaluate our approach. Therefore, we evaluate the proposed approach using a real dataset: a collection of 10000 user-uploaded images downloaded from social networks (Flickr, Twitter, Google Images, etc.). To create services from the images, we extracted each image’s geo-tagged location, special mentions and tags as its textual description. The time at which an image was captured, the camera direction, the maximum visible distance of the camera and the camera’s viewable angle are abstracted as the functional property values T, dir, VisD and \(\alpha \), respectively. Quality features, i.e., colour quality and resolution, are abstracted as QoS property values. In addition, the QoS parameter price is manually assigned to all services. The threshold values of these parameters are adjusted by the user of the service. For textual correlation, using previous research as a reference, we set the threshold \(\theta \) = 0.5 [16]. For coverage, we arbitrarily require above 80% for experimental purposes.

We generated 10 different queries based on the locations in our dataset. In these experiments we evaluate service selection based on spatio-textual features. For our proposed approach, the experiments are conducted with 10 different queries, e.g.,

\(q<R,d,t_{s},t_{e},Q_{U}>\) - where

  • \(R(<x,y>,l,w)\) = (-37.8101008,144.9634339, 5 m, 2 m)

  • d = (Melbourne Central, Melbourne CBD)

  • \(t_{s}\) = 2:39:20 pm AEST, Wednesday, 14 June 2017

  • \(t_{e}\) = 2:59:23 pm AEST, Wednesday, 14 June 2017

  • \(Q_{U}\) = (\(\$\) 0.0, 1600x1200, any)

The results of these experiments are evaluated against a traditional image processing technique using SIFT (Scale-Invariant Feature Transform) [14]. In the second part of the experiment, we set a baseline for comparison: all images are manually analysed by humans. We use a \(360^{\circ }\) structured image dataset I of the area of interest R, extracted from Google Maps Street View. The selection is achieved by similarity analysis between the SIFT features of the image set I and our experiment dataset. This is done by individually comparing the keypoint feature vectors of the images in I and the experiment dataset, and finding the images’ matching features based on the Euclidean distance of their feature vectors. To transform the matching keypoints into a scalar quantity, the percentage of keypoints that match the reference map is calculated [14], i.e., the number of matching keypoints (Number of mKP) divided by the total number of keypoints (Total number of KP) for each image [14]. For the purpose of the experiment, images are selected if the similarity percentage is above 80%.
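The keypoint-matching score of the SIFT baseline (Number of mKP divided by Total number of KP) can be sketched as follows; toy 2-D descriptors stand in for 128-D SIFT vectors, and the distance threshold is an assumption:

```python
import math

def match_ratio(desc_query, desc_ref, max_dist=0.5):
    # Fraction of query keypoints whose nearest reference descriptor
    # lies within max_dist (Euclidean), i.e. mKP / total KP.
    if not desc_query:
        return 0.0
    matched = 0
    for q in desc_query:
        nearest = min(math.dist(q, r) for r in desc_ref)
        if nearest <= max_dist:
            matched += 1
    return matched / len(desc_query)
```

An image would then be selected by the baseline when this ratio exceeds 0.8, mirroring the 80% threshold used in the experiment.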

All the experiments are implemented in Java and Matlab, and conducted on a Windows 7 desktop with a 2.40 GHz Intel Core i5 processor and 8 GB RAM.

6.2 Evaluation

We evaluate the proposed approach on the basis of (1) effectiveness in selecting related services (precision), (2) accurate and sufficient coverage of the user-required region (recall) and (3) the time taken to select related services (execution time). Precision and recall metrics are used to evaluate the proposed framework against the image processing approach. All images and selected services are manually analysed by humans to form a baseline for this experiment. We investigated how the precision and recall of the query results vary when applying the proposed approach in comparison to SIFT image processing. The experiments show that, in terms of precision and recall, the proposed approach performs slightly better than image processing. The reason is that the proposed approach focuses on event-based selection, whereas the image processing approach is location oriented. Our approach helps in better selection of images for scene reconstruction because it considers the related textual data that describes a situation from various aspects, e.g., what is happening, where it is happening, who is involved and what the effects on the surroundings are. The image processing approach, in contrast, is more location oriented because it selects images based on the similarity of surrounding landmarks rather than insight into the event being covered. Moreover, in terms of execution time efficiency, the experiments show that the time ratio between the proposed approach and image processing is 1:100. The results are depicted in Figs. 7 and 8 and Table 1.

Fig. 7.
figure 7

Precision

Fig. 8.
figure 8

Recall

Table 1. Execution time

7 Conclusion

In conclusion, this paper proposes a social-sensor cloud service selection framework based on spatio-temporal correlation, textual correlation and QoS parameters. We conducted experiments to evaluate the proposed framework in comparison to a traditional image processing approach. Experimental results show that our approach outperforms the traditional image processing approach. In future work, we plan to focus on the composition of social-sensor cloud services for fast visual summaries of scenes, for scene building and event analysis.