1 Introduction

E-commerce is becoming a very important business opportunity with high growth potential (e.g., by 2015, goods worth about $2,251 will be purchased through e-commerce) [1].

However, the success of e-commerce is closely linked to the customer experience of the purchasing process, starting from the product search. Accordingly, search engines play an important role in this context, as they are a powerful and simple tool that facilitates the purchase of products and helps consumers find what they want. Many efforts have been made to improve search methods and processes based on the linkage of websites with keywords [2], so that search engines have reached a good level of efficacy. However, as they operate through the semantic processing of the keywords typed by the user and/or according to specific search criteria set by the user (e.g., search filters), they do not allow users to find products that fall outside their sphere of knowledge.

To improve search engine effectiveness, other techniques are used to track user actions during browsing, with the aim of collecting information to identify the user profile. The most widespread of these relies on “profiling cookies”. Cookie profiling, also called web profiling [3], uses persistent or permanent cookies to track the user’s overall online activity. The information collected about users’ preferences is used to enhance the search engine by proposing other content (e.g., products, topics, etc.) related to what was previously searched. However, the massive use of these techniques may be considered detrimental to privacy and can negatively affect the consumer experience on the web. Moreover, they do not allow users to find products or services that they do not already know or have never searched for. Furthermore, the consumer often simply has a need and does not know which product or service can satisfy it.

A possible solution to this issue could be the introduction of a search engine able to guide customers in searching for the desired product or service according to their characteristics and needs. However, a strategy for managing several knowledge domains is needed to achieve this objective.

In particular, three essential aspects should be considered:

  1. Define a User Ontology (UO) able to properly conceptualize the user characteristics and needs;

  2. Define a Product Ontology (PO) by analysing existing products in order to cluster them according to their features, functionalities and application context;

  3. Define link properties able to act as a bridge between the user and product ontologies.

This research work aims to provide a method that supports the achievement of these points. As it is almost impossible to define ontologies that are independent of context, the described method is applied, by way of example, to the class of products known as Smart Objects, which are part of the Internet of Things (IoT) market.

2 Research Background

The Web has become the most important global means of communication, changing people’s everyday lives. However, the high number of Internet users and accessible Web pages makes it difficult for users to find documents that are relevant to their needs [4, 5]. Starting from a set of keywords, web search engines help people find the resources they are looking for [6]. Although they index much of the Web’s content, they often fail to select the pages that a user wants or needs. To address this issue, the Semantic Web, an extension of the Web that uses the standards of the World Wide Web Consortium (W3C), has been developed [7, 8]. It aims to identify the contextual meaning of terms and understand the searcher’s intent behind the query in order to generate accurate and relevant results [9]. Given these features, it has been adopted by several e-commerce websites.

To ensure an efficient search activity, well-structured knowledge, a consistent and flexible database, and proper information management are necessary. The goal is to enable more effective access to the knowledge contained in heterogeneous information environments, such as the web.

However, in order to meet the demands of extremely high query volume, search engines tend to avoid the personalization of information access [10]. In real life, though, different users may have different preferences. This raises two important challenges: the identification of the user context and the organization of the information in a way that matches that particular context.

Since the acquisition of user interests and preferences is an essential element in identifying the user context, it is a new challenge for user modeling [11, 12]. User modeling consists in the acquisition of the user’s information (e.g., goals, needs, moods, preferences, intentions, etc.) through an explicit [13] or implicit [14] process, and in the interpretation of the user’s actions by a software system [15]. The collection of user information used for service customization is the profiling process. In general, user modeling poses two main problems: first, understanding which user characteristics are most significant and need to be taken into account in a software system; second, investigating how to acquire the user information through a system [16]. The user modeling process can be described as a sequence of three steps taken during user interaction with the system [17]: (1) data collection, in which the system collects all the information concerning the user; (2) inference, in which the system processes the previously collected data and profiles the user by classifying his/her interests, preferences and objectives; (3) adaptation, in which the user model is actually applied to provide personalized experiences. As shown in [18], there are two types of user modeling approaches: behavioral-based and knowledge-based. The knowledge-based approach considers the level of the user’s knowledge, identified through questionnaires presented to users [19]. The behavioral-based approach instead considers the user’s behavior observed during the interaction with the system [18].

One of the first works to introduce the concept of an ontology-based user model was [20], which presented a generic architecture for Ontology-based User Modeling called OntobUM. Some of the most commonly used ontologies for modeling the user profile in the literature are the following. Gumo is an OWL ontology that is based on UserML and describes the important user dimensions, such as objectives, interests, knowledge, preferences, emotional state and personality. Furthermore, it can represent context information such as place and time, as well as information about emotional status and device preferences [21]. The FOAF ontology makes machine-readable the information most commonly found in users’ personal social network profiles, such as name, interests, phone number, etc. [22]. OPO is an ontology that aims to facilitate the exchange and integration of information about the online presence of users in different types of web applications [23]. The CUMO ontology aims to represent the cultural background of a user and to allow this information to be exchanged between different applications [24].

The definition of a user ontology supports the formalization of knowledge about consumers’ behaviors and expectations while they browse the web, enabling certain kinds of automated reasoning. In the same way, conceptualizing the entities of a specific domain of discourse and their interrelationships makes it possible to organize the information on the web so as to simplify the search activity. Focusing on the IoT domain, the large-scale deployment of devices and services, the information flows and the users involved foster the need for a common architecture, but make this challenge very arduous [25]. The breadth of the topic has led to the definition of several ontologies, which approach it from different points of view. They are collected in an online catalogue, Linked Open Vocabularies for Internet of Things (LOV4IoT) [26]. This dataset aims to support the building of Semantic Web of Things applications, the extraction of frequent terms used in existing ontologies, and the reuse and combination of domain ontologies by different stakeholders [27]. In particular, the SAREF ontology [28], which describes the Smart Appliances domain, was created to reduce their energy consumption. It aims to support their management at the system level to ensure interoperability. To this end, the connection between SAREF and the oneM2M architecture has been studied to facilitate communication between smart appliances and any remote application [29]. Komninos et al. addressed the limited effectiveness of smart city applications and sought to increase their problem-solving potential by proposing an overall ontology for the smart city [30]. To achieve good expressiveness without increasing complexity and processing time, Bermudez-Edo et al. created the IoT-Lite ontology, which describes the key IoT concepts that allow interoperability and discovery of sensory data in heterogeneous IoT platforms [31].

An analysis of the most relevant ontologies reveals the lack of a model that describes the IoT domain from another perspective: that of consumers. In particular, all data related to connected devices used in everyday life (e.g., household appliances, smartwatches, activity trackers, health monitoring devices, etc.) should be organized with the final aim of increasing users’ awareness of new technologies. Moreover, a link between the user ontologies and the IoT ontologies should be investigated in order to increase consumers’ satisfaction in the search, benchmarking and purchase activities while browsing the web. The present research work aims to address these challenges.

3 The Proposed Ontological Model

3.1 User Ontology (UO)

The User Ontology is constructed using a hybrid perspective. It starts from the representation of the user characteristics through incomplete descriptions of interests and preferences, following a stereotype-based approach. For each stereotype, characteristics and objectives are defined. Subsequently, following a feature-based approach, the dynamic aspects linked to changes in preferences and interests are modelled [32]. Using a top-down approach, the user profile category is divided into two subcategories: user goals and user characteristics.

User characteristics include three user information domains: demographic, technical and health-related (Fig. 1). Such information can be collected by the system during both the registration and search phase.

Fig. 1. Protégé view of the user characteristics

Demographic attributes concern the user’s personal and socio-economic characteristics. The user’s personal information includes age, gender, name, address, etc. The socio-economic characteristics comprise five classes and are important for determining the level of expertise the user has in performing the activities/tasks that are a prerequisite for using the product itself (e.g., swimming, using a smartphone, etc.). The classes are: (1) education (e.g., master, degree, etc.), (2) occupation (e.g., student, engineer, office worker, etc.), (3) family role (e.g., father, mother, etc.), (4) skill (e.g., cognitive, management, relational, personal effectiveness, etc.) and (5) conjugal status.

The behavioral attributes refer to the user’s preferences and interests, which are important for determining the user’s favorites (e.g., “loves cats”, “likes blue color” or “dislikes classical music”) and hobby or work-related interests (e.g., “interested in sports”, “interested in cooking”).

Health-related attributes aim to define the spectrum of abilities of the individual, according to his/her health status (e.g., motor abilities, sensorial abilities, etc.). In particular, the health information includes two different classes: Physical Factor (e.g., weight, height, etc.) and Physical Abilities & Disabilities, which is divided into Cognitive, Perceptual and Motor abilities. Cognitive Abilities include: perception, i.e., recognition and interpretation of sensory stimuli (smell, touch, hearing, etc.); attention, i.e., the ability to sustain concentration on a particular object, action, or thought, and to manage competing demands in the environment; memory, both short-term/working memory (limited storage) and long-term memory (unlimited storage); language, i.e., the skills that allow sounds to be translated into words and verbal output to be generated; visual and spatial processing, i.e., the ability to process incoming visual stimuli, to understand spatial relationships between objects, and to visualize images and scenarios; and executive functions, i.e., the abilities that enable goal-oriented behavior, such as the ability to plan and execute a goal. Perceptual Abilities include seeing, hearing, etc. Finally, Motor Abilities, according to Fleishman’s Taxonomy [33], include: control precision, rate control, aiming, response orientation, reaction time, manual and finger dexterity, arm-hand steadiness, and wrist and finger speed.

User profile models from the literature were also considered, and the concepts derived from them were appropriately adapted and included in the ontology. Information from bibliographic sources was used to select the basic set of upper-level classes.

User goals identify the possible reasons for which the user is searching for a product. This part of the UO is defined starting from the activity domain as codified by the International Classification of Functioning, Disability and Health (ICF) [34] (Fig. 2).

Fig. 2. Protégé view of the user goals
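The structure of the UO described in this section can be pictured with a small data-structure sketch. It is only an illustration under stated assumptions: the class and field names (`Demographic`, `Health`, `UserProfile`, `family_role`, and so on) are hypothetical and are not taken from the Protégé model, and plain Python dataclasses stand in for OWL classes and properties.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Demographic:
    """Personal and socio-economic characteristics (field names are hypothetical)."""
    age: Optional[int] = None
    gender: Optional[str] = None
    # The five socio-economic classes listed in the text.
    education: Optional[str] = None
    occupation: Optional[str] = None
    family_role: Optional[str] = None
    skills: List[str] = field(default_factory=list)
    conjugal_status: Optional[str] = None


@dataclass
class Health:
    """Physical factors plus cognitive, perceptual and motor abilities."""
    physical_factors: Dict[str, float] = field(default_factory=dict)
    cognitive_abilities: List[str] = field(default_factory=list)
    perceptual_abilities: List[str] = field(default_factory=list)
    motor_abilities: List[str] = field(default_factory=list)


@dataclass
class UserProfile:
    """Top-level split into user characteristics and user goals."""
    demographic: Demographic
    health: Health
    preferences: List[str] = field(default_factory=list)
    interests: List[str] = field(default_factory=list)
    goals: List[str] = field(default_factory=list)  # ICF-style activity labels


# Illustrative instance; all values are made up for the example.
profile = UserProfile(
    demographic=Demographic(age=34, occupation="engineer", skills=["management"]),
    health=Health(physical_factors={"weight_kg": 70.0}, motor_abilities=["reaction time"]),
    interests=["interested in sports"],
    goals=["Maintaining one's health"],
)
```

Optional fields default to empty values, which mirrors the idea that a profile may start from an incomplete description and be enriched during registration and search.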

3.2 Product Ontology (PO)

The Product Ontology is constructed with a bottom-up approach. In particular, it starts from the analysis of the characteristics of Smart Objects (SOs), i.e., everyday objects equipped with sensors, memory and communication capabilities [35].

First of all, the standard data common to various consumer products were gathered (e.g., producer, price, warranty, certifications, etc.). Subsequently, an in-depth analysis was carried out to identify the specifics of smart products.

In order to accomplish the task for which it was designed, the SO performs one or more functions. These have been classified into the following four groups: actuating (i.e., ability to transmit data to actuators to control a system remotely); event (i.e., ability to manage a particular scenario, for example by notifying another device that a certain threshold value has been exceeded); meter (i.e., ability to monitor the value of a specific attribute by getting data from a meter) and sensing (i.e., ability to detect a specific event). In relation to these functions, the parameters that can be controlled (i.e., status parameters such as on, off, open, close, etc.), measured (e.g., activity parameters such as steps, speed, sleep time, etc.; ambient parameters such as temperature, humidity, air quality, etc.; health parameters such as heart rate, blood pressure, SpO2, etc.) and detected (e.g., fall, gas leakage, intrusion, etc.) have been identified and classified.
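As an illustration of this classification, the four function groups and the parameter families can be written down as a small lookup structure. This is a sketch, not the ontology itself: the names `FunctionGroup`, `PARAMETERS` and `parameter_kind`, and the family label `"event"` used for detected parameters, are assumptions introduced here.

```python
from enum import Enum
from typing import Optional, Tuple


class FunctionGroup(Enum):
    """The four function groups named in the text."""
    ACTUATING = "actuating"
    EVENT = "event"
    METER = "meter"
    SENSING = "sensing"


# Example parameters from the text, grouped by how the SO handles them.
# The family label "event" for detected parameters is a hypothetical name.
PARAMETERS = {
    "controlled": {"status": ["on", "off", "open", "close"]},
    "measured": {
        "activity": ["steps", "speed", "sleep time"],
        "ambient": ["temperature", "humidity", "air quality"],
        "health": ["heart rate", "blood pressure", "SpO2"],
    },
    "detected": {"event": ["fall", "gas leakage", "intrusion"]},
}


def parameter_kind(name: str) -> Optional[Tuple[str, str]]:
    """Return (handling, family) for a known parameter name, else None."""
    for handling, families in PARAMETERS.items():
        for family, names in families.items():
            if name in names:
                return handling, family
    return None


kind = parameter_kind("heart rate")
```

Here `kind` is `("measured", "health")`, while an unlisted name such as `"altitude"` yields `None`.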

According to its nature, each SO is able to communicate with other devices or SOs, enabling a set of specific functions. To this end, the communication protocols (both wired and wireless), the compatible devices (smartphone, tablet, laptop, and desktop), the compatible operating systems (e.g., Android, iOS, Windows Phone, etc.) and the applications have been mapped.

In addition to the main function, the SO can offer a set of more general functionalities, grouped into eleven classes. They refer to geolocation (e.g., GPS); user management (e.g., multi-user, automatic user recognition, etc.); time and date (e.g., time of day, calendar, etc.); settings (e.g., wizard, customized thresholds, personalized scenarios, etc.); data management (e.g., export, share, etc.); the interaction with compatible devices (e.g., answer a phone call, read a message, play music, etc.); the interaction mode with the user (e.g., gesture, touch, voice over, etc.); power management (e.g., battery status indicator, power saving mode, etc.); user training in relation to the main function of the object (e.g., goals to reach, rewards, personal assistant, etc.); notifications (e.g., alerts, alarm, message received, incoming call, etc.) and add-ons (e.g., holter, plug and play, night mode, etc.). Different channels (e.g., audio, video, haptic) can be exploited by the SO to communicate information or give feedback to the user. For this purpose, the class related to the interaction mode has been created.

Another important aspect concerns the technical specifications that distinguish each SO: they include operating conditions (e.g., temperature range), languages, power requirements (e.g., source, autonomy, charge time, etc.), physical characteristics (e.g., dimensions, colour, material, etc.), the box content, the data management formats and warnings. Moreover, all the characteristics of the main components are considered, such as the display resolution, the memory capacity, the CPU frequency and the sensors (e.g., accelerometer, gyroscope, altimeter, etc.). The specifications related to the measured parameters are also taken into account: accuracy, range, resolution, unit, etc.

The unit-of-measure class collects the standards for measuring a quantity, which can be related both to parameters and to specifications. Furthermore, through the use of clustering algorithms, SOs have been classified into categories and sub-categories (e.g., health, sport, home, etc.) to define their domain of action.

Finally, a set of services has been identified according to the main function of existing SOs in order to simplify the matching with the users’ needs. The main goals for which these products are designed refer to the possibility of controlling, improving, monitoring, saving or managing some aspects of people’s daily life in several contexts (e.g., office, home, outdoor, etc.).

Figure 3 shows a partial representation of the Product Ontology. Some classes have not been expanded in order to preserve readability.

Fig. 3. Protégé view of product ontology
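To make the shape of a PO instance more concrete, a single SO could be recorded roughly as follows. This is a minimal sketch assuming a flat record: the `SmartObject` class, its field names and the example device are hypothetical and do not reproduce the class hierarchy of Fig. 3.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SmartObject:
    """Flat, illustrative record of one Smart Object (field names are hypothetical)."""
    name: str
    category: str                 # e.g. "health", "sport", "home"
    services: List[str]           # e.g. control, improve, monitor, save, manage
    functions: List[str]          # actuating, event, meter, sensing
    protocols: List[str] = field(default_factory=list)
    compatible_os: List[str] = field(default_factory=list)
    functionalities: Dict[str, List[str]] = field(default_factory=dict)
    specifications: Dict[str, str] = field(default_factory=dict)


def by_category(products: List[SmartObject], category: str) -> List[SmartObject]:
    """Select the products that belong to a given category."""
    return [p for p in products if p.category == category]


# Made-up example device, used only to show how the fields fit together.
tracker = SmartObject(
    name="hypothetical activity tracker",
    category="sport",
    services=["monitor"],
    functions=["meter", "sensing"],
    protocols=["Bluetooth"],
    compatible_os=["Android", "iOS"],
    functionalities={"notifications": ["alerts", "incoming call"]},
    specifications={"charge time": "2 h"},
)

sport_products = by_category([tracker], "sport")
```

A real implementation would more likely keep these as ontology individuals and query them through a reasoner; the flat record is used here only because it is easy to read.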

3.3 Definition of Rules to Enable Ontologies Connection

Once the User and Product Ontologies have been identified, the possible connections between the UO and the PO are defined; these make it possible to establish the relations between user profiles and products.

To ensure that the subcategories defined in the UO link with the subcategories defined in the PO, the subcategories related to User Goals must be defined with a bottom-up approach, so that user goals are specified on the basis of the functions/services offered by the products.

The resulting User-Product Ontology identifies the knowledge domain needed to find correlations between the user profile and the available products.

To define the proper links between the UO and the PO, several rules have been identified. Basically, the UO domains named User Goals (UG) and User Characteristics (UC) have been connected with three PO domains: Product Service (PS), Product Categories (PC) and Product Specification (PSp). In particular, the UG domain has been linked with the PS and PC domains, while the UC domain has been linked with the PS and PSp domains.

The links between the various domains have been detected using a two-step method. First, a knowledge base was built by considering the evident logical correlations that exist “a priori” between the two ontologies. Then, to discover other, less evident relationships and to assess the strength and utility of predictive relationships, appropriate training and test sets were used. In this way, the knowledge base has been refined and enriched (Fig. 4).

Fig. 4. The proposed ontological model

3.4 Example of Rules Defined to Connect the Two Ontologies

To define the a priori relationships, the possible links between the considered domains are analysed by constructing four relationship matrices, one for each of the four pairs of domains (i.e., UG-PS, UG-PC, UC-PS, UC-PSp). In particular, the logical links that certainly exist (or certainly do not) are defined and, consequently, a shortlist of uncertain correlations is identified. Figure 5 shows part of the relationships defined in this way between the UG and PC domains.

Fig. 5. Example of relationship identified “a priori” between UG and PC domains.
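One way to picture an a priori relationship matrix is as a mapping with three possible states per cell: certain link, certain not-link, and uncertain. The sketch below assumes the encoding 1.0, 0.0 and `None` for these states; the function names and the product category labels are hypothetical placeholders, not the entries of Fig. 5.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Pair = Tuple[str, str]  # (user goal, product category)


def build_matrix(
    goals: Iterable[str],
    categories: Iterable[str],
    certain_links: Set[Pair],
    certain_not_links: Set[Pair],
) -> Dict[Pair, Optional[float]]:
    """Encode each cell as 1.0 (certain link), 0.0 (certain not-link) or None (uncertain)."""
    categories = list(categories)
    matrix: Dict[Pair, Optional[float]] = {}
    for goal in goals:
        for category in categories:
            pair = (goal, category)
            if pair in certain_links:
                matrix[pair] = 1.0
            elif pair in certain_not_links:
                matrix[pair] = 0.0
            else:
                matrix[pair] = None
    return matrix


def uncertain_pairs(matrix: Dict[Pair, Optional[float]]) -> List[Pair]:
    """Shortlist of cells whose weight still has to be learned from users."""
    return [pair for pair, value in matrix.items() if value is None]


# Placeholder category names; only the goal labels come from the paper.
goals = ["Maintaining one's health", "Ensuring one's physical comfort"]
categories = ["health_subcategory_a", "home_subcategory_a"]
matrix = build_matrix(
    goals,
    categories,
    certain_links={("Maintaining one's health", "health_subcategory_a")},
    certain_not_links=set(),
)
shortlist = uncertain_pairs(matrix)
```

With these placeholder inputs, one cell is a certain link and the remaining three end up in the shortlist of uncertain correlations.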

For example, the goal “Maintaining one’s health” is linked “a priori” to all health sub-categories, while the goal “Ensuring one’s physical comfort” is not linked “a priori” with any product category. At this point, a learning procedure was performed to predict the relationships in the uncertain cases and to confirm the a priori assumptions. A sample of 100 users (average age 34 years, 40% female, familiar with technology) was involved to identify the probability associated with the uncertain links. The sample size was chosen according to the literature [36] to guarantee a low degree of variance.

In particular, the sample is distributed as follows:

  • A training set of 60 users to define the weight of linkages;

  • A validation set of 20 users to tune the parameters and improve performance;

  • A test set of 20 users to assess the performance of the presented linkage system, according to [37].
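The 60/20/20 partition can be sketched as follows. The paper does not state how users were assigned to each set, so the seeded random shuffle used here is an assumption, as are the function name `split_users` and the integer user identifiers.

```python
import random
from typing import List, Sequence, Tuple


def split_users(
    user_ids: Sequence[int],
    n_train: int = 60,
    n_val: int = 20,
    n_test: int = 20,
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Partition users into disjoint training, validation and test sets."""
    if len(user_ids) != n_train + n_val + n_test:
        raise ValueError("set sizes must add up to the number of users")
    shuffled = list(user_ids)
    # Assumption: random assignment with a fixed seed for reproducibility.
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


train_set, val_set, test_set = split_users(range(100))
```

Keeping the three sets disjoint is what allows the test users to give an unbiased judgement of the weights learned from the other 80 users.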

Once the data related to their characteristics (e.g., gender, age, etc.) had been collected and possible goals identified, according to the User Ontology architecture, the experimental protocol was carried out as follows:

  1. The first 60 users are asked to select the correlations between the proposed goals and the product categories shown in Fig. 5 (including the “a priori” certain links and not-links);

  2. The task is repeated with the validation set of 20 users in order to tune the percentage values;

  3. The last 20 users are asked to randomly select a target goal. After the goal has been selected, the associated list of product categories is proposed, sorted by percentage. The users are asked to give a score between 1 and 5 on the basis of the proposed selection and its ordering. This step makes it possible to determine the degree of satisfaction with the possible results for a given user goal.
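The first and third steps of the protocol can be sketched in code under a simple assumption: the weight of a link is taken to be the fraction of users who selected it, and the categories proposed for a goal are those with a non-zero weight, sorted in descending order. The paper does not give the exact estimator or the tuning rule applied with the validation set, so `estimate_weights`, `ranked_categories` and the toy selections below are purely illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]  # (user goal, product category)


def estimate_weights(selections: List[Set[Pair]], pairs: Iterable[Pair]) -> Dict[Pair, float]:
    """Weight of a link = share of users who selected it (assumed estimator)."""
    n_users = len(selections)
    if n_users == 0:
        raise ValueError("at least one user is required")
    counts = Counter(pair for chosen in selections for pair in chosen)
    return {pair: counts[pair] / n_users for pair in pairs}


def ranked_categories(weights: Dict[Pair, float], goal: str) -> List[Tuple[str, float]]:
    """Categories linked to a goal, sorted by descending weight."""
    linked = [(cat, w) for (g, cat), w in weights.items() if g == goal and w > 0]
    return sorted(linked, key=lambda item: item[1], reverse=True)


# Toy data: four hypothetical users and three candidate categories.
goal = "Managing diet and fitness"
pairs = [(goal, "sport"), (goal, "health"), (goal, "home")]
selections = [
    {(goal, "sport"), (goal, "health")},
    {(goal, "sport")},
    {(goal, "sport"), (goal, "health")},
    {(goal, "sport")},
]
weights = estimate_weights(selections, pairs)
ranking = ranked_categories(weights, goal)
```

In this toy case all four users select the sport link and two select the health link, so the ranking proposed to a test user would list sport first and health second, with home omitted.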

Figure 6 shows the result of the first two phases: the percentage values of the uncertain links are determined and the “a priori” choices are confirmed (99.48% matching for certain links, with a variance of 0.25; 99.67% matching for not-links, with a variance of 0.78). As the assumptions made in advance were confirmed, the probabilities for certain links and not-links are approximated to their reference values (Certain Link = 1, Not-Link = 0) to reduce future computation.

Fig. 6. Example of relationship identified by training set between UG and PC domains
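The approximation of confirmed links to their reference values can be expressed as a small post-processing step. The sketch assumes the same encoding as above for a priori values (1.0, 0.0, or `None` when uncertain); the function name `snap_certain_links` and the numbers in the example are hypothetical.

```python
from typing import Dict, Optional, Tuple

Pair = Tuple[str, str]  # (user goal, product category)


def snap_certain_links(
    learned: Dict[Pair, float],
    a_priori: Dict[Pair, Optional[float]],
) -> Dict[Pair, float]:
    """Use the reference value (1.0 or 0.0) where the a priori state is certain,
    and keep the learned weight where it was uncertain."""
    snapped: Dict[Pair, float] = {}
    for pair, weight in learned.items():
        reference = a_priori.get(pair)
        snapped[pair] = weight if reference is None else reference
    return snapped


# Made-up weights for one hypothetical goal "g".
learned = {("g", "health"): 0.99, ("g", "home"): 0.01, ("g", "sport"): 0.62}
a_priori = {("g", "health"): 1.0, ("g", "home"): 0.0, ("g", "sport"): None}
final_weights = snap_certain_links(learned, a_priori)
```

After this step only the uncertain links carry fractional weights, which is what keeps later computations lighter.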

The last phase produced the data shown in Fig. 7. The evaluation data are normalized to the maximum score (5) to obtain percentage values. To guarantee a high degree of accuracy, a selective threshold is set at 80%: as the chart shows, twelve goals exceed it, while two goals are critical. Crawling and Climbing fail to meet the requirement (with 73% and 76%, respectively). This failure may be due to the activities these goals involve: being uncommon activities, they could lead to very different perceptions among similar users.

Fig. 7. Satisfaction of test users with the results proposed according to a given goal.

On the other hand, the goals with the highest rates are in the field of Self Care (Managing diet and fitness with 87% and Maintaining one’s health with 90%): this finding supports the previous statement about user perceptions of common and uncommon action domains.
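The normalization and threshold check used in this evaluation can be sketched as follows. It is assumed here that the percentage for a goal is the mean of the 1 to 5 scores divided by the maximum score, and that a goal is critical when it falls below the 80% threshold; the paper does not spell out the aggregation, and the ratings below are invented for the example rather than taken from Fig. 7.

```python
from statistics import mean
from typing import Dict, List

MAX_SCORE = 5
THRESHOLD = 80.0  # percent, as stated in the text


def satisfaction_percent(scores: List[int]) -> float:
    """Mean score normalized to the maximum score, as a percentage (assumed aggregation)."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(not 1 <= s <= MAX_SCORE for s in scores):
        raise ValueError("scores must lie between 1 and 5")
    return 100.0 * mean(scores) / MAX_SCORE


def critical_goals(scores_by_goal: Dict[str, List[int]], threshold: float = THRESHOLD) -> List[str]:
    """Goals whose normalized satisfaction falls below the threshold."""
    return sorted(
        goal for goal, scores in scores_by_goal.items()
        if satisfaction_percent(scores) < threshold
    )


# Invented ratings for two hypothetical goals.
ratings = {"goal_a": [5, 4, 5, 4], "goal_b": [4, 3, 4, 3]}
flagged = critical_goals(ratings)
```

With these invented ratings, `goal_a` reaches 90% and passes, while `goal_b` reaches 70% and is flagged as critical.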

4 Conclusion and Future Works

In this paper, an approach for defining a new ontological model that merges the knowledge of the user and product domains is proposed. After a deep analysis of existing ontology models of these two domains, an overall model connecting user characteristics and goals with product features is developed. For the user model, a structured ontology is defined according to the major reference models available in the literature. On the other hand, since a user-oriented product ontology was found to be lacking, a custom model is developed, taking a cue from existing domain ontologies.

To test the soundness of this approach, an experimental evaluation is carried out with 100 users, split into a training and validation set (80) and a test set (20). The evaluation shows two main outcomes:

  • The “a priori” assumptions are confirmed in the training step;

  • The “weights” derived from the training session have shown a good accuracy, except for two critical goals.

The results show the need to improve the training phase in order to tune the weight parameters, resolve the observed issues and create a knowledge model able to keep a high degree of accuracy and reliability.

This work is a deep revision of the ontological model marginally presented in [38]. In this context, such a model can be used to build a user-oriented search engine able to propose to customers the results that best fit their needs and goals, exploiting the user information from the cookie profile, avoiding intrusive advertising banners, and ultimately improving the overall user experience in the search activity.

Future work will include a deep integration with the cited search engine system and a subsequent user validation in a web-based application.