Estimating a latent-class user model for travel recommender systems

Arentze, Theo; Kemperman, Astrid; Aksenov, Petr

doi:10.1007/s40558-018-0105-z

Original Research
Open access
Published: 02 February 2018

Volume 19, pages 61–82, (2018)

Information Technology & Tourism

Abstract

In determining the selection of sites to visit on a trip, tourists have to trade off attraction values against routing and time-use characteristics of points of interest (POIs). For recommending optimal personalized travel plans, an accurate assessment of how users make these trade-offs is important. In this paper we report the results of a study conducted to estimate a user model for travel recommender systems. The proposed model is part of c-Space, a tour-recommender system for tourists on a city trip which uses the LATUS algorithm to find personalized optimal tours. The model takes into account a multi-attribute utility function of POIs as well as dynamic needs of persons on a trip. A stated choice experiment is designed where the current need is manipulated as a context variable and activity choice alternatives are varied. A random sample of 316 individuals participated in the on-line survey. A latent-class analysis shows that significant differences exist between tourists in terms of how they make the trade-offs between the factors and respond to needs. The estimation results provide the parameters of a multi-class user model that can be used for travel recommender systems.

1 Introduction

With the advancement of information and communication technologies (ICT) the development and use of recommender systems that can offer tourists personalized advice and recommendation on which activities to conduct at a destination has received increasing attention (e.g., Buhalis 1998; Buhalis and Law 2008; Mackay and Vogt 2012; Steen Jacobsen and Munar 2012). A typical user of a travel recommender system is a tourist who is interested in exploring a city and wants to make a tour around (e.g., Yang and Hwang 2013; Borras et al. 2014). Such a tour comprises a scheduled list of attractions (museums, heritage sites, shops, parks or other points of specific interest) as well as the trips needed to travel from one point to the other (e.g., Gretzel et al. 2004; Gavalas et al. 2014).

Travel Recommender Systems (TRSs) help to overcome the information load tourists may experience when they search for options, by providing users selected items that match their personal preferences (Braunhofer et al. 2015). For this a critical element of TRSs is the ability to acquire the relevant information about preferences and needs of the user and identify the POIs that match his or her interests. A number of alternative methods have been proposed to tackle this problem. These can be classified as collaborative filtering (matching a user to other users that have similar interests and preferences), content-based filtering (matching based on attributes of POIs) and knowledge-based methods (e.g., case-based reasoning). An overview of techniques in this area can be found in Hanani et al. (2001) and Adomavicius and Tuzhilin (2005).

Across these approaches, the user preferences to be predicted are often formulated as ratings assigned to items (POIs) that reflect how much one likes the product or service. For determining an optimal tour, however, users have to trade off their interests in certain POIs against other considerations such as travel costs (the time and effort it takes to reach the location), fee or entrance costs, and preferred allocation of time across activities. Furthermore, individuals’ preferences may depend on needs that change depending on previous activities. Such dynamic needs give rise to saturation effects and variety seeking (Arentze and Timmermans 2009). If multiple activities have to be combined on a trip, the way a user makes trade-offs between these considerations determines overall preferences for selections of POIs. Thus, in the context of tour planning, the selection of POIs is a multi-criteria decision problem. Here, individual travelers may differ in the weights they assign to these components in determining their preference.

Although the multi-criteria nature of preferences for tours is widely acknowledged in advanced trip planners for ordinary travel (Kerkman et al. 2012), it has received limited attention in user models of TRSs. In the present study, we present a method to estimate tourists’ preferences taking into account the various factors involved in city trip planning. In this method, the preference value or utility for including a certain POI in a tour is modeled as a function of attributes of POIs. The utility function is estimated using a stated choice experiment administered in a survey. The estimated utility function defines a user model that allows a TRS to compose an optimal tour given personal information about specific interests of an individual user. We design a stated choice experiment that allows the estimation of the relevant parameters and present the results of an application involving a large sample of individuals from a national on-line panel. Individual tourists may differ in terms of the way they make the trade-offs. To account for heterogeneity among individuals and identify the extent to which preferences may differ, we estimate a latent-class model.

The method we propose is developed in the context of the c-Space TRS for city trips (Aksenov et al. 2014, 2016). A special characteristic of the c-Space system is that it takes dynamic needs into account by using an advanced algorithm, called LATUS, to find personalized optimal tours (Arentze 2015). In the context of the c-Space system, the estimates are used to define an initial user profile that can be adapted if more information about a user’s preferences becomes available. The recommender system and the LATUS algorithm have been described in earlier work as referenced above. In this study, we briefly explain the system and present the proposed method to estimate user preference profiles. The results of this study also provide substantive insights into tourists’ preferences for visiting POIs in city tours.

The rest of the paper is structured as follows. First, in the next section we will review the existing approaches in the field of TRS with respect to user modeling. Then, in Sect. 3, we briefly describe the c-Space system and LATUS algorithm to offer a system concept for the user model. Then, in Sect. 4, we describe the stated choice experiment and survey method. In Sect. 5, we present the results of the survey and estimation of the latent-class model. Finally, we conclude the paper with a discussion of major conclusions and directions for future research.

2 Related work

The core component of TRSs is a (filtering) algorithm to select from an exhaustive database the POIs that match a user’s preferences. Collaborative filtering is a much used technique in TRSs. In this technique, personal background or history information about a user is used to identify users with similar characteristics whose preferences are known. Preferences are typically represented in the form of ratings assigned to POIs. The average rating assigned by previous similar users is used as a best estimate of the preferences of the user the system is interacting with. The definition of similarity is a critical component in this process. If ratings of the user are already known from previous interactions with the system, similarity can be measured based on matching ratings. If such history information is not available, then similarity may be defined based on known demographic data of users such as age, gender and education. An alternative to collaborative filtering is content-based filtering. In a content-based approach, items are recommended that have the same attributes as the items that the user has liked before (Neidhardt et al. 2015).

A generally acknowledged problem with the filtering methods is the so-called cold-start problem. This problem occurs when requests come from new users who have not yet submitted any ratings or concern new items which have not been evaluated before (the first-rater problem) (Fonte et al. 2013). Knowledge-based systems have been proposed where preferences are derived based on reasoning about user requirements that go beyond a simple matching of ratings. A well-known example of a knowledge-based technique is case-based reasoning (Fonte et al. 2013).

The new user problem has also received attention in so-called Context Aware Recommender Systems (CARS). These systems emphasize that users’ preferences are dependent on contextual conditions and, hence, that recommendations should be context dependent. In tourism choice, weather conditions (sunny or rainy, etc.), travel party (alone or traveling with others) and travel mode (e.g., transport mode) are influential contextual conditions. Braunhofer and Ricci (2017) report the results of a survey conducted to identify important context factors and estimate the influence of these factors on rating predictions in the context of TRSs. Also, the role of emotion and personality traits has received attention as context factors in CARS. In a survey conducted to elicit tourists’ preferences, Neidhardt et al. (2015) use a picture-based approach to address preferences on an emotional level. Braunhofer et al. (2015) show that personality traits of the Big-5 model provide useful information for generating context-aware recommendations. They argue that personality trait data are relatively easy to collect and especially useful for ranking the recommendations in case of new users.

TRSs have gone further than recommending POIs in isolation. Recommendation of complete packages is relevant for tourists who want to plan a tour combining visits to several POIs on the same trip, e.g., a day-tour in a city. Many systems have considered this extended problem of recommending routes (for a review see Wörndl and Hefele 2016). In planning a route, preferences related to interests in POIs need to be combined with other characteristics of POIs such as estimated visit times, travel distance and costs (fee or entrance). As Wörndl and Hefele (2016) state:

“the process of generating a path from a start to an end point with interesting POIs along the way can be split up into two subtasks. First, potential candidate places have to be determined and scored, and then a path finding algorithm need to generate the best route consisting of a subset of these places.”

An example is the image-based system MoreTourism (Linaza et al. 2011). This system first elicits a user’s preferences and next recommends the POIs that have the highest utility and an optimal route taking into account estimated visit times, open and close times, and costs.

In this study, we consider TRSs that have the objective to recommend complete tours. Finding an optimal tour requires that POI rating scores are traded off against travel time, entrance costs and time-use characteristics of POIs. The purpose of the present study is to empirically assess the way individuals make these trade-offs. We model individuals’ preferences for POIs in the context of a tour as a multi-attribute utility function and estimate the utility weights in the framework of a discrete choice model. The influence of context conditions is taken into account to allow context-aware recommendation. Stated preference data from a representative sample of individuals are collected in an on-line survey. Using a latent-class model, the estimation of preference parameters and clustering of individuals regarding the preferences they display are performed simultaneously. Thus, the estimation results also indicate the extent to which preferences differ between individuals. In the next section, we will first briefly introduce the c-Space TRS.

3 The c-Space system

To formulate the multi-attribute utility function, the c-Space TRS (Aksenov et al. 2016) is the point of departure. c-Space generates personalized tours taking into account a user’s personal thematic interests in particular POIs (architecture, cathedrals, museums, etc.) as well as the weights he or she assigns to a set of basic leisure needs (relaxation, entertainment, new experiences, socializing, etc.). c-Space has been developed as a smartphone application wherein the recommendation functionality is integrated as a REST service (Simoes et al. 2015). Thematic interests and needs as well as time budget and travel constraints are retrieved in a dialogue with the user on the smartphone. Figure 1 shows an example of a dialogue. The resulting user profile is input to the LATUS algorithm together with utility weights of attributes of POIs. The recommended tour, including travel plans to reach the various locations, is displayed on a map of the city (Fig. 2).

In c-Space, location and attribute data about the available POIs in the city of interest are stored in a database. The attribute data stored include general information, such as opening hours and ticket costs (entrance fee), as well as information specifically collected for the c-Space system. The specific information includes the recommended duration of a visit to the POI (in hour units), attraction value (popularity) and theme (subject). The specific information is provided by experts from the local tourist agency. A special part of the POI data consists of parameters, one for each need, that indicate the extent to which visiting the POI matches needs on a zero–one scale (zero indicating no match and one complete match). These parameters are determined based on rule-based knowledge of the types of activities involved in a POI (e.g., a museum can satisfy a need for new experiences to a large extent and a need for physical exercise to a small extent; a botanic garden matches a need to be outdoor to a large extent and a need for entertainment to a small extent, etc.). The degree of match also determines the size of the impact the POI has on the need (e.g., a museum reduces the need of new experiences to a large extent).

The POI database and personal profile of the user provide the information for determining utility scores of POIs. The utility score of a POI is determined as a function of the match of the POI with the interests of the person, the attraction value of the POI, the match of the POI with current needs, the travel required (geographical distance) and the monetary costs involved in visiting the POI. A POI matches the interests, if the theme of the POI corresponds with a theme the user has indicated to be of interest to him or her. Match with current needs is determined based on the POI-need-matching parameters, the weights the user assigns to the different needs and an assessment of the current size of each need. The sum across weighted need impacts determines the utility score regarding the match with needs (see Arentze 2015 for details).

Due to the dynamic nature of needs, the utility function is dynamic. That is to say, the utility of a POI is dependent on other POIs included in the tour due to the impacts activities have on dynamic needs of the traveler (e.g., a museum will be less attractive if current POIs in the program have already reduced the need for new experiences). Therefore, each time a POI is added to the evolving program, the state of the needs is updated before the next POI is considered. The LATUS algorithm is designed to determine the optimal selection of POIs taking into account these interactions between POIs on the overall utility.

LATUS starts with an empty program and successively adds POIs selected from a list of optional POIs until the time budget is fully used or no utility can be added anymore. The problem of finding the optimal tour is split in two parts: (1) determining the program by selecting POIs and (2) determining the sequence in which the POIs are visited in the tour and the travel routes between POIs. The optimum sequence is defined as the sequence that minimizes the overall travel costs and is found by means of a heuristic method. To find the selection that maximizes the utility of the tour, LATUS uses a heuristic method. The method is schematically shown in Fig. 3. In this method, the best POI to add is identified as the POI that meets a time-use requirement and maximizes the added utility. The time-use requirement is defined as a threshold level of the utility per unit time taking into account the time to reach the location and the (normal) visiting duration. The threshold level is a parameter set by the system that should reflect the time budget. The more time available the lower the threshold can be set and, vice versa, the tighter the budget the higher the threshold needs to be. Since the proper level of the threshold cannot be computed analytically, LATUS uses a trial-and-error method to find the proper threshold level in a pre-processing step. Starting with a best-guess initial value it increases the threshold when the resulting selection exceeds the budget and lowers the value when time is left in the budget. This heuristic appears to be very powerful in finding the optimal (highest utility) tours (Arentze and Timmermans 2009; Arentze et al. 2010).
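The selection step described above can be illustrated with a minimal Python sketch. It is not the LATUS implementation: the `Poi` class, the multiplicative need-update rule, the collapsing of travel and visit time into a single `hours` value, and the use of bisection in place of the trial-and-error threshold search are all assumptions made for illustration, and the POI values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Poi:
    name: str
    base_utility: float  # static part (interest, attraction, cost); hypothetical units
    hours: float         # travel plus visit time, collapsed into one number (assumption)
    need_match: dict     # need name -> match on a zero-one scale


def utility(poi, needs, weights):
    # Static utility plus the sum of weighted need impacts at the current need state.
    return poi.base_utility + sum(
        weights[n] * needs[n] * m for n, m in poi.need_match.items()
    )


def build_program(pois, needs, weights, threshold):
    """Greedily add the POI with the highest added utility among those whose
    utility per hour meets the threshold; update needs after every addition."""
    needs = dict(needs)  # work on a copy so the caller's state is untouched
    remaining, program = list(pois), []
    while remaining:
        scored = [(utility(p, needs, weights), p) for p in remaining]
        eligible = [(u, p) for u, p in scored if u > 0 and u / p.hours >= threshold]
        if not eligible:
            break
        _, best = max(eligible, key=lambda pair: pair[0])
        program.append(best)
        remaining.remove(best)
        for n, m in best.need_match.items():
            needs[n] *= 1.0 - m  # assumed rule: a visit reduces the matched need
    return program


def calibrate_program(pois, needs, weights, budget_hours, upper=10.0, steps=40):
    """Search for the lowest threshold whose program fits the time budget.
    Bisection is used here as a stand-in for the trial-and-error search."""
    lower, best = 0.0, []
    for _ in range(steps):
        threshold = (lower + upper) / 2.0
        program = build_program(pois, needs, weights, threshold)
        if sum(p.hours for p in program) > budget_hours:
            lower = threshold                 # over budget: raise the threshold
        else:
            best, upper = program, threshold  # fits: try a lower threshold
    return best


pois = [
    Poi("museum", 2.0, 2.0, {"new": 0.8}),
    Poi("gallery", 1.5, 1.5, {"new": 0.7}),
    Poi("park", 1.0, 1.0, {"outdoor": 0.9}),
    Poi("cafe", 0.5, 1.0, {"relax": 0.6}),
]
needs = {"new": 1.0, "outdoor": 1.0, "relax": 1.0}
weights = {"new": 1.0, "outdoor": 1.0, "relax": 1.0}
tour = calibrate_program(pois, needs, weights, budget_hours=4.0)
print([p.name for p in tour])
```

In this toy setting the gallery drops out once the museum has reduced the need for new experiences, which is the kind of interaction between POIs that the dynamic utility function is meant to capture. Sequencing of the selected POIs, the second part of the problem, is not covered by the sketch.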

The model estimation in this study provides the utility-weights of the user profile. The intended contribution of the present study is to show how utility weights for this class of TRSs can be estimated and segmented. In the sections that follow, we describe the design of the choice experiment, the survey and the results of the latent-class-model analysis.

4 Methodology

In this section we describe the methodology used in the present study. The core elements of the methodology are a stated choice experiment used to collect data about preferences of tourists on city trips and a latent-class model to estimate preference parameters. Before explaining these elements we will first discuss the underlying behavioral assumptions.

4.1 Behavioral assumptions

Although our point of departure is the c-Space TRS, our purpose is to derive a user model that is relevant more broadly for TRSs that are focused on recommending tours. Therefore, in this section we highlight the theoretical considerations that have led to the model specification used in c-Space.

4.1.1 Attributes of POIs

The proposed user model assumes that a tourist’s preferences for selecting POIs in the context of a city tour depend on a number of attributes. First, the general attraction value of the point of interest is relevant, that is, the extent to which the point of interest is special, worth a special trip, or even the primary reason to visit the city (e.g., Ashworth and Page 2011; Yeh and Cheng 2015). For example, in many travel guide books some kind of rating system is used to distinguish a top attraction from attractions of less importance (e.g., classification according to the Michelin stars: * of interest, ** worth a detour, *** worth the trip). Second, the extent to which the point of interest matches a person’s personal interest in particular objects/themes is a consideration. For example, some people may be fascinated by cathedrals whereas others find them boring. Third, options may vary in terms of the extent to which the activity matches a current emotional or motivational state given the activities a person has already conducted on the same (city) trip (e.g., Lin et al. 2014; Ma et al. 2013). For example, if all previous activities conducted so far have been indoors, the person may prefer to conduct the next activity in the open air. Fourth, accessibility and costs considerations may play a role: options may differ in terms of the effort (e.g., travel time) it takes to travel to the location or the fee one needs to pay to visit a site (e.g., Armbrecht 2014; Lew and McKercher 2006; Wynen 2013).

4.1.2 Dynamic needs

The choice of an activity generally involves a trade-off between these considerations. By their nature, attraction value, personal interest, effort and costs are static attributes, as the evaluation of these attributes does not depend on a momentary state of the person. In contrast, the extent to which visiting the POI meets the current needs of the tourist is inherently time-dependent. Although mood (and emotion) is also a relevant dimension in this regard (e.g., Wang et al. 2012), we focus here on basic needs. We adopt a classification of basic leisure needs that emerged in the empirical study by Nijland et al. (2010). Based on an analysis of motivations underlying leisure activities, the authors identified 6 need dimensions: new experiences/information; entertainment; relaxation; being in open air/green environment; physical exercise and social contact.

Individuals may differ in terms of how strongly these needs are felt or valued. Some may develop a need for entertainment more quickly while others may be more sensitive to new experiences, and so on. Such differences may be related to a personal trait (e.g., thrill seeking) (Schneider and Vogt 2012) but may also be affected by the nature of the primary activity (the job or occupation) of the person in daily life. For example, a person who has a hectic job in daily life may be inclined to seek relaxation in leisure activities instead of new experiences or socializing.

4.2 Design of the choice experiment

To estimate tourists’ preference parameters regarding activity choice during a city trip, we use the technique of the stated choice experiment (also known as conjoint analysis) (e.g., Hensher et al. 2015). In this technique, individuals are presented with a choice task where they are asked to indicate their preference among a set of choice options (a choice set). The choice options are hypothetical and described in terms of a set of attributes. The attributes and the values each attribute can take are pre-defined as part of the experimental design. Across choice tasks, the attributes are varied based on a statistical design so that the separate (utility) effects of the attributes can be identified through statistical analysis of the obtained choice data.

In the experiment we constructed, respondents are asked to imagine the following hypothetical situation:

Imagine that you are going to make a city trip to a city you do not know yet. It is a safe, not too crowded and well accessible city. You are traveling together with a person (e.g., partner, adult–child, friend, other family member) who has the same interests as you have. There is much to see and to do in the city that is worthwhile and for sure you will not have enough time if you would want to see and do all. Furthermore, it is good weather for visiting the city.

Next, choice tasks are presented to respondents where the context setting for the trip and choice alternatives are varied simultaneously. The context setting for the trip is varied in terms of the following attributes:

Total duration of the city trip (one afternoon, 1 day, 2 days).
The time moment of doing the activity in the context of the trip (first activity, in-between activity, last activity).
The size and nature of the current need (size: strong and very strong; nature: new experience, entertainment, relaxation, exercise, open air—green environment, socializing, no specific need).

An activity consistently involves visiting a particular POI. The manipulation of needs (the last item) is a key element of this experiment. To avoid needless complexity, it is assumed that a need exists on only one dimension at a time (combinations of needs are not considered). To include a null measurement, the absence of a need is included as a possible level as well; hence this variable has seven levels. The size of the need (if any) has two possible levels—strong and very strong. Literally, the need condition is formulated as:

At this moment you have [size] need for [dimension]

The choice alternatives are optional POIs; they are varied in the following way on the following attributes:

Attraction value of the POI (one star, two stars, three stars).
Extent to which the POI meets the person’s interests (very low, average, very much).
Extent to which the POI fulfils the person’s current need (very low, average, very much).
The costs of visiting the POI (free, 5 € pp, 10 € pp).
Travel time to reach the POI from the person’s current location (on the route, 10 min walking, 20 min walking).

As said, number of stars is an often used labeling system to indicate attraction value in tourist guides and, therefore, is used here.

We use separate designs for varying the contexts and choice alternatives. For the context we use a design in nine profiles. The nine profiles are a fraction of a full factorial design of 3² × 2 profiles. The fraction of nine profiles allows estimation of all main effects independently of all first-order interaction effects between attributes. Next, we combine the nine profiles with the seven needs (including ‘no specific need’), resulting in 63 different contexts. To design the activity alternatives, we use a design in 27 profiles. The 27 profiles are a fraction of a full-factorial design consisting of 3⁵ profiles. Just as in the case of the design for contexts, this fraction allows the estimation of main effects of attributes independently of all first-order interaction effects. Each respondent is presented with nine choice tasks that are generated by randomly selecting nine context profiles; for each context, a choice set of three randomly selected POI profiles is presented. Given the specific context setting, the respondent is asked which POI he/she would prefer, or to select the base alternative, which is taking a break (not doing any specific activity at the moment).
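The random assembly of choice tasks per respondent can be sketched as follows. The sketch works with profile indices only, since the actual fractional factorial profiles are not listed here. Drawing contexts from the 63 combined contexts and sampling without replacement are assumptions; the function name `draw_choice_tasks` is hypothetical.

```python
import random

N_CONTEXTS = 9 * 7         # nine context profiles combined with seven need levels
N_POI_PROFILES = 27        # fraction of the 3^5 full factorial
TASKS_PER_RESPONDENT = 9
POIS_PER_TASK = 3


def draw_choice_tasks(rng):
    """Return nine choice tasks, each with one context index, three POI
    profile indices and the base alternative 'break'."""
    # Assumption: contexts and POI profiles are sampled without replacement.
    contexts = rng.sample(range(N_CONTEXTS), TASKS_PER_RESPONDENT)
    tasks = []
    for context in contexts:
        poi_profiles = rng.sample(range(N_POI_PROFILES), POIS_PER_TASK)
        tasks.append({"context": context, "alternatives": poi_profiles + ["break"]})
    return tasks


tasks = draw_choice_tasks(random.Random(2018))
for task in tasks[:2]:
    print(task)
```

Each task thus offers four alternatives, which is the choice-set size the choice model has to accommodate.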

4.3 Latent class model

A latent class model is used to segment the respondents regarding their city trip activity preferences (e.g., Swait 1994; Boxall and Adamowicz 2002; Greene and Hensher 2002). In the estimation respondents are simultaneously grouped into segments (or latent classes) and separate parameters are estimated for each of these segments. In our study, we assume that individuals derive some utility from choosing a specific POI during their city trip. This utility can vary between different POIs based on the attributes describing the context and the POI itself. For the usual multinomial logit model (MNL), the utility for individual i for POI j on choice occasion t can be written as:

$$U_{ijt} = \beta^{'} X_{ijt} + \varepsilon_{ijt} ,$$

where X_ijt expresses all attributes (defining context and POI) with relative weights (parameters β′) to be estimated. ɛ_ijt is an error term representing unobserved heterogeneity in utilities. This equation assumes that the parameters are the same for all individuals. However, we assume that there exist S different homogeneous latent classes (segments) in the sample. Given that an individual belongs to latent class s (s = 1, …, S), the utility for individual i belonging to class s for activity j on choice occasion t is defined as:

$$U_{ijt} = \beta_{s}^{'} X_{ijt} + \varepsilon_{ijt} ,$$

where $\beta_{s}^{'}$ is a parameter vector for each latent class s. The probabilities of choice can be derived from the utility function, resulting in the latent class multinomial model (LCM). For each latent class, the probability that individual i chooses POI j at choice occasion t is:

$$P(y_{it} = j \mid \text{segment} = s) = \frac{\exp(\beta_{s}^{'} X_{ijt})}{\sum_{k = 1}^{J_{i}} \exp(\beta_{s}^{'} X_{ikt})}.$$

For each individual i the probability of belonging to latent class s can be obtained by:

$$P(\text{segment} = s) = \frac{\exp(\theta_{s}^{'} Z_{i})}{\sum_{r = 1}^{S} \exp(\theta_{r}^{'} Z_{i})},$$

where Z_i is an optional set of observable characteristics invariant of the individual choice situation. If no such characteristics are included, the class specific probabilities are a set of fixed constants that sum to one. Each individual is assigned to the latent class with the highest probability.
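As an illustration of how these probabilities combine, the sketch below computes class-conditional choice probabilities and, via Bayes' rule, the probability of class membership given a respondent's observed choices. It assumes fixed class shares (no Z_i) and assumes that assignment uses these posterior probabilities; the function names and the one-attribute example values are hypothetical.

```python
import math


def softmax(values):
    # Subtract the maximum for numerical stability before exponentiating.
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def choice_probabilities(beta, alternatives):
    """MNL probabilities for one choice set under one class's parameter vector."""
    utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
    return softmax(utilities)


def class_posteriors(betas, shares, choice_sets, chosen):
    """Posterior class membership given the sequence of observed choices."""
    joint = []
    for beta, share in zip(betas, shares):
        likelihood = share
        for alternatives, j in zip(choice_sets, chosen):
            likelihood *= choice_probabilities(beta, alternatives)[j]
        joint.append(likelihood)
    total = sum(joint)
    return [value / total for value in joint]


# Two hypothetical classes with opposite tastes for a single attribute.
betas = [[1.0], [-1.0]]
shares = [0.5, 0.5]
choice_sets = [[[1.0], [0.0]]]  # one task, two alternatives, one attribute
chosen = [0]                    # the respondent picked the first alternative

posterior = class_posteriors(betas, shares, choice_sets, chosen)
assigned = max(range(len(posterior)), key=posterior.__getitem__)
print(posterior, assigned)
```

With more choice tasks per respondent the product of choice probabilities sharpens the posterior, so assignment to the class with the highest probability becomes less ambiguous.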

The latent class parameters can be estimated using maximum likelihood estimation (see Greene 2001 for details). The likelihood ratio test statistic [G² = − 2 (LL(0) − LL(B))] is used to test whether the estimated choice model, with log-likelihood LL(B), significantly improves on the null model, with log-likelihood LL(0). McFadden’s Rho square (ρ² = 1 − LL(B)/LL(0)) indicates the goodness of fit of the estimated choice model. To select the optimal number of segments, the minimum Akaike Information Criterion [AIC = − 2 (LL(B) − P), where P is the number of estimated parameters] is used (e.g., Kamakura and Russell 1989; Gupta and Chintagunta 1994).
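The three statistics follow directly from the two log-likelihood values and the parameter count, as the short sketch below shows. The function name and the numbers in the example are hypothetical and are not estimation results from this study.

```python
def fit_statistics(ll_null, ll_model, n_params):
    """Likelihood ratio statistic, McFadden's rho-square and AIC."""
    g2 = -2.0 * (ll_null - ll_model)
    rho2 = 1.0 - ll_model / ll_null
    aic = -2.0 * (ll_model - n_params)
    return {"G2": g2, "rho2": rho2, "AIC": aic}


# Illustrative values only: LL(0) = -1000, LL(B) = -800, P = 20.
stats = fit_statistics(ll_null=-1000.0, ll_model=-800.0, n_params=20)
print(stats)
```

When comparing models with different numbers of classes, the model with the lowest AIC is preferred; adding a class lowers AIC only if the gain in log-likelihood exceeds the number of added parameters.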

5 Results

In this section we describe the data collection in terms of the survey and the sample, and the results of the estimation of the latent-class model.

5.1 Survey and sample

The choice experiment was implemented in an on-line questionnaire. Apart from the choice experiment, the questionnaire also includes questions to record relevant background variables of the persons. In addition to the usual socio-demographic variables (gender, age, household type, education level, income level, work status), this includes a rating of the felt importance of each of the six basic leisure needs for the benefits the person seeks in a city trip. For these judgements a seven-point rating scale is used. In addition, the nature of the occupation (job, if any) is queried. Respondents indicate the nature of their occupation based on a classification consisting of nine profession types. This set-up allows us to relate pursued needs in leisure time to characteristics of the work activity (job type).

Invitations to participate in the survey were sent to a random sample of a large existing national panel which should be representative of the Dutch population. Only respondents who had made at least one city trip in the last 2 years could proceed with the questionnaire. A city trip is defined as a visit to a city in leisure time with the aim to explore the city. A city trip lasts minimally 4 hours and does not include more than three nights. By this filter, we made sure that the relevant segment of the population was selected.

In total 316 persons completed the survey. Table 1 shows the distribution of the sample for some key socio-demographic characteristics. The distributions are fairly representative of the (Dutch) population. The last row shows the distribution of the respondents across the nine profession types distinguished. Administrative, Commercial and Specialists professions are the largest categories and have shares in the range of 16–20%. Crafts & industry, Transport, Services and Education are smaller, with shares ranging from 5 to 10%. Agricultural is very small, with a share of merely 0.6%.

Table 1 Sample characteristics

5.2 The latent-class model

The model specification we use allows us to estimate main effects of all (three-level) attributes of POI choice alternatives including attraction value, match personal interests, match current needs, the costs of the activity and travel time. Consistently, effect coding was used where the highest level is taken as the base. Effect coding means that each three-level variable is coded by two effect-variables: the effect-one variable is coded as [1, 0, − 1] and the effect-two variable as [0, 1, − 1] for the [low, mid, high] level of the original attribute variable. Furthermore, the model enables the estimation of two-way interactions between all the context variables and all POI choice attributes. Of specific interest is the interaction between the nature of the current need (a context variable) and match with current need (a POI attribute). On that level, interaction effects indicate to what extent individuals differentiate between need dimensions. In pre-processing steps, the specification of the latent-class model (number of classes and selection of interactions) was optimized to arrive at a parsimonious model. A three-class model appeared to be optimal. See the Appendix for the details.
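A minimal sketch of the effect coding described above, together with the way the utility of each level is recovered from the two estimated effect parameters. Because the base (highest) level is coded − 1 on both effect variables, its utility equals minus the sum of the two parameters, so the three level utilities sum to zero. The function names and the parameter values in the example are hypothetical.

```python
def effect_code(level):
    """Map a three-level attribute value to its two effect variables."""
    codes = {"low": (1, 0), "mid": (0, 1), "high": (-1, -1)}
    return codes[level]


def level_utilities(b1, b2):
    """Utility of each level implied by the two effect parameters."""
    return {"low": b1, "mid": b2, "high": -(b1 + b2)}


def utility_range(b1, b2):
    """Difference between the utilities of the highest and lowest level."""
    utilities = level_utilities(b1, b2)
    return utilities["high"] - utilities["low"]


# Illustrative parameters only, not estimates from this study.
utilities = level_utilities(-0.5, 0.1)
print(effect_code("high"), utilities, utility_range(-0.5, 0.1))
```

The range between the lowest and highest level is a convenient summary when comparing the relative importance of attributes that share the same three-level coding.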

Table 2 shows the detailed estimation results for the three-segment model and the base model (no segmentation), respectively. Estimation results of the base model represent average behavior across all segments. At this level, the results indicate that all attributes are strongly significant. The difference between the utility values of the lowest and highest level indicates the relative importance of the attribute concerned in the choice of activity. By that criterion, match with personal interests has the largest effect and, hence, is the most important attribute. Attraction value, match with needs and activity costs have approximately equal values, which are larger than the value of travel time and smaller than the value of match with interests. A three-star attraction is approximately equivalent to a cost of 10 €, suggesting that tourists are willing to pay 10 € per person for a top attraction. They are willing to pay around the same amount for attractions that match their current needs, and they are willing to pay more for attractions that match their personal interests. Thus, the results confirm the idea that current needs play a significant role in the preference for an activity. Turning next to context-interaction effects, we see no significant effects of the nature of the current need on match with need. This suggests that, on average across segments, tourists assign approximately equal weight to the different needs. Furthermore, the size of the need does not have a significant interaction effect with the match-need attribute in the overall model. This is unexpected, as one would expect a stronger impact when the need is very strong as opposed to just strong.
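The importance criterion used above (the difference between the utility of the lowest and the highest level of an attribute) can be sketched as follows. The part-worth values are hypothetical and chosen only to make the computation visible; they are not the estimates reported in Table 2.

```python
# Sketch of the range-based importance criterion.
# All part-worths below are hypothetical, NOT the values of Table 2.

def importance(levels):
    """Absolute utility difference between the lowest and highest level."""
    return abs(levels["high"] - levels["low"])


hypothetical_part_worths = {
    "attraction_value": {"low": -0.30, "mid": 0.00, "high": 0.30},
    "match_interests": {"low": -0.50, "mid": 0.05, "high": 0.45},
    "travel_time": {"low": 0.15, "mid": 0.00, "high": -0.15},
}

# Attributes ordered from most to least important under this criterion.
ranking = sorted(
    hypothetical_part_worths,
    key=lambda name: importance(hypothetical_part_worths[name]),
    reverse=True,
)
```

With these illustrative numbers the ranking places match with interests first and travel time last, which mirrors the qualitative ordering described in the text without reproducing the actual estimates.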

Table 2 Results of the Latent-Class Model estimation


We next turn to the model with segmentation (Table 2). A first observation is that, at the level of segments, several interaction effects with the current need are now significant. Segments differ in terms of which need is considered most important. Furthermore, we see striking differences in the main effects of POI attributes and in the constant (value of no activity). Considering the patterns of main and interaction effects, the segments can be characterized as follows.

Segment 1 individuals assign high values to all attributes: attraction value, match with personal interests, match with needs, activity costs and travel time. Furthermore, these individuals consider entertainment a particularly important need, as well as being outdoors. However, when the POI matches the latter need to a large extent, the utility of the POI becomes smaller. This is counterintuitive. A possible explanation is that this quality of the POI indicates a natural environment, which they do not prefer in the context of a city trip.

Segment 2 individuals are more selective in terms of the attributes they take into account. For these persons only a match with personal interests is relevant; they are insensitive to attraction value. Apart from personal interests, they care about costs and, to a lesser extent, travel time. Free entrance (no costs) is especially appealing to these tourists. A match with a current need is relevant for this group only if the need concerns new experiences. However, a strong match has a negative effect on the utility of the activity. An explanation might be that this group dislikes the type of POI that strongly addresses new experiences, so that only POIs that moderately match the need appeal to them.

Segment 3 individuals are also rather selective in terms of the attributes they consider important. They consider personal interest important, but to a much lesser extent than the other segments. Typical for this segment are the high importance assigned to attraction value and the indifference to costs. They are sensitive to a match with current needs only when the match is strong as opposed to moderate. Furthermore, they assign an above-average weight to new experiences. Lastly, this segment is characterized by a high value of the constant, indicating that a POI must meet high demands before visiting it is preferred over doing nothing (having a break).

In sum, the three classes emerging from this analysis differ from each other in various respects. The first class consists of tourists who seek to get the maximum experience out of the available options for visiting POIs on a city trip; they evaluate POIs thoroughly on all aspects. Needs play a role, but there is no differentiation with respect to the nature of the need. The second class consists of tourists who choose POIs primarily based on personal interests, taking into account costs and effort. Match with a current need is not taken into account except when the need concerns new experiences. The third and last class consists of tourists who impose high demands on what an activity POI has to offer, paying attention to attraction value, personal interests and the need for new experiences. This class is insensitive to costs. Furthermore, a strong match with a particular need is not always considered positive. The likely explanation is that meeting a particular need may correlate with certain qualities of a POI that the person finds unattractive in the context of a city trip.

The classes are not equal in size. In the sample, the shares are 48.3% (segment 1), 13.1% (segment 2) and 38.7% (segment 3). Table 3 shows the composition of the segments in terms of some key personal background variables. The segments differ significantly on gender, age, education level and work type (profession).
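When nothing is known about the class a user belongs to, one common way to use a latent-class result is to weight class-specific logit choice probabilities by the class shares. The sketch below assumes this standard mixture form; the segment shares are the ones reported above, whereas the utilities and all function names are hypothetical and serve only as an illustration.

```python
import math

# Segment shares as reported in the text (they sum to 1.001 due to rounding).
SHARES = [0.483, 0.131, 0.387]


def logit_probabilities(utilities):
    """Multinomial logit probabilities for one class."""
    top = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_probabilities(class_utilities, shares=SHARES):
    """Share-weighted average of the class-specific choice probabilities."""
    norm = sum(shares)  # renormalize so the weights sum to exactly one
    probs = [0.0] * len(class_utilities[0])
    for share, utilities in zip(shares, class_utilities):
        for j, p in enumerate(logit_probabilities(utilities)):
            probs[j] += (share / norm) * p
    return probs


# Hypothetical utilities of three alternatives (two POIs and "no activity")
# for each of the three classes; NOT values from Table 2.
hypothetical_utilities = [
    [1.2, 0.4, 0.0],
    [0.3, 0.9, 0.0],
    [0.5, 0.2, 1.0],
]
example_probs = mixture_probabilities(hypothetical_utilities)
```

If a user's class can be identified, the weights would simply be replaced by a vector that puts all mass on that class.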

Table 3 Relationships between socio-demographics and segment membership


5.3 Incorporating the results in a TRS

The model estimation results can be integrated in the user model of a TRS to take into account users’ preferences regarding travel and time-use as well as non-travel characteristics of POIs. The latent class estimation showed that considerable variation exists in how individuals trade off attributes of POIs. We emphasize that the classes that emerged from this analysis do not necessarily identify general groups of tourists. The differences in preferences may also be related to current circumstances or motivational states (e.g., mood). The classes found do give an indication of the range of variation.

In a TRS, this range can be taken into account by identifying the best fitting class for the trip under concern through a dialogue with the user at the moment of planning a trip. A possible way of doing this is to present short descriptions of the profiles to the user and ask him or her to indicate which description would best fit his or her own profile for the trip. Although none of the standard profiles may fit an individual perfectly, it is expected that the segmentation at least will improve the assessment of the true preferences.

Such a multi-class model would be an advanced feature of a TRS. Even without segmentation, integrating the preference estimates (i.e., the one-segment solution) would already constitute a significant refinement of the user model compared to existing systems. To demonstrate this, the model estimation results were implemented in the c-Space recommender system. The single-segment solution was implemented, as the current version of c-Space does not include a method to assess the specific preference profile of a user at this level. The estimation results needed to be processed further before they could be used. For the discrete choice experiment, the POI attributes were discretized; the stated choice experiment used three levels for each attribute. For continuous variables such as travel time and entrance costs, a TRS needs a continuous function. A continuous function was derived by inter- and extrapolation of the point estimates.
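The text does not specify the functional form used for the inter- and extrapolation. As an illustration only, the sketch below assumes a piecewise-linear function through the three point estimates, with the end segments extended beyond the range of the experiment. The function name, the cost levels and the utility values are hypothetical.

```python
# Sketch: continuous utility function from three point estimates,
# ASSUMING piecewise-linear inter- and extrapolation (our assumption).

def make_utility_function(points):
    """Build f(x) from (x, utility) pairs with distinct x values."""
    pts = sorted(points)
    if len(pts) < 2:
        raise ValueError("need at least two point estimates")

    def utility(x):
        # Find the segment containing x; outside the observed range the
        # nearest end segment is reused, which extends its line.
        idx = 0
        while idx < len(pts) - 2 and x > pts[idx + 1][0]:
            idx += 1
        (x0, y0), (x1, y1) = pts[idx], pts[idx + 1]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return utility


# Hypothetical part-worths for entrance costs of 0, 5 and 10 euro.
cost_utility = make_utility_function([(0, 0.4), (5, 0.1), (10, -0.5)])
```

Linear extrapolation is the simplest choice; a real system may want to cap or reshape the function outside the range covered by the experiment, since the estimates carry no information there.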

For a first qualitative evaluation of the system, an application was developed for Trento, a popular city-trip destination for tourists in Italy. Thirty-five individuals were approached in a street survey and volunteered to use the system to plan and carry out their trip. After having made the tour, they filled out a short survey about their experiences. The responses confirmed the usefulness and added value of the system. Users reported that the content suggested for their trip was indeed of interest to them (83%) and that they were not able to find such content using other means (91%). Compared to other recommender systems, which typically recommend popular tours, the tours suggested were found to be more in line with their interests. The prototype and these test results provide evidence that refining the user model in the way proposed in this study is feasible and can potentially improve the quality of tour recommendation.

6 Conclusions and discussion

For recommending optimal personalized tours, it is important to know how individuals trade off preferences for particular POIs against routing, costs and time-use characteristics. In this study, we described the c-Space tour-recommender system and considered the empirical estimation of the utility weights tourists assign to these factors, using a stated choice experiment and state-of-the-art choice analysis techniques (latent-class model). A random sample from a large national panel participated in the survey. The analysis revealed the influence of the motivational state of a tourist on preferences for activities. It also revealed that the way trade-offs are made and the response to current needs differ significantly between individuals. The latent-class analysis indicated that three segments can be identified.

The results of this study can be used to improve user models that are currently used in travel recommender systems. Current models typically assume a process where the selection of POIs and determining a route along the locations of the POIs are performed in separate steps. The multi-attribute utility function estimated in the present study allows the TRS to take the travel and time-use implications of visiting particular POIs into account already in the step of selecting POIs. Thus, using this utility function the selection of POIs that maximize a utility value on the level of a tour can be identified. As we demonstrated by an application of the c-Space TRS, the estimated values can be incorporated in a user profile together with information about personal interests (themes) and needs of a user. A first evaluation demonstrated the efficacy of the approach.

Several problems remain for future research. First, the current c-Space system does not support a process for adapting the user model to a user. Extending the system to handle a multi-class user model is an objective of further development. Second, the estimation results were based on a sample from the Dutch population. It would be interesting to replicate the study in other countries to see whether similar segments emerge. Third, our study took into account only a limited number of potentially relevant contextual conditions. For advanced Context-Aware TRS (CARS), the set of conditional factors needs to be expanded in order to obtain more refined estimates of utility weights in specific cases. Fourth, tourists’ activities are often conducted by individuals in a group, and preferences for selecting certain POIs and activities are the result of a group decision process. Our user model did not account for this social aspect. In order to derive a suitable user model for TRSs that do take group preferences into account (so-called Social TRSs), the discrete choice analysis needs to be expanded in future research.

References

Adomavicius, Tuzhilin (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17:734–749
Article Google Scholar
Aksenov P, Kemperman A, Arentze T (2016) A personalized recommender system for tourists on city trips: concepts and implementation, International Conference on Smart Digital Futures, KES International, 15–17 June, Tenerife, Spain
Aksenov P, Kemperman ADAM, Arentze TA (2014) Toward personalized and dynamic cultural routing: a three-level approach. Procedia Environ Sci 22:257–269
Article Google Scholar
Arentze TA (2015) LATUS: A dynamic model for leisure activity-travel utility simulation. Paper prepared for presentation at the 94th Transportation Research Board Annual Meeting, January 2015, Washington, D.C
Arentze TA, Timmermans HJP (2009) A need-based model of multi-day, multi-person activity generation. Transp Res Part B Methodol 43(2):251–265
Article Google Scholar
Arentze TA, Ettema D, Timmermans HJP (2010) Incorporating time and income constraints in dynamic agent-based models of activity generation and time use: approach and illustration. Transp Res C 18:71–83
Article Google Scholar
Armbrecht J (2014) Use value of cultural experiences: a comparison of contingent valuation and travel cost. Tour Manag 42:141–148
Article Google Scholar
Ashworth GJ, Page SJ (2011) Urban tourism research: recent progress and current paradoxes. Tour Manag 32(1):1–15
Article Google Scholar
Borras J, Moreno A, Valls A (2014) Intelligent tourism recommender systems: a survey. Expert Syst Appl 41:7370–7389
Article Google Scholar
Boxall PC, Adamowicz WL (2002) Understanding heterogeneous preferences in random utility models: a latent class approach. Environ Resour Econ 23:421–446
Article Google Scholar
Braunhofer M, Ricci F (2017) Selective contextual information acquisition in travel recommender systems. Inform Technol Tour 17:5–29
Article Google Scholar
Braunhofer M, Elahi M, Ricci F (2015) User personality and the new user problem in a context-aware point of interest recommender system. In: Tussyadiah I, Inversini A (eds) Information and communication technologies in tourism. Springer, Switzerland, pp 537–549
Google Scholar
Buhalis D (1998) Strategic use of information technologies in the tourism industry. Tour Manag 19(5):409–421
Article Google Scholar
Buhalis D, Law R (2008) Progress in information technology and tourism management: 20 years on and 10 years after the internet—the state of eTourism research. Tour Manag 29:609–623
Article Google Scholar
CBS (2017) StatLine, electronic databank of Statistics Netherlands, http://statline.cbs.nl/statweb/?LA=en. Accessed 30 Mar 2017
Fonte FAM, López MR, Burguillo JC, Peleteiro A, Martínez AB (2013) A tagging recommender service for mobile terminals. In: Cantoni L, Xiang Z (eds) Information and communication, Technologies in Tourism. Springer-Verlag, Berlin, pp 424–435
Google Scholar
Gavalas D, Konstantopolous C, Mastakas K, Pantziou G (2014) Mobile recommender systems in tourism. J Netw Comput Appl 39:319–333
Article Google Scholar
Greene WH (2001) Fixed and random effects in nonlinear models. Working Paper EC-01-01, Stern School of Business, Department of Economics
Greene WH, Hensher DA (2002) A latent class model for discrete choice analysis: Contrast with mixed logit. Working Paper ITS-WP-02-08, Institute of Transport Studies. The University of Sydney, Australia
Google Scholar
Gretzel U, Mitsche N, Hwang YH, Fesenmaier DR (2004) Tell me who you are and I will tell you where to go: use of travel personalities in destination recommendation systems. Inform Technol Tour 7:3–12
Article Google Scholar
Gupta S, Chintagunta PK (1994) On using demographic variables to determine segment membership in logit mixture models. J Mark Res 31:128–136
Article Google Scholar
Hanani U, Shapira B, Shoval P (2001) Information filtering: overview of issues. Res Syst User Model User-Adapt Interact 11:203–259
Article Google Scholar
Hensher DA, Rose JM, Greene WH (2015) Applied choice analysis, 2nd edn. Cambridge University Press, Cambridge , UK (ISBN: 9781107465923)
Book Google Scholar
Kamakura W, Russell G (1989) A probabilistic choice model for market segmentation and elasticity structure. J Mark Res 26:379–390
Article Google Scholar
Kerkman K, Arentze T, Borgers A, Kemperman A (2012) Car drivers compliance with route advice and willingness to choose socially desirable routes. Transport Res Rec 1:102–109
Article Google Scholar
Lew A, McKercher B (2006) Modeling tourist movements. A local destination analysis. Ann Tour Res 33(2):403–423
Article Google Scholar
Lin Y, Kerstetter D, Nawijn J, Mitas O (2014) Changes in emotions and their interactions with personality in a vacation context. Tour Manag 40:416–424
Article Google Scholar
Linaza MT, Agirregoikoa A, Garcia A, Torres JI, Aranburu K (2011) Image-based travel recommender system for small tourist destinations. In: Law R et al (eds) Information and communication technologies in tourism. Springer-Verlag, Wien, pp 1–11
Google Scholar
Ma J, Gao J, Scott N, Ding P (2013) Customer delight from theme park experiences. The antecedents of delight based cognitive appraisal theory. Ann Tour Res 42:359–381
Article Google Scholar
Mackay K, Vogt C (2012) Information technology in everyday and vacation contexts. Ann Tour Res 39(3):1380–1401
Article Google Scholar
Neidhardt J, Seyfang L, Schuster R, Werthner H (2015) A picture-based approach to recommender systems. Inform Technol Tour 15:49–69
Article Google Scholar
Nijland L, Arentze T, Timmermans H (2010) Eliciting the needs that underlie activity-travel patterns and their covariance structure: results of multimethod analyses. J Transp Res Rec 2157:54–62
Article Google Scholar
Schneider OP, Vogt CA (2012) Applying the 3M model of personality and motivation to adventure travelers. J Travel Res 51:704–716
Article Google Scholar
Simoes B, Aksenov P, Santos P, Arentze T (2015) C-space: fostering new creative paradigms based on recording and sharing “casual” videos through the internet, Multimedia & Expo Workshops (ICMEW), 2015 IEEE International Conference
Steen Jacobsen JK, Munar AM (2012) Tourist information search and destination choice in a digital age. Tour Manag Perspect 1(1):39–47
Article Google Scholar
Swait J (1994) A structural equation model of latent segmentation and product choice for cross-sectional revealed preference data. J Retail Consum Serv 1(2):77–89
Article Google Scholar
Wang D, Park S, Fesenmaier DR (2012) The role of smartphones in mediating the touristic experience. J Travel Res 51(4):371–387
Article Google Scholar
Wörndl W, Hefele A (2016) Generating paths through discovered places-of-interests for city trip planning. In: Inversini A, Schegg R (eds) Information and communication technologies in tourism. Springer, Heidelberg, pp 441–453
Google Scholar
Wynen J (2013) Explaining travel distance during same-day visits. Tour Manag 36:133–140
Article Google Scholar
Yang WS, Hwang SY (2013) iTravel: a recommender system in mobile peer-to-peer environment. J Syst Softw 86:12–20
Article Google Scholar
Yeh DY, Cheng CH (2015) Recommendation system for popular tourist attractions in Taiwan using Delphi panel and repertory grid techniques. Tour Manag 46:164–176
Article Google Scholar

Download references

Acknowledgements

The research leading to these results has received funding from the European Community’s Seventh Framework Program (FP7/2007-2013) under the Grant Agreement number 611040. The authors are solely responsible for the information reported in this paper. It does not represent the opinion of the Community. The Community is not responsible for any use that might be made of the information contained in this paper. We furthermore would like to acknowledge Bruno Simoes of Graphitech for his support in the evaluation study.

Author information

Authors and Affiliations

Urban Systems and Real Estate Group, Eindhoven University of Technology, PO Box 513, 5600 MB, Eindhoven, The Netherlands
Theo Arentze, Astrid Kemperman & Petr Aksenov


Corresponding author

Correspondence to Theo Arentze.

Appendix—optimization of the latent-class model specification

Before applying a latent class estimation, the specification of the base model was optimized with a view to parsimony. Potentially, many interaction variables can be considered. To arrive at a parsimonious model, the significance of all interaction variables was tested in a stepwise manner, starting with a model that includes all interaction variables and then successively removing the interaction variables that are insignificant. Recall that the context variables consist of the duration of the city trip, the moment of the activity within the trip, and the size and nature of the current need. It turned out that none of the interactions concerning duration and moment are significant, and therefore these interaction variables were dropped from the final model. Given the purpose of the present study, all two-way interactions concerning the nature and size of the current need were kept in the final base model so that this factor could be included in the search for significant segments. The needs being in open air, relaxation, physical exercise and social contact were merged into a single category (labeled being outdoors) to increase the parsimony of the model further, as little differentiation between these needs emerged. Hence, in the final model the nature of the need has three levels: New experience, Entertainment and Being outdoors.

The latent class estimation was run for several settings of the number of classes to find the optimal number of segments. Table 4 shows goodness-of-fit statistics for the estimated models, where the number of classes is varied from one to four. According to the AIC index, the 4-segment model is the best model for these data. Note, however, that the improvement of the index going from a 3-segment to a 4-segment model is modest. In terms of interpretation of the estimation results, the 3-segment model appears to be more useful than the 4-segment model. In the latter model, the segmentation becomes increasingly sensitive to differences regarding a somewhat trivial factor (namely, the constant representing the utility of the null alternative). For these reasons, we selected the 3-segment model as the best model for the purpose of the analysis.
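The comparison by AIC can be sketched as follows, using the usual definition AIC = 2k − 2 ln L, with k the number of parameters and ln L the log-likelihood. The log-likelihoods and parameter counts below are hypothetical and do not reproduce Table 4; they are chosen only to show a pattern in which the index keeps improving but the gain from a fourth class is small.

```python
# Sketch of an AIC comparison across numbers of classes.
# Log-likelihoods and parameter counts are hypothetical, NOT from Table 4.

def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# number of classes -> (log-likelihood, number of parameters)
hypothetical_fits = {
    1: (-4200.0, 20),
    2: (-4000.0, 41),
    3: (-3900.0, 62),
    4: (-3870.0, 83),
}

scores = {k: aic(ll, p) for k, (ll, p) in hypothetical_fits.items()}
best_by_aic = min(scores, key=scores.get)

# Improvement of the index when one more class is added.
gains = {k: scores[k - 1] - scores[k] for k in range(2, 5)}
```

In this illustration the index alone would favor four classes, while the shrinking gains show why interpretability can reasonably tip the choice toward three.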

Table 4 Statistics for the latent class models


Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article

Cite this article

Arentze, T., Kemperman, A. & Aksenov, P. Estimating a latent-class user model for travel recommender systems. Inf Technol Tourism 19, 61–82 (2018). https://doi.org/10.1007/s40558-018-0105-z


Received: 30 May 2017
Revised: 21 December 2017
Accepted: 17 January 2018
Published: 02 February 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s40558-018-0105-z
