Encyclopedia of Social Network Analysis and Mining

Living Edition
| Editors: Reda Alhajj, Jon Rokne

Spatiotemporal Personalized Recommendation of Social Media Content

  • Bee-Chung Chen
Living reference work entry
DOI: https://doi.org/10.1007/978-1-4614-7163-9_325-1






The situation (which includes time, geographical location, location of a web page, etc.) in which recommendations are made to a user.


Information (about a user, an item, and the context in which the item may be recommended to the user) that can be used to predict the response rate.


A set of nodes connected by a set of edges.


A web page on which recommended items are placed.


A system that recommends items (e.g., news articles, blog posts) to users.

Response rate

The probability that a user would respond positively to (e.g., click, share) a recommended item.


Social media sites (like twitter.com, digg.com, blogger.com) complement traditional media by incorporating content generated by regular people and allowing users to interact with content through sharing, commenting, voting, liking, and other actions. Since the number of content items is usually too large for a person to manually examine to find interesting ones, it is important for social media sites to recommend a small set of items that are worth looking at for each user. To satisfy each individual user, recommended items have to match the user’s personal interests and be relevant to the user’s current spatiotemporal context. For example, a content item about the user’s hometown is usually a better choice than an item about an unknown foreign country, and a content item on a fresh trending topic is usually more interesting than an item on a stale topic.

Spatiotemporal personalized recommendation of social media content refers to techniques used to make personalized recommendations based on:
  • The geographical location of a user and an item (the location of an item can be the location that the item is about or the location of the author of the item)

  • The location of a user in the social space (e.g., the neighborhood of a user in a friendship graph)

  • The position of an item placed on a page and the layout of the page

  • Temporal evolution of user interests

  • Temporal behavior of the popularity of an item

  • Identification of trending and/or geo-related topics


Social media usually refers to a group of Internet-based applications that allow the creation and exchange of user-generated content (Kaplan and Haenlein 2010). For example, weblog sites like blogger.com give regular people the ability to publish any article (called a blog) on the web; microblogging sites like twitter.com facilitate fast distribution of short messages on any topic posted by anyone; and social news sites like digg.com allow their users to vote news articles (and other web content) up or down in order to present popular and interesting news stories based on the wisdom of the crowd (i.e., votes from users). Because of the success of such social media sites, almost all online media sites now let their users share and comment on content items (e.g., news articles, photos, songs, movies), regardless of whether the content items are generated by regular users. Since sharing and commenting are usually considered social activities, the distinction between social media and traditional online media is blurring. In this article, we discuss recommendation methods suitable for any online media with a special emphasis on spatial, temporal, and social characteristics of users and content items.

The large amount of content generated by social media makes it difficult for users to find personally relevant content. To alleviate such information overload, many social media sites recommend a small set of content items to each user based on what they know about the user and the items. We use the term “item” to refer to any candidate objects to be recommended to users, which include (but are not limited to):
  • Publisher-generated items like articles, songs, and movies, which are not generated by regular users, but are voted, shared, liked, or commented on by them

  • User-generated items like blogs, tweets (short messages posted on twitter.com), photos, videos, status updates, and comments on other items

Good recommendations help social media sites keep their users engaged and interested.

Key Points

When recommending items to users, it is important to consider whether an item is relevant to a user in the spatiotemporal context in which recommendations are to be made. We take a broad view of the spatial aspect that includes locations in geographical space, social space, and positions on a web page. A few key reasons for considering spatiotemporal contexts are listed below.
  • Users are likely to be more interested in items about geographical locations that matter to them (e.g., their current locations, neighborhoods that they are familiar with, and places that they frequently visit) than in items about a random location, which is especially true for mobile applications (see Zheng et al. 2010 for an example).

  • In some applications, users tend to have similar preferences to those who are close to them in the social space, which is especially true when closeness is defined based on a trust network (see Jamali and Ester 2010 for an example).

  • It is generally true that an item placed at a prominent location (e.g., top) on a page generates more responses from users than the same item placed at a non-prominent location (see Agarwal et al. 2009 for an example).

  • Users change their interests in topics (see Ahmed et al. 2011 for an example) and their geo-locations over time.

  • Popularity of items also changes over time (see Agarwal et al. 2009 for an example).

Many methods have been developed to exploit these spatiotemporal characteristics to improve the performance of recommenders. A comprehensive review of these methods is beyond the scope of this article. Instead, after providing a brief historical background, we illustrate key ideas in spatiotemporal personalized recommendation through a generic supervised learning approach, which handles spatiotemporal characteristics by (1) defining features that capture those characteristics and (2) learning a function that predicts whether a user would respond to an item positively based on these features from a dataset that records users’ past responses to items. This approach generally applies to recommendation of any kind of item.

Historical Background

Many approaches have been developed to make personalized recommendations. When an item to be recommended is a text article, which may be represented as a bag of words, an early approach is to also represent a user as a bag of words. The user’s bag of words can be constructed by including representative words in the articles that the user likes to read. Then, we can recommend to a user the articles whose bags of words are most similar to the user’s bag of words through Salton’s vector space model (Salton et al. 1975). For items that are not easily representable as bags of words, how other users respond to an item may provide a clue as to whether to recommend the item to a user who has not yet responded to the item. Agrawal et al. (1993) proposed that, in a retail store setting, products can be recommended based on customers’ co-buying behavior. For example, if the majority of customers who buy product A also buy product B, then we may recommend product B to a customer who only bought product A. This idea was then extended by incorporating a notion of similarity of users or items. For example, when we decide whether to recommend item B to user i, we look at whether users “similar” to user i respond to item B positively. Notice that Agrawal’s method is based on the similarity definition that if two customers buy the same product, then they are similar. A different definition of similarity between users leads to a different method. Furthermore, we can also exploit similarity between items in a similar way: when deciding whether to recommend item B to user i, check whether user i liked items that are “similar” to B in the past. Here, similarity between two items can be defined by looking at whether most users responded to the two similarly. Adomavicius and Tuzhilin (2005) provided a good review of such methods. Methods of this kind are generally referred to as collaborative filtering because the recommendations that a user receives depend on other users’ responses to candidate items; this process can be thought of as a collaboration among users to help one another find interesting items (although users may not be aware of the collaboration).
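The bag-of-words matching described at the start of this section can be sketched as follows. This is only an illustration under stated assumptions: it uses raw term counts and cosine similarity as one common instantiation of a vector space model (the weighting scheme is our assumption, not something the text prescribes), and the articles, identifiers, and function names are hypothetical.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lowercase, split on whitespace, and count term occurrences."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy data: articles the user liked, and two candidate articles.
liked = ["new york subway delays", "subway fares rise in new york"]
user_profile = Counter()
for article in liked:
    user_profile.update(bag_of_words(article))  # the user's bag of words

candidates = {
    "a1": "new york subway expansion approved",
    "a2": "recipe for apple pie",
}
scores = {k: cosine(user_profile, bag_of_words(v)) for k, v in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # most similar first
```

Here the candidate sharing words with the user’s profile ranks first, while the candidate with no shared words scores zero.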

Conceptually, one can put users’ past responses to items into a matrix. Since this matrix-oriented approach is popular in movie recommendation (Koren et al. 2009), we use it as an example in the following discussion. In a movie recommender system, users rate movies. Let \( y_{ij} \) denote the rating that user i gives to movie j. For example, \( y_{ij} \) may be a numeric value ranging from one to five, representing one star to five stars. Let Y denote the m × n matrix whose (i, j) entry is \( y_{ij} \), where m is the number of users and n is the number of movies in the system. Notice that many entries of Y are missing (i.e., unknown) because most users rate only a small number of movies. The missing entries in the ith row of Y correspond to movies that have not yet been rated by user i and are thus candidate items to be recommended to him/her. If we can predict these missing values accurately, then we can recommend to user i the movies with the highest predicted ratings. One popular way of making such predictions is matrix factorization: approximate Y by the product UV′ of two low-rank matrices, U of size m × r and V of size n × r, where V′ denotes the transpose of V and the rank r is much smaller than both the number m of users and the number n of items. Let \( \boldsymbol{u}_i \) denote the ith row of U, \( \boldsymbol{v}_j \) denote the jth row of V, and Ω = {(i, j): user i rated movie j} denote the set of observed entries in Y. This approximation can then be formulated as the following optimization problem.

Find U and V that minimize
$$ \sum_{(i, j)\in \Omega}{\left({y}_{ij}-{\boldsymbol{u}}_i^{\prime }{\boldsymbol{v}}_j\right)}^2 $$

where \( {\boldsymbol{u}}_i^{\prime }{\boldsymbol{v}}_j \) is the inner product of the two vectors \( \boldsymbol{u}_i \) and \( \boldsymbol{v}_j \). Notice that \( {\boldsymbol{u}}_i^{\prime }{\boldsymbol{v}}_j \) is the (i, j) entry of matrix UV′ and is also the predicted value of \( y_{ij} \). Thus, the above optimization seeks to minimize the difference between matrix Y and matrix UV′ over only the set Ω of observed entries of Y. The sum of squared differences is a common choice of objective; other choices are available for different problem settings. Recent studies, such as Agarwal and Chen (2009), Koren et al. (2009), and many others, suggest that matrix factorization usually provides better recommendations than more traditional methods.
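As a rough illustration of the optimization above, the following sketch minimizes the sum of squared differences over the observed entries with plain stochastic gradient descent. The solver, learning rate, number of epochs, and toy ratings are all our assumptions rather than choices made by the cited work. Practical systems usually also add a regularization term, which is omitted here because the objective as stated has none.

```python
import random

def predict(U, V, i, j):
    """Predicted rating: inner product of row i of U and row j of V."""
    return sum(a * b for a, b in zip(U[i], V[j]))

def factorize(ratings, m, n, r=2, lr=0.02, epochs=500, seed=0):
    """Fit U (m x r) and V (n x r) by SGD on the observed entries only.

    `ratings` maps (i, j) -> y_ij for the observed set Omega.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(r)] for _ in range(m)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(r)] for _ in range(n)]
    for _ in range(epochs):
        for (i, j), y in ratings.items():
            err = y - predict(U, V, i, j)
            for l in range(r):
                u_il, v_jl = U[i][l], V[j][l]  # keep old values for both updates
                U[i][l] += lr * err * v_jl
                V[j][l] += lr * err * u_il
    return U, V

def squared_error(ratings, U, V):
    """The objective: squared differences summed over observed entries."""
    return sum((y - predict(U, V, i, j)) ** 2 for (i, j), y in ratings.items())

# Hypothetical toy data: 4 users, 3 movies, ratings on a 1-5 scale.
ratings = {(0, 0): 5, (0, 1): 4, (1, 0): 4, (1, 2): 1,
           (2, 1): 2, (2, 2): 5, (3, 0): 1, (3, 2): 4}
U, V = factorize(ratings, m=4, n=3)

# Predicted values for the missing entries are the recommendation candidates.
missing = [(i, j) for i in range(4) for j in range(3) if (i, j) not in ratings]
predictions = {(i, j): predict(U, V, i, j) for (i, j) in missing}
```

For each user, the missing entries with the highest predicted values would be the movies to recommend.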

A survey of a wide range of approaches to recommender systems can be found in Jannach et al. (2010) and Ricci et al. (2015). Here, we focus on how to make use of spatial, temporal, and social information to make good recommendations of social media content. In particular, we illustrate key ideas in spatiotemporal personalized recommendation through a general supervised learning (or statistical modeling) approach, which generally applies to recommendation of any kind of item.

Supervised Learning Approach

In general, a recommendation problem can be formulated as follows. A recommender is given:
  • A user, who is associated with a vector of user features, e.g., age, gender, location (home, workplace, and other frequently visited places)

  • A set of candidate items, each of which is associated with a vector of item features, e.g., topics, keywords, the time when the item was produced, and the location that the item is about or was produced at

  • A context, which is associated with a vector of context features, e.g., time of day, day of week when the recommendation is to be made, the age of the item (time since the item was produced), and the distance between the user’s current location and the location that the item is about

The goal of the recommender is to rank the candidate items and pick the top few that best “match” the user’s interests and information needs in the context. The supervised learning approach exploits the fact that, in many recommenders, a dataset of users’ past responses (e.g., click, share) to items can be collected, and it defines the degree to which an item matches a user as the response rate of the user to the item (e.g., the probability that the user would click the item if he/she sees the item on a web page). Such response rates can be predicted by a statistical (regression or machine learning) model, which “learns” from the dataset the user and item behavior that allows accurate predictions; users’ past responses in the dataset “supervise” the learning process by providing desired (e.g., click) and undesired (e.g., no click) examples. When such a model is available, recommendations for a user can be made by picking the few items with the highest predicted response rates among the set of candidate items. This supervised learning approach applies to recommendation of any kind of item, where spatiotemporal and other characteristics can be incorporated by defining features that capture those characteristics.

To use this supervised learning approach, a developer of a recommender needs to make the following three decisions:
  • What response should the model try to predict?

  • What features should the model use to capture the characteristics of users, items, and the spatiotemporal context?

  • What class of model do we want to use?

After introducing a running example, we discuss how to choose the response, provide a number of useful features, and then introduce two commonly used classes of models, namely, feature-based regression models and latent factor models. See Jannach et al. (2010) and Koren et al. (2009) for other classes of models. See Hastie et al. (2009) for a general introduction to supervised learning.

Example Recommender

For concreteness, we use blog article recommendation as a running example. Suppose that we want to develop a recommender for a blog service provider (e.g., blogger.com) that seeks to recommend to each user a set of interesting blog articles posted by other users. To make modeling more interesting, assume that a user can declare friendship with other users, and such friendship connections between users are available to the recommender. In this example, the set of candidate items for each user consists of all of the articles posted within a 1-week time window (to ensure freshness) by any user of this service provider. Notice that the set of candidate items changes over time. For simplicity, we only need to recommend ten articles to each user, once per day, and the recommended articles are displayed in a list on the sidebar of each user’s homepage (they are only visible to the owner of the homepage, not the visitors of the homepage, since the recommendations are made to the homepage owner).

Choice of Response

The choice of response depends on the objective that a recommender is developed for and availability of user feedback that the recommender receives. A common objective is to maximize clicks on recommended items because the fact that a user clicks an item indicates that the user is interested in knowing more about the item. Note that clicks are user feedback that can easily be made available to a recommender through logging whether a user clicks a recommended item. In this case, a natural choice of the response is whether a user would click an item if he/she sees the item being recommended. Here, the goal of learning is to predict the probability that a user would click an item based on a dataset that records what items each user clicked and what items each user did not click in the past.

Beyond clicks, a recommender may be developed for other objectives. For example, if the objective of recommendation is to encourage users to make comments on recommended items, then a natural choice of the response would be whether a user would comment on a recommended item or not. On some sites, users can explicitly rate items (e.g., using one star to five stars), and then a natural choice of the response would be the rating that a user would give to an item. For simplicity, we only consider methods that seek to achieve a single objective and model the response rate of a single type of choice (e.g., modeling either click rate or explicit star rating, but not both). See Agarwal et al. (2011a) for an example of multi-objective recommendation, and see Agarwal et al. (2011b) for an example of joint modeling of multiple types of responses.

Let \( y_{ijk} \) denote the response that user i gives to item j in context k. For concreteness, assume that we choose to model whether the user would click the item.

Feature Engineering

Having good features is essential to an accurate model, but one usually does not get good features automatically. It requires domain knowledge, good intuition, and experience in the application to define good features. Here, for illustration purposes, we only show a number of example features that can potentially capture different kinds of spatiotemporal characteristics for our example recommender. Real-life recommenders usually need to use many more features than the following ones.

User Features

Let \( \boldsymbol{w}_i \) denote the vector of features of user i. For simplicity, we mostly consider binary features, meaning each element in the vector is either 0 or 1. Example features are as follows:
  • Gender: From the user’s registration record when he/she signed up on the site, the recommender obtains the gender of each user. The numeric value of the feature is 1 if the user is male and 0 if the user is female.

  • Age: Also from the user’s registration record, the recommender obtains the age of each user. For example, we can group age values into ten age groups, which give ten age features. If the user’s age is in an age group, the value of the feature corresponding to that age group is 1, and the features of the remaining age groups have value 0.

  • City: From the IP address of a user, the recommender can guess the city that the user is in. Here, we use a set of features, one for each city, to represent the user’s geographic location. For example, assume the user lives in New York City. Then, for this user, the value of the New York City feature is 1 and the values of the remaining city features are all 0. It is common to only include cities that have at least a minimum number of users, where this threshold is chosen by the developer of the recommender to reduce the number of features.

  • Other location features: Similar to the above city feature, we can also generate location features at different granularities (e.g., region, state, country) for different types of user locations (e.g., home, workplace, other frequently visited places).
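The binary user features above can be assembled into a vector along the following lines. This is a minimal sketch: the ten-year age buckets, the string encoding of gender, the toy city list, and all function names are hypothetical choices made for illustration.

```python
from collections import Counter

def build_city_vocab(user_cities, min_users):
    """Keep only cities with at least `min_users` users (the threshold in the text)."""
    counts = Counter(user_cities)
    return sorted(city for city, k in counts.items() if k >= min_users)

def user_features(gender, age, city, city_vocab, n_age_groups=10):
    """Concatenate gender, one-hot age-group, and one-hot city features."""
    gender_f = [1 if gender == "male" else 0]
    # Assumed bucketing: ten-year groups, with ages 90+ sharing the last group.
    group = min(age // 10, n_age_groups - 1)
    age_f = [1 if g == group else 0 for g in range(n_age_groups)]
    city_f = [1 if c == city else 0 for c in city_vocab]
    return gender_f + age_f + city_f

# Hypothetical user base: Austin falls below the threshold and gets no feature.
all_cities = ["New York City", "New York City", "Boston", "Austin", "Boston"]
vocab = build_city_vocab(all_cities, min_users=2)

w_i = user_features("male", 34, "New York City", vocab)
w_other = user_features("female", 27, "Austin", vocab)
```

A user from a city below the threshold simply has all city features equal to 0, which is one simple way to keep the feature space small.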

Item Features

Let \( \boldsymbol{x}_j \) denote the vector of features of item j. Example features are as follows:

  • Bag of words: It is common to represent the text content of an article as a bag of words, which corresponds to a set of features, one for each keyword. For simplicity, we only consider binary keyword features. The value of a keyword feature is 1 if the article contains the keyword and 0 if the article does not contain the keyword. Since the total number of words in all articles is usually too large, it is also common to reduce the space of all keywords to a relatively small number of important words, e.g., location names or other named entities.

  • Topics: Another way to reduce the space of words in articles is to group words into topics and then assign topics to articles based on the words in articles. This process can be automated through topic models like latent Dirichlet allocation (Blei et al. 2003). One output from such a model is a vector of topic membership for each article, where each element in the vector represents the probability that the article is about a particular topic.

  • Tags: In social media, users tag items based on their interests in the items. These tags can be used to generate features in the same way as bag of words. Geo tags and event tags usually capture the spatiotemporal context of the item.

  • Creation time: This is the time at which the item was produced.

  • Location of the author: This is the location of the author of the item when producing the item.

Context Features

Let \( \boldsymbol{z}_{ijk} \) denote the vector of features of context k (which includes time and location) in which item j is (to be) recommended to user i. Example features are as follows:
  • Day of the week: This is the day of the week (weekday vs. weekend) when the recommendation is to be made. User behavior during the weekday can be quite different from that during the weekend. The value of this feature is 1 for weekday and 0 for weekend.

  • Article age: This is the age of an article (not to be confused with the age of a user), which is the number of days since the article was posted. We put it into the category of context features, instead of item features, because it depends on both the article and time, instead of the article alone. For example, assume the article was posted 2 days ago; then, the value of the feature corresponding to 2 days ago is 1, and the other days get feature value 0. To model finer-grained temporal effect, one may choose a finer time resolution (e.g., hour, instead of day).

  • Position on page: It is well known that the click rate of an item put on the top of a list on a page is usually higher than that of the same item put in the middle or the bottom of the page. To capture this positional bias, we define a set of features, each of which corresponds to a position in the list. For example, if the article is put at the third position, the value of the feature corresponding to the third position is 1, and all other positions have feature value 0.

  • Friendship: This feature is 1 if user i is connected to the author of item j through a friendship connection and is 0 otherwise.

  • Connection strength: This feature captures the strength of the connection between the user and the author of the item. For example, we can quantify the strength by the number of common friends between the user and the author (and create different strength ranges to generate binary features if desired).

  • Same city: This feature is 1 if user i is in the same city as the author of item j and is 0 otherwise. Features like the same state and same country can be created in a similar way.

  • Geo-distance: This is the distance between the user’s current location and the location that the item is about (if desired, we can use distance ranges to create binary features).

  • Repeated exposure: This is the number of times that the user saw this item in the past. This feature can be used to capture the user’s fatigue after seeing the same item many times.
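Several of the context features listed above can be computed as in the following sketch. It assumes the running example’s 1-week candidate window and ten-slot list. The haversine great-circle formula is one standard way to obtain a geo-distance and is our choice, not something the text specifies. All function names, dates, and coordinates are hypothetical.

```python
import math
from datetime import datetime

def one_hot(index, size):
    return [1 if i == index else 0 for i in range(size)]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    radius_km = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def context_features(now, posted, position, is_friend, same_city,
                     n_age_days=7, n_positions=10):
    """Weekday flag, one-hot article age (days), one-hot position, social/geo flags."""
    weekday = [1 if now.weekday() < 5 else 0]  # Monday=0 ... Sunday=6
    age_days = min((now - posted).days, n_age_days - 1)
    return (weekday
            + one_hot(age_days, n_age_days)
            + one_hot(position, n_positions)  # position is 0-based here
            + [int(is_friend), int(same_city)])

# Hypothetical context: a Wednesday, article posted 2 days earlier, third slot.
z_ijk = context_features(now=datetime(2024, 1, 10, 9, 0),
                         posted=datetime(2024, 1, 8, 9, 0),
                         position=2, is_friend=True, same_city=False)

# Geo-distance between a user in New York City and an item about Los Angeles.
distance_km = haversine_km(40.7128, -74.0060, 34.0522, -118.2437)
```

A continuous quantity such as `distance_km` can either be used directly or be bucketed into distance ranges to produce binary features, as the text suggests.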

Note that the above features are only simple examples. The goal here is to provide concrete examples of features for illustration purposes. Different applications may require different sets of features. Recent work on spatiotemporal topic models (Hu et al. 2013; Yin and Cui 2016; Yuan et al. 2015) can also be used to create features that capture items’ spatiotemporal topicality and users’ spatiotemporal interests.

Feature-Based Regression Model

After defining the response and features, we have a standard supervised learning problem. When the response is binary (e.g., either click or no click), we can use logistic regression. See Hastie et al. (2009) for an introduction to logistic regression. Let \( p_{ijk} \) denote the probability that user i would respond to item j when he/she sees it in context k. There are many ways in which one can define a function that predicts \( p_{ijk} \) based on features. A useful prediction function is as follows:
$$ {p}_{ijk}=\sigma \left({\boldsymbol{w}}_i^{\prime }\boldsymbol{A}{\boldsymbol{x}}_j+\boldsymbol{\beta}^{\prime }{\boldsymbol{z}}_{ijk}\right), $$

where \( \sigma (a)=\frac{1}{1+ \exp \left(- a\right)} \) is the sigmoid function that transforms an unbounded value a into a number between 0 and 1 (since \( p_{ijk} \) is a probability), A is a regression coefficient matrix, β is a regression coefficient vector, and \( \boldsymbol{w}_i^{\prime } \) and β′ are the row vectors obtained by transposing the column vectors \( \boldsymbol{w}_i \) and β, respectively. Given a dataset of users’ past responses to items, where each record is of the form \( (y_{ijk}, \boldsymbol{w}_i, \boldsymbol{x}_j, \boldsymbol{z}_{ijk}) \), off-the-shelf logistic regression packages (e.g., Photon ML at https://github.com/linkedin/photon-ml) can be applied to learn the regression coefficients A and β.
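To make the prediction function concrete, the sketch below evaluates it for a few candidate items and picks the top ones by predicted response rate. The coefficient values, feature vectors, and item identifiers are invented for illustration. In practice A and β would be learned from data rather than written by hand.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def response_rate(w, x, z, A, beta):
    """p = sigmoid(w' A x + beta' z) for one (user, item, context) triple."""
    bilinear = sum(w[m] * A[m][n] * x[n]
                   for m in range(len(w)) for n in range(len(x)))
    linear = sum(b * f for b, f in zip(beta, z))
    return sigmoid(bilinear + linear)

def recommend(w, candidates, A, beta, k):
    """Return the ids of the k candidates with the highest predicted rate."""
    scored = [(response_rate(w, x, z, A, beta), item_id)
              for item_id, x, z in candidates]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]

# Hypothetical coefficients (assumed, not learned).
w = [1, 0]               # user features: [lives in New York City, lives in Boston]
A = [[1.5, -0.5],        # rows: user features; columns: item keyword features
     [-0.5, 1.0]]
beta = [-0.3]            # one context feature: article is 5 days old

candidates = [           # (item id, item features x_j, context features z_ijk)
    ("article_ny", [1, 0], [0]),
    ("article_boston", [0, 1], [0]),
    ("article_ny_old", [1, 0], [1]),
]
top2 = recommend(w, candidates, A, beta, k=2)
```

With these made-up coefficients, the two New York articles outrank the Boston article for a New York user, and the fresher of the two ranks first.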

To better understand this model, we take a closer look at the prediction function. Let \( A_{mn} \) denote the (m, n) entry of matrix A, \( w_{im} \) denote the mth user feature in vector \( \boldsymbol{w}_i \), and \( x_{jn} \) denote the nth item feature in vector \( \boldsymbol{x}_j \). By definition we have
$$ {\boldsymbol{w}}_i^{\prime }\boldsymbol{A}{\boldsymbol{x}}_j=\sum_m\sum_n{A}_{mn}{w}_{im}{x}_{jn}. $$

For example, assume \( w_{im} \) is the feature that indicates whether user i lives in New York City and \( x_{jn} \) is the feature that indicates whether article j contains the keyword “new york.” Then, the regression coefficient \( A_{mn} \) would try to capture the propensity of users living in New York City to click an article that contains the keyword “new york” after adjusting for all other factors. Now, assume that the mth and nth context features in \( \boldsymbol{z}_{ijk} \) indicate whether article j was posted 1 day ago and whether it was posted 5 days ago, respectively. Then, the difference \( \beta_m - \beta_n \) between the two regression coefficients would quantify, on the log-odds scale, how much the popularity of an article drops from day 1 to day 5, all other conditions being equal.
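One practical consequence of the double-sum expansion is that the bilinear term is an ordinary linear model over pairwise user-item interaction features, which is why standard logistic regression software can be used. The sketch below checks this identity numerically. The numbers and helper names are hypothetical.

```python
import math

def bilinear(w, A, x):
    """w' A x written as the double sum over m and n."""
    return sum(w[m] * A[m][n] * x[n]
               for m in range(len(w)) for n in range(len(x)))

def interaction_features(w, x):
    """Flattened outer product of w and x, in row-major order (w_im * x_jn)."""
    return [w_m * x_n for w_m in w for x_n in x]

def flatten(A):
    """Flatten the coefficient matrix in the same row-major order."""
    return [a for row in A for a in row]

w = [1, 0, 1]                              # three binary user features
x = [0, 1]                                 # two binary item features
A = [[0.2, 1.5], [0.7, -0.4], [-1.0, 0.3]]

phi = interaction_features(w, x)           # one feature per (m, n) pair
lhs = bilinear(w, A, x)
rhs = sum(a * f for a, f in zip(flatten(A), phi))
```

Each entry of `flatten(A)` is then simply the weight of one interaction feature, so learning A amounts to learning a weight vector for `phi`.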

Latent Factor Model

Although feature-based regression models are useful for predicting users’ response rates to items, they depend highly on the availability of predictive features, which usually requires a significant feature engineering effort with no guarantee of obtaining predictive features. Also, feature vectors may not be sufficient to capture the differences between users or items. For example, when two users have identical feature vectors, feature-based regression models would be unable to tell the differences between the two. One way of addressing these issues is to add latent factors into the prediction function, i.e.,
$$ {p}_{ijk}=\sigma \left({\boldsymbol{w}}_i^{\prime }\boldsymbol{A}{\boldsymbol{x}}_j+\boldsymbol{\beta}^{\prime }{\boldsymbol{z}}_{ijk}+{\boldsymbol{u}}_i^{\prime }{\boldsymbol{v}}_j\right), $$

where \( \boldsymbol{u}_i \) and \( \boldsymbol{v}_j \) are two r-dimensional vectors, both to be learned from data like the regression coefficients A and β, and r is much smaller than the number of users and the number of items. Recall that we have seen \( {\boldsymbol{u}}_i^{\prime }{\boldsymbol{v}}_j \) in the matrix factorization method in the historical background section. The difference is that, instead of factorizing the response matrix, here we factorize the residual (i.e., prediction error) matrix of feature-based regression in order to capture the behavior of users and items that the features fail to capture.

Intuitively, one can think of \( \boldsymbol{u}_i \) and \( \boldsymbol{v}_j \) as “latent feature” vectors of user i and item j, respectively. We do not determine the values of these r latent features per user or item before learning the model. Instead, \( \boldsymbol{u}_i \) and \( \boldsymbol{v}_j \) are treated as variables that can be used to reduce the error of predicting the responses in the dataset used for learning. The inner product \( {\boldsymbol{u}}_i^{\prime }{\boldsymbol{v}}_j \) then represents the affinity between user i and item j; the larger the inner product value, the higher the probability that user i would click item j. After the learning process, we simultaneously obtain the values of these latent features and also the regression coefficients A and β. See Agarwal et al. (2010) for an example of such a latent factor model.
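The following sketch shows one simple way such a model could be fitted. It is not the fitting procedure of the cited work: it uses plain stochastic gradient ascent on the log-likelihood with a small L2 penalty on the latent factors, and for brevity it drops the \( \boldsymbol{w}_i^{\prime}\boldsymbol{A}\boldsymbol{x}_j \) term, which would enter the logit in the same way as \( \boldsymbol{\beta}^{\prime}\boldsymbol{z}_{ijk} \). The toy records, hyperparameters, and function names are assumptions.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def logit(beta, U, V, i, j, z):
    # Simplification of this sketch: the w_i' A x_j term is omitted.
    return dot(beta, z) + dot(U[i], V[j])

def train(records, n_users, n_items, n_ctx, r=2, lr=0.1, reg=0.01,
          epochs=300, seed=0):
    """Jointly learn beta and the latent factors U, V from (i, j, z, y) records."""
    rng = random.Random(seed)
    beta = [0.0] * n_ctx
    U = [[rng.uniform(-0.1, 0.1) for _ in range(r)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(r)] for _ in range(n_items)]
    for _ in range(epochs):
        for i, j, z, y in records:
            g = y - sigmoid(logit(beta, U, V, i, j, z))  # d(log-lik)/d(logit)
            for c in range(n_ctx):
                beta[c] += lr * g * z[c]
            for l in range(r):
                u_il, v_jl = U[i][l], V[j][l]
                U[i][l] += lr * (g * v_jl - reg * u_il)
                V[j][l] += lr * (g * u_il - reg * v_jl)
    return beta, U, V

def mean_log_loss(records, beta, U, V):
    total = 0.0
    for i, j, z, y in records:
        p = sigmoid(logit(beta, U, V, i, j, z))
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(records)

# Hypothetical data: two users with identical features (a constant 1) but
# opposite tastes. Each record is (user, item, context features, clicked).
records = [(0, 0, [1.0], 1), (0, 1, [1.0], 0),
           (1, 0, [1.0], 0), (1, 1, [1.0], 1)]
beta, U, V = train(records, n_users=2, n_items=2, n_ctx=1)
p_00 = sigmoid(logit(beta, U, V, 0, 0, [1.0]))
p_01 = sigmoid(logit(beta, U, V, 0, 1, [1.0]))
```

In this toy setting the features alone cannot separate the two users, so any gain over always predicting 0.5 comes from the latent factors.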

Spatiotemporal contexts can also be involved in a latent factor model. For example, assume we want to model a temporal effect through latent factors. Let context index k represent the kth time period (e.g., day). One way of capturing user or item behavioral changes over time is through the following model:
$$ {p}_{ijk}=\sigma \left({\boldsymbol{w}}_i^{\prime }\boldsymbol{A}{\boldsymbol{x}}_j+\boldsymbol{\beta}^{\prime }{\boldsymbol{z}}_{ijk}+\left\langle {\boldsymbol{u}}_i,{\boldsymbol{v}}_j,{\boldsymbol{t}}_k\right\rangle \right), $$

where \( \langle \boldsymbol{u}_i, \boldsymbol{v}_j, \boldsymbol{t}_k \rangle = \sum_l u_{il} v_{jl} t_{kl} \) is a form of tensor product of the three vectors \( \boldsymbol{u}_i \), \( \boldsymbol{v}_j \), and \( \boldsymbol{t}_k \). Note that \( u_{il} \) denotes the lth element of vector \( \boldsymbol{u}_i \), and so on. Similar to the previous model, \( \boldsymbol{u}_i \), \( \boldsymbol{v}_j \), and \( \boldsymbol{t}_k \) are all latent feature vectors, whose values are to be learned from data. Unlike the previous model, where the affinity \( {\boldsymbol{u}}_i^{\prime }{\boldsymbol{v}}_j \) between user i and item j is fixed over time, now the affinity \( \langle \boldsymbol{u}_i, \boldsymbol{v}_j, \boldsymbol{t}_k \rangle \) is a function of time period k, which means this model captures the changing behavior of user-item affinity. Specifically, in this model, the user and item latent feature vectors are fixed over time, but the affinity between the two is a weighted sum of the element-wise product of the two latent feature vectors \( \boldsymbol{u}_i \) and \( \boldsymbol{v}_j \), where the weight vector \( \boldsymbol{t}_k \) changes over time. See Xiong et al. (2010) for an example of such a temporal latent factor model.

We can also use \( \boldsymbol{t}_k \) to represent the latent vector for location k if we use k to index locations. We can also use k to index clusters of location-time pairs. Then, \( \boldsymbol{t}_k \) represents the latent vector for a spatiotemporal cluster.
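The three-way product in the temporal model is straightforward to compute. The short sketch below, using made-up vectors, shows how the same user and item factors yield different affinities under different context vectors, and how an all-ones context vector recovers the plain inner product.

```python
def tensor_affinity(u, v, t):
    """<u, v, t> = sum over l of u_l * v_l * t_l."""
    return sum(u_l * v_l * t_l for u_l, v_l, t_l in zip(u, v, t))

# Hypothetical latent vectors with r = 2.
u_i = [1.0, 2.0]     # user factors (fixed across contexts)
v_j = [0.5, -1.0]    # item factors (fixed across contexts)

t_ones = [1.0, 1.0]  # all-ones weights: reduces to the inner product u_i' v_j
t_k = [2.0, 0.0]     # a context that emphasizes the first latent dimension

affinity_static = tensor_affinity(u_i, v_j, t_ones)
affinity_k = tensor_affinity(u_i, v_j, t_k)
```

Here the affinity is negative under uniform weights but positive in context k, because k switches off the dimension on which the user and the item disagree.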

Key Applications

Personalized recommendation is an important mechanism for surfacing social media content. The spatiotemporal context in which a recommendation is made provides a key piece of information that helps a recommender to recommend the right item to the right user at the right time. While many methods have been proposed in the literature, the supervised learning approach is attractive because of its generality, where spatiotemporal characteristics can be incorporated as features or latent factors. In this article, we introduced a number of example features and two example models. They can be applied to personalized recommendations of news articles, blog articles, tweets, shared items (e.g., articles, videos, photos), status updates, and comments on different kinds of items. In practice, many features need to be evaluated, and a number of different models need to be tried, so that a good recommender can be built.

Future Directions

Personalized content recommendation is currently an active research area in data mining, information retrieval, and machine learning. A lot of progress has been made in this area, but challenges remain.
  • Improving response rate prediction accuracy: Although many models have been proposed to predict response rates and we have seen prediction accuracy improve over time, accurate prediction of the probability that a user would respond to an item is still a challenging problem, especially for users and items that the recommender knows little about. What are the spatial, temporal, social, and other kinds of features that can further improve accuracy? How can a recommender actively collect data to achieve better model learning and evaluation?

  • Multi-objective optimization: A recommender is usually designed to achieve multiple objectives. For example, many web sites put advertisements on article pages to generate revenue. In addition to recommending articles that users like to click, we may also want to recommend articles that can generate high advertising revenue. How can a recommender optimize multiple objectives in a principled way?

  • Multi-type response modeling: In social media, users respond to items in multiple ways, e.g., clicks, shares, tweets, emails, and likes. How can we jointly model such different types of user responses in order to find the items that a user truly wants to be recommended?

  • Whole page optimization: On a web page, there can be multiple recommender modules. For example, one recommends news articles, another recommends updates from a user’s friends, and yet another recommends online discussions the user may be interested in. How can we jointly optimize multiple recommender modules on a page to leverage the correlation among modules and to ensure consistency, diversity, and serendipity?

  • Collaborative content creation: Wikipedia demonstrated high-quality content creation through massive collaboration. However, in most recommender systems, items to be recommended are created by a single party (e.g., a publisher or a user). How can we synthesize items at the right level of granularity to recommend to users in a semiautomatic collaborative way?

References



  1. Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17:734–749
  2. Agarwal D, Chen BC (2009) Regression-based latent factor models. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, New York, NY, USA, KDD '09, pp 19–28. doi:10.1145/1557019.1557029. URL http://doi.acm.org/10.1145/1557019.1557029
  3. Agarwal D, Chen BC, Elango P (2009) Spatio-temporal models for estimating click-through rate. In: Proceedings of the 18th international conference on World wide web, ACM, New York, NY, USA, WWW '09, pp 21–30. doi:10.1145/1526709.1526713. URL http://doi.acm.org/10.1145/1526709.1526713
  4. Agarwal D, Chen BC, Elango P (2010) Fast online learning through offline initialization for time-sensitive recommendation. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, New York, NY, USA, KDD '10, pp 703–712. doi:10.1145/1835804.1835894. URL http://doi.acm.org/10.1145/1835804.1835894
  5. Agarwal D, Chen BC, Elango P, Wang X (2011a) Click shaping to optimize multiple objectives. In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, New York, NY, USA, KDD '11, pp 132–140. doi:10.1145/2020408.2020435. URL http://doi.acm.org/10.1145/2020408.2020435
  6. Agarwal D, Chen BC, Long B (2011b) Localized factor models for multi-context recommendation. In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, New York, NY, USA, KDD '11, pp 609–617. doi:10.1145/2020408.2020504. URL http://doi.acm.org/10.1145/2020408.2020504
  7. Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on Management of data, ACM, New York, NY, USA, SIGMOD '93, pp 207–216. doi:10.1145/170035.170072. URL http://doi.acm.org/10.1145/170035.170072
  8. Ahmed A, Low Y, Aly M, Josifovski V, Smola AJ (2011) Scalable distributed inference of dynamic user interests for behavioral targeting. In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, New York, NY, USA, KDD '11, pp 114–122. doi:10.1145/2020408.2020433. URL http://doi.acm.org/10.1145/2020408.2020433
  9. Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022. URL http://dl.acm.org/citation.cfm?id=944919.944937
  10. Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning. Springer, New York
  11. Hu B, Jamali M, Ester M (2013) Spatio-temporal topic modeling in mobile social media for location recommendation. In: 2013 IEEE 13th International Conference on Data Mining, IEEE, pp 1073–1078
  12. Jamali M, Ester M (2010) A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the fourth ACM conference on Recommender systems, ACM, New York, NY, USA, RecSys '10, pp 135–142. doi:10.1145/1864708.1864736. URL http://doi.acm.org/10.1145/1864708.1864736
  13. Jannach D, Zanker M, Felfernig A, Friedrich G (2010) Recommender systems: an introduction. Cambridge University Press, New York. URL http://books.google.com/books?id=eygTJBd U2cC
  14. Kaplan AM, Haenlein M (2010) Users of the world, unite! the challenges and opportunities of social media. Bus Horiz 53(1):59–68. doi:10.1016/j.bushor.2009.09.003. URL http://www.sciencedirect.com/science/article/pii/S0007681309001232
  15. Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer 42(8):30–37
  16. Ricci F, Rokach L, Shapira B, Kantor PB (eds) (2015) Recommender systems handbook. Springer, New York. URL http://www.springer.com/us/book/9781489976369
  17. Salton G, Wong A, Yang CS (1975) A vector space model for automatic indexing. Commun ACM 18(11):613–620. doi:10.1145/361219.361220. URL http://doi.acm.org/10.1145/361219.361220
  18. Xiong L, Chen X, Huang TK, Schneider JG, Carbonell JG (2010) Temporal collaborative filtering with bayesian probabilistic tensor factorization. In: Proceedings of the SIAM International Conference on Data Mining, SDM 2010, April 29–May 1, 2010, Columbus, Ohio, USA, pp 211–222
  19. Yin H, Cui B (2016) Spatio-temporal recommendation in Social Media. Springer, Singapore. URL http://www.springer.com/us/book/9789811007477
  20. Yuan Q, Cong G, Zhao K, Ma Z, Sun A (2015) Who, where, when, and what: a nonparametric bayesian approach to context-aware recommendation and search for twitter users. ACM Trans Inf Syst (TOIS) 33(1):2
  21. Zheng VW, Cao B, Zheng Y, Xie X, Yang Q (2010) Collaborative filtering meets mobile recommendation: a user-centered approach. In: Proceedings of the 24th AAAI Conference on Artificial Intelligence, pp 236–241

Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. LinkedIn, Sunnyvale, USA

Section editors and affiliations

  • Gao Cong (1)
  • Bee-Chung Chen (2)
  1. Nanyang Technological University (NTU), Singapore, Singapore
  2. LinkedIn, Mountain View, United States