# Time Series Forecasting by Recommendation: An Empirical Analysis on Amazon Marketplace

## Abstract

This study proposes a forecasting methodology for univariate time series (TS) using a Recommender System (RS). The RS is built from a given TS as only input data and following an item-based Collaborative Filtering approach. A set of top-*N* values is recommended for this TS which represent the forecasts. The idea is to emulate RS elements (the users, items and ratings triple) from the TS. Two TS obtained from Italy’s Amazon webpage were used to evaluate this methodology and very promising performance results were obtained, even the difficult environment chosen to conduct forecasting (short length and unevenly spaced TS). This performance is dependent on the similarity measure used and suffers from the same problems that other RSs (e.g., cold-start). However, this approach does not require high computational power to perform and its intuitive conception allows for being deployed with any programming language.

## Keywords

Collaborative Filtering Time series Forecasting Data science## 1 Introduction

Broadly speaking, autocorrelation is the comparison of a time series (TS) with itself at a different time. Autocorrelation measures the linear relationship between lagged values of a TS and is central to numerous forecasting models that incorporate autoregression. In this study, this idea is borrowed and incorporated to a recommender system (RS) with a forecasting purpose.

RSs apply knowledge discovery techniques to the problem of helping users to find interesting items. Among the wide taxonomy of recommendation methods, Collaborative Filtering (CF) [1] is the most popular approach for RSs designs [2]. CF is based on a intuitive paradigm by which items are recommended to an user looking at the preferences of the people this user trusts. *User-based* or *item-based* [3] recommendations are two common approaches for performing CF. The first evaluates the interest of a user for an item using the ratings for this item by other users, called neighbours, that have similar rating patterns. On the other hand, item-based CF considers that two items are similar if several users of the system have rated these items in a similar fashion [4]. In both cases, the first task when building a CF process is to represent the user-items interactions in the form of a rating matrix. The idea is that given a rating data by many users for many items it can be predicted a user’s rating for an item not known to the user, or identify a set of *N* items that user will like the most (top-*N* recommendation problem). The latter is the approach followed in this study.

The goal of this study is to introduce a point forecasting methodology for univariate TS using a item-based CF framework. In particular, to study the behaviour of this methodology in short lenght and unvenly spaced TS. On one side, the distinct values of a TS are considered the space of users in the RS. On the other, items are represented by the distinct values of a lagged version of this TS. Ratings are obtained by studying the frequencies of co-occurrences of values from both TS. Basically, the forecast is produced after averaging the top-*N* set of recommended items (distinct values of the shifted TS) to a particular user (a given value in the original TS).

## 2 Related Work

TS forecasting has been incorporated into recommendation processes in several works to improve the users’ experience in e-commerce sites. However, to the best of the authors’ knowledge, the use of a RS framework as a tool for producing forecasts in TS is new in literature.

### 2.1 Item-Based CF Approach

*m*users \(\mathcal {U}=\{ u_{1}, u_{2},...,u_{m} \}\), a set of

*n*items \(\mathcal {I}=\{ i_{1}, i_{2},..., i_{n} \}\), and a user-item matrix Open image in new window, with \(r_{jk}\) representing the rating of the user \(u_{j}\) (\(1 \le j \le m\)) for the item \(i_{k}\) (\(1 \le k \le n\)). In \(\mathbf {R}\), each row represents an user \(u_{j}\), and each column represents an item \(i_{k}\). Some filtering criteria may be applied by removing from \(\mathbf {R}\) those \(r_{jk}\) entries below a predefined

*b*Open image in new window threshold. One of the novelties of this study is the creation of an \(\mathbf {R}\) matrix from a TS, which is explained in the next Sect. 3. Basically, \(\mathbf {R}\) is created after using the TS under study and its lagged copy, and considering them as the space of users and items, respectively. The rating values (\(r_{jk}\)) are obtained by cross-tabulating both series. Instead of studying their relation with an autocorrelation approach, the frequency of co-occurrence of values is considered. In order to clarify how the transformation of a TS forecasting problem is adapted in this study using a RS, Table 1 identifies some assumed equivalences.

Some equivalences used in this study to build the forecasting recommender system (RS) from a given time series (TS).

Concept | Symbol | RS equivalence |
---|---|---|

Set of distinct values in TS | \(\mathcal {U}\) | Set of users with cardinality |

Set of distinct values in TS shifted | \(\mathcal {I}\) | Set of items with cardinality |

Two distinct values in the TS | \(u_{j}, u_{l}\) | A pair of users |

Two distinct values of the shifted TS | \(i_{i}, i_{j}\) | A pair of items |

Number of times a distinct value in the TS and its shifted version co-occurs | \(r_{jk}\) | Rating of user \(u_{j}\) on item \(i_{k}\) |

TS value to which perform a forecasting | \(u_{a}\) | Active user to which recommend an item |

The model-building step begins with determining a similarity between pair of items from \(\mathbf {R}\). Similarities are stored in a new matrix Open image in new window, where \(s_{ij}\) represent a similarity between items *i* and *j* (\(1 \le i,j \le n\)). \(s_{ij}\) is obtained after computing a similarity measure on those users who have rated *i* and *j* items. Sometimes, to compute the similarity is set a minimum number of customers that have selected the (*i*, *j*) pair. This quantity will be referred as the *c* Open image in new window threshold. Traditionally, among the most commonly similarity measures used are the Pearson correlation, cosine, constraint Pearson correlation and mean squared differences [5]. In this study it will be used the Cosine and Pearson correlation measures, but also, the Otsuka-Ochiai coefficient, which is borrowed from Geosciences [8] and used by leading online retailers like Amazon [9].

Some notation follows at this stage of the modelling. The vector of ratings provided for item *i* is denoted by \(\mathbf {r}_{i}\) and \(\bar{r}_{i}\) is the average value of these ratings. The set of users who has rated the item *i* is denoted by \(\mathcal {U}_{i}\), the item *j* by \(\mathcal {U}_{j}\), and the set of users who have rated both by \(\mathcal {U}_{ij}\).

**Forecasting.** The aim of item-based algorithm is to create recommendations for a user, called the active user \(u_{a} \in \mathcal {U}\), by looking into the set of items this user has rated, \(I_{u_{a}}\in \mathcal {I}\). For each item \(i \in I_{u_{a}}\), just the *k* items wich are more similar are retained in a set \(\mathcal {S}(i)\). Then, considering the rating that \(u_{a}\) has made also on items in \(\mathcal {S}(i)\), a weighted prediction measure can be applied. This approach returns a series of estimated rating for items different from those in \(I_{u_{a}}\) that can be scored. Just the top ranked items are included in the list of *N* items to be recommended to \(u_{a}\) (top-*N* recommended list). The *k* and *N* values has to be decided by the experimenter. In this study, each active user (\(u_{a}\)) was randomly selected from \(\mathcal {U}\) following a cross-validation scheme, which is explained next. To every \(u_{a}\), a set of recommendable items is presented (the forecast). The space of items to recommend is represented by the distinct values of the shifted TS. Since the aim is to provide a point forecast, the numerical values included in the top-*N* recommended list are averaged.

**Evaluation of the Recommendation.** The basic structure for offline evaluation of RS is based on the train-test setup common in machine learning [5, 6]. A usual approach is to split users in two groups, the training and test sets of users ( \(\mathcal {U}_{\,train} \cup \,\, \mathcal {U}_{\,test}=\mathcal {U}\) ). Each user in \(\mathcal {U}_{\,test}\) is considered to be an active user \(u_{a}\). Item ratings of users in the test set are split into two parts, the *query set* and the *target set*. Once the RS is built on \(\mathcal {U}_{\,train}\), the RS is provided with the query set as user history and the recommendation produced is validated against the target set. It is assumed that if a RS performs well in predicting the withheld items (target set), it will also perform well in finding good recommendations for unknown items [7]. Typical metrics for evaluation accuracy in RS are root mean square error (RMSE), mean absolute error (MAE) or other indirect functions to estimate the novelty of the recommendation (e.g., serendipity). The MAE quality measure was used in this study.

## 3 Creation of the Rating Matrix

This section describes the steps to transform a given TS (\(\mathbf {TS}\)) in a matrix of user-item ratings (\(\mathbf {R}\)). This should be considered the main contribution of this study. The remaining steps do not differs greatly from a traditional item-based approach beyond the necessary adaptations to the TS forecasting context.

Since the aim it is to emulate \(\mathbf {R}\), the first step is to generate a space of users and items from **TS**. Thus, **TS** values are rounded to the nearest integer which will be called **TS**\(_{\mathbf {0}}\). The distinct (unique) values of **TS**\(_{\mathbf {0}}\) are assumed to be \(\mathcal {U}\).

**TS**\(_{\mathbf {0}}\) is shifted forward in time Open image in new window time stamps. The new TS is called

**TS**\(_{\mathbf {1}}\). The value of

*h*depends on the forecasting horizon intended. Now, the distinct values of

**TS**\(_{\mathbf {1}}\) set \(\mathcal {I}\). Once \(\mathcal {U}\) and \(\mathcal {I}\) have been set, they are studied as a bivariate distribution of discrete random variables. The last step is to compute the joint (absolute) frequency distribution of \(\mathcal {U}\) and \(\mathcal {I}\). Thus, it is assumed that \(n_{jk}\equiv r_{jk}\), where \(n_{jk}\) representes the frequency (co-occurrence) of the \(u_{j}\) value of

**TS**\(_{\mathbf {0}}\) and \(i_{k}\) value of

**TS**\(_{\mathbf {1}}\). These steps are summarized below.

**Some Considerations.** The followed approach experiences the same processing problems that a conventional item-based approach, namely, sparsity in \(\mathbf {R}\) and *cold start* situations (user and items with low or inexistent numbers of ratings). In large RS from e-commerce retailers, usually \(n\ll m\). However, under this approach \(n\simeq m\).

## 4 Experimental Evaluation

This section describes the sequential steps followed in this study for creating a item-based CF RS with the purpose of TS forecasting.

### 4.1 Data Sets

*Amazon devices*category. The second TS (TS-2) represents the price change of the most dynamic best-selling product from this marketplace site. Italy’s Amazon webpage has experienced changes during the eight month period of the crawling process due to commercial reasons. The main change is related to the number of best-selling products being showed for each category (e.g., 20, 50 or 100). This causes the period of the scraping cycles is not similar and the TS derived from this data (TS-1 and TS-2) are not equally spaced at time intervals. In this study, the data obtained in the scraping process will be considered as a sequence of time events. Therefore, it is worth to note that the values of TS-1 were obtained after averaging a different number of products (20, 50, or 100), according to the number of products shown by Amazon at different times. Their main characteristics are shown in Table 2.

TS characteristics used in the methodology testing (P: percentile; min: minimum value, max: maximum value; in €).

Abbreviation | Length | min | P50 | max | P75-P25 | Distinct values ( |
---|---|---|---|---|---|---|

TS-1 | 2169 | 13 | 84 | 142 | 14 | 82 |

TS-2 | 1224 | 7 | 17 | 36 | 8 | 22 |

### 4.2 Creation of the Rating Matrices

A rating matrix \(\mathbf {R}\) was created for TS-1 and TS-2 according to the algorithm described in Sect. 3. As mentioned before, the different duration of scraping cycles causes unevenly spaced observations in the TS. Also, it represents an additional difficulty when setting a constant forecasting horizon (*h*) as described in the algorithm. Therefore, in this study, it will be necessary to assume that the forecasting horizon coincides with the duration of the scraping cycle, independently of its length in time. In practice, this means that the TS representing the items (**TS**\(_{\mathbf {1}}\)) was obtained after lagging one position forward in time with respect the original TS representing the users (**TS**\(_{\mathbf {0}}\)). After empirical observation, *h* approximately takes values 1 h, 2 h or 4 h depending on Amazon shows the 20, 50 or 100 best-selling products for each category, respectively. The lack of proportionality between the duration of scraping cycles and the number of best-selling product shown is explained by technical reasons in the crawling process (structure of the Amazon’s webpage for each product affecting the depth of the scraping process). Those ratings with a value *b* \(\le 3\) were removed from the corresponding \(\mathbf {R}\).

### 4.3 Similarity Matrix Computation

*c*\(\ge 3\), and set the value of \(k=3\).

### 4.4 Generation of the Top *N*-Recommendation

*j*:

*N*recommended list.

### 4.5 Evaluation

The set of users (\(\mathcal {U}\)) was splitted following a 80:20 proportion for obtaining the training and test sets of users (\(\mathcal {U}_{\,train}\) and \(\mathcal {U}_{\,test}\)), respectively. Every user in \(\mathcal {U}_{\,test}\) (20% of distinct values in **TS**\(_{0}\)) was considered an active user (\(u_{a}\)) to whom recommend a set of items. From the history of each \(u_{a}\), again the proportion 80:20 proportion was used to obtain the *query* and *target* sets, respectively. The produced recommendation was validated against the target set of each \(u_{a}\).

**Evaluation Metric.**The MAE value was obtained according to (5):

*n*represent the number of active users to whom a recommendation has been suggested. The

*N*value to generate a top-

*N*recommendation was set dynamically according to the length of items in the

*target*set for each \(u_{a}\). Thus, the value of

*e*is the difference between the average value of the top-

*N*recommended items and the average value of the items in the target set.

### 4.6 Performance Results

The MAE results for assessing the quality of the forecasting proposal is shown in Table 3, for the two analysed TS and three similarity measures studied.

MAE performance on both TS and the different similarity measures (\(s_{\mathbf {1}}\), \(s_{\mathbf {2}}\) and \(s_{\mathbf {3}}\): Pearson correlation, cosine and Otsuka-Ochiai similarities, respectively), in €.

TS-1 | TS-2 | |||||
---|---|---|---|---|---|---|

\(s_{\mathbf {1}}\) | \(s_{\mathbf {2}}\) | \(s_{\mathbf {3}}\) | \(s_{\mathbf {1}} \) | \(s_{\mathbf {2}}\) | \(s_{\mathbf {3}}\) | |

MAE | 6.2 | 6.5 | 5.5 | 6.0 | 3.2 | 3.0 |

It can be seen that Otsuka-Ochiai similarity (\(s_{\mathbf {3}}\)) yields a better performance on both TS studied. It is necessary to remind the characteristics of the TS used for testing this methodology (TS-1 and TS-2). They are characterized by unevenly spaced observations in the TS, but also, the variable forecasting horizon provided by the data acquisition environment. This represents a very complex environment for performing forecasting. In the case of forecasting the average price for a given category, or the price for a given product, the forecasting results could be considered an estimation of the trend of such prices (uptrend or downtrend) more than aiming to forecast an exact value of such prices. Other experiences have been accomplished to test this methodology with ozone (O\(_{3}\)) TS (results now shown). These experiences with conventional TS (evenly spaced observations and constant forecasting horizon) has yield a very good results in term of forecasting. Therefore, TS-1 and TS-2 should be considered to represent an extreme environment in which perform forecasting. One of the advantage of this approach is that does not require high computational power to operate, due to the analysis of the distinct rounded values of a given TS, which is independent of its length. Besides, the proposed approach is very intuitive and allows for further developments to be included and being deployed using any programming language.

### 4.7 Computational Implementation

*ad hoc*designed Python functions except those specified in Table 4. The first two purposes correspond to the creation of the Rating matrix (\(\mathbf {R}\)), and the remaining ones to the similarity matrix computation (\(\mathbf {S}\)). In particular, the last two ones are used in the computation of the cosine similarity measure used in this work.

Python functions for specific tasks.

Purpose | Function | Package |
---|---|---|

| shift | pandas |

Cross-tabulation | crosstab | pandas |

To iterate over pairs of items | itertools | itertools |

Dot product | dot | numpy |

\(l_{2}\) norm | norm | numpy.linalg |

## 5 Conclusions

This study aims to produce a point forecast for a TS adopting a RS approach, in particular an item-based CF. To that end, basic elements in a RS (the users, items and ratings triple) was emulated using a TS as only input data. An autocorrelation-based algorithm is introduced to create a recommendation matrix from a given TS. This methodology was tested using two TS obtained from Italy’s Amazon webpage. Performance results are promising even the analyzed TS represent a very difficult setting in which to conduct forecasting. This is due to the TS are unevenly spaced and the forecasting horizon is not constant. Application of classical forecasting approaches (e.g., autoregression models) to this type of TS is not possible mainly due to the irregular time stamps in which observations are obtained. Thus, the introduced algorithm should be considered a contribution to the forecasting practice in both short length and unevenly spaced TS. Complementary, computational time required to obtain forecasting estimates is short due to such estimates are obtained considering distinct values of the TS are not all the values forming the TS. Further developments include to consider contextual information when building the RS and transforming the sequence of events to a TS with evenly spaced data.

## Notes

### Disclaimer

The views expressed are purely those of the authors and may not in any circumstances be regarded as stating an official position of the European Commission.

## References

- 1.Goldberg, D., Nichols, D., Oki, B.M., Terry, D.: Using collaborative filtering to weave an information tapestry. Commun. ACM Spec. Issue Inf. Filter.
**35**, 61–70 (1992). https://doi.org/10.1145/138859.138867CrossRefGoogle Scholar - 2.Sharma, R., Gopalani, D., Meena, Y.: Collaborative filtering-based recommender system: approaches and research challenges. In: 3rd International Conference on Computational Intelligence & Communication Technology, pp. 1–6. IEEE Press (2017). https://doi.org/10.1109/CIACT.2017.7977363
- 3.Sarwar, B., Karypis, G., Konstan, J., Reidl, J.: Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th International Conference on World Wide Web, Hong Kong, 2001, pp. 285–295. ACM, New York (2001). https://doi.org/10.1145/371920.372071
- 4.Desrosiers, C., Karypis, G.: A comprehensive survey of neighborhood-based recommendation methods. In: Ricci, F., Rokach, L., Shapira, B., Kantor, P.B. (eds.) Recommender Systems Handbook, pp. 107–144. Springer, Boston, MA (2011). https://doi.org/10.1007/978-0-387-85820-3_4CrossRefGoogle Scholar
- 5.Bobadilla, J., Ortega, F., Hernando, A., GutiéRrez, A.: Recommender Systems Survey. Knowl. Based Syst.
**46**, 109–132 (2013). https://doi.org/10.1016/j.knosys.2013.03.012CrossRefGoogle Scholar - 6.Ekstrand, M.D., Riedl, J.T., Konstan, J.A.: Collaborative filtering recommender systems. Found. Trends Hum. Comput. Interact.
**4**(2), 81–173 (2010). https://doi.org/10.1561/1100000009CrossRefGoogle Scholar - 7.Haslher, M., Vereet, B.: recommenderlab: A Framework for Developing and Testing Recommendation Algorithms (2018). https://CRAN.R-project.org/package=recommenderlab
- 8.Ochiai, A.: Zoogeographical studies on the soleoid fishes found in Japan and its neighhouring regions-II. Bull. Japan. Soc. Sci. Fish
**22**, 526–530 (1957). https://doi.org/10.2331/suisan.22.526CrossRefGoogle Scholar - 9.Jacobi, J.A., Benson, E.A., Linden, G.D.: Personalized recommendations of items represented within a database. US Patent US7113917B2 (to Amazon Technologies Inc.) (2006). https://patents.google.com/patent/US7113917B2/en
- 10.Breese, J.S, Heckerman, D., Kadie, C.: Empirical analysis of predictive algorithms for collaborative filtering. In: Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence, pp. 43–52. Morgan Kaufmann Publishers (1998)Google Scholar

## Copyright information

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.