
Position Bias Estimation for Unbiased Learning-to-Rank in eCommerce Search

  • Conference paper
  • Published in: String Processing and Information Retrieval (SPIRE 2019)
  • Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11811)

Abstract

The Unbiased Learning-to-Rank framework [16] has been recently proposed as a general approach to systematically remove biases, such as position bias, from learning-to-rank models. The method takes two steps: estimating click propensities and using them to train unbiased models. Most common methods proposed in the literature for estimating propensities involve some degree of intervention in the live search engine. An alternative approach proposed recently uses an Expectation Maximization (EM) algorithm to estimate propensities, using ranking features to estimate relevances [21]. In this work we propose a novel method to directly estimate propensities which neither intervenes in live search nor relies on modeling relevance. Rather, we take advantage of the fact that the same query-document pair may naturally change ranks over time. This typically occurs in eCommerce search because of changes in item popularity over time, time-dependent ranking features, or the addition or removal of items in the index (an item getting sold or a new item being listed). However, our method is general and can be applied to any search engine in which the rank of the same document may naturally change over time for the same query. We derive a simple likelihood function that depends on propensities only, and by maximizing the likelihood we obtain estimates of the propensities. We apply this method to eBay search data to estimate click propensities for web and mobile search and compare these with estimates using the EM method [21]. We also use simulated data to show that the method gives reliable estimates of the “true” simulated propensities. Finally, we train an unbiased learning-to-rank model for eBay search using the estimated propensities and show that it outperforms both baselines: one without position bias correction and one with position bias correction using the EM method.

Notes

  1.

    Note that keeping only query-document pairs that appeared at two ranks exactly is in no way a requirement of our method. The method is general and can be used for query-document pairs that appeared more than twice. This is just intended to simplify our analysis without a significant loss in data, since it is rare for the same query-document pair to appear at more than two ranks.

  2.

    Note that these ranking models are significantly different from the eBay production ranker, the details of which are proprietary.

  3.

    This is true for our data as discussed in Sect. 4. For cases where most query-document pairs receive multiple clicks we suggest using a different method, such as estimating the propensity ratios from the ratios of click counts.

References

  1. Agarwal, A., Zaitsev, I., Wang, X., Li, C., Najork, M., Joachims, T.: Estimating position bias without intrusive interventions. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pp. 474–482. ACM (2019)

  2. Ai, Q., Bi, K., Luo, C., Guo, J., Croft, W.B.: Unbiased learning to rank with unbiased propensity estimation. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 385–394. ACM (2018)

  3. Burges, C.J.: From RankNet to LambdaRank to LambdaMART: an overview. Technical report, June 2010

  4. Carterette, B., Chandar, P.: Offline comparative evaluation with incremental, minimally-invasive online feedback. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR 2018, pp. 705–714. ACM, New York (2018). https://doi.org/10.1145/3209978.3210050

  5. Casella, G., George, E.I.: Explaining the Gibbs sampler. Am. Stat. 46(3), 167–174 (1992)

  6. Chapelle, O., Zhang, Y.: A dynamic Bayesian network click model for web search ranking. In: Proceedings of the 18th International Conference on World Wide Web, pp. 1–10. ACM (2009)

  7. Chuklin, A., Markov, I., de Rijke, M.: Click models for web search. Synth. Lect. Inf. Concepts Retrieval Serv. 7(3), 1–115 (2015)

  8. Craswell, N., Zoeter, O., Taylor, M., Ramsey, B.: An experimental comparison of click position-bias models. In: Proceedings of the 2008 International Conference on Web Search and Data Mining, pp. 87–94. ACM (2008)

  9. Dupret, G.E., Piwowarski, B.: A user browsing model to predict search engine click data from past observations. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 331–338. ACM (2008)

  10. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Statist. 29(5), 1189–1232 (2001). https://doi.org/10.1214/aos/1013203451

  11. Guo, F., et al.: Click chain model in web search. In: Proceedings of the 18th International Conference on World Wide Web, pp. 11–20. ACM (2009)

  12. Guo, F., Liu, C., Wang, Y.M.: Efficient multiple-click models in web search. In: Proceedings of the Second ACM International Conference on Web Search and Data Mining, pp. 124–131. ACM (2009)

  13. He, J., Zhai, C., Li, X.: Evaluation of methods for relative comparison of retrieval systems based on clickthroughs. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management, pp. 2029–2032. ACM (2009)

  14. Hofmann, K., Whiteson, S., de Rijke, M.: A probabilistic method for inferring preferences from clicks. In: Proceedings of the 20th ACM International Conference on Information and Knowledge Management, pp. 249–258. ACM (2011)

  15. Joachims, T., Granka, L., Pan, B., Hembrooke, H., Gay, G.: Accurately interpreting clickthrough data as implicit feedback. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2005, pp. 154–161. ACM, New York (2005). https://doi.org/10.1145/1076034.1076063

  16. Joachims, T., Swaminathan, A., Schnabel, T.: Unbiased learning-to-rank with biased feedback. In: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, WSDM 2017, pp. 781–789. ACM, New York (2017). https://doi.org/10.1145/3018661.3018699

  17. Joachims, T., et al.: Evaluating retrieval performance using clickthrough data (2003)

  18. Li, H.: A short introduction to learning to rank. IEICE Trans. Inf. Syst. 94(10), 1854–1862 (2011)

  19. Radlinski, F., Joachims, T.: Minimally invasive randomization for collecting unbiased preferences from clickthrough logs (2006)

  20. Radlinski, F., Kleinberg, R., Joachims, T.: Learning diverse rankings with multi-armed bandits. In: Proceedings of the 25th International Conference on Machine Learning, pp. 784–791. ACM (2008)

  21. Wang, X., Golbandi, N., Bendersky, M., Metzler, D., Najork, M.: Position bias estimation for unbiased learning to rank in personal search. In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, WSDM 2018, pp. 610–618. ACM, New York (2018). https://doi.org/10.1145/3159652.3159732

Author information

Correspondence to Grigor Aslanyan.

Appendices

A Likelihood Function Simplification

There are multiple approaches one can take to estimate the propensities, depending on the data. Let us first consider the query-document pairs that appeared at only one rank. The parameters \(p_i\) and \(z_j\) enter the likelihood function (2) only as a product. These query-document pairs can therefore help estimate the product of the propensity at the rank they appeared at and the relevance \(z_j\), but not each factor individually. With \(z_j\) unknown, this does not help estimate the propensity. We should mention that, given a reliable prior for \(z_j\) and/or \(p_i\), the likelihood function above can be used even for query-document pairs that appeared at only one rank. In this case it would be more useful to take a Bayesian approach and estimate the posterior distribution of the propensities, for example using Gibbs sampling [5].

From now on we will assume that the query-document pairs appear at at least two different ranks. The other extreme is the case when each query-document pair appears a large number of times at different ranks, so that every rank accumulates many impressions of the same pairs. In this case the propensity ratio for two ranks can be estimated simply by taking the ratio of the click-through rates of the same query-document pairs at those ranks, as in the sketch below.
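As a concrete illustration of this high-data regime, here is a minimal sketch (not from the paper) of the click-through-rate ratio estimate. The function name, the (pair_id, rank, clicked) record layout, and the toy log are hypothetical.

```python
from collections import defaultdict

def propensity_ratio(impressions, rank_a, rank_b):
    """impressions: iterable of (pair_id, rank, clicked) records."""
    # Per query-document pair, count [clicks, views] at each of the two ranks.
    counts = defaultdict(lambda: {rank_a: [0, 0], rank_b: [0, 0]})
    for pair_id, rank, clicked in impressions:
        if rank in (rank_a, rank_b):
            counts[pair_id][rank][0] += int(clicked)
            counts[pair_id][rank][1] += 1
    # Average the per-pair CTRs over pairs seen at *both* ranks, so the
    # relevance z_j cancels out in the ratio p_a / p_b.
    ctr_a = ctr_b = 0.0
    n = 0
    for c in counts.values():
        if c[rank_a][1] > 0 and c[rank_b][1] > 0:
            ctr_a += c[rank_a][0] / c[rank_a][1]
            ctr_b += c[rank_b][0] / c[rank_b][1]
            n += 1
    return ctr_a / ctr_b if n > 0 and ctr_b > 0 else None

# Toy usage: three impressions of the same pair at ranks 1 and 2.
log = [("q1-d7", 1, True), ("q1-d7", 2, False), ("q1-d7", 2, True)]
print(propensity_ratio(log, rank_a=1, rank_b=2))  # 1.0 / 0.5 = 2.0
```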

Let us now consider the case where the data consists of a large number of query-document pairs that appeared a few times (as few as twice) at different ranks, but not often enough to obtain reliable propensity estimates from ratios of click-through rates. In this case we actually need to maximize the likelihood above and eliminate the nuisance parameters \(z_j\) to obtain estimates of the \(p_i\). We focus on this case for the rest of this work. The data we have collected from eBay search logs also falls in this category, as discussed in Sect. 4.

If a query-document pair appeared only a few times, there is a good chance that it did not receive any clicks. Such query-document pairs do not help in estimating the propensities by likelihood maximization because of the unknown parameter \(z_j\). Specifically, for such query-document pairs the likelihood contains the terms \(\prod _{k=1}^{m_j}(1-p_{r_{jk}}z_j)\). Under maximum likelihood estimation the maximum is reached at \(z_j=0\), for which these terms equal 1, so query-document pairs without any clicks do not change the maximum likelihood estimate of the propensities. For that reason we only keep query-document pairs that received at least one click. However, we cannot simply drop the terms for query-document pairs without clicks from the likelihood function; doing so would bias the data towards query-document pairs with a higher likelihood of a click. Instead, we replace the likelihood function above by a conditional probability. Specifically, the likelihood function (2) computes the probability of the click data \(\{c_{jk}\}\) obtained for that query-document pair. We replace that probability by a conditional probability: the probability of the click data \(\{c_{jk}\}\) given that at least one click was received, i.e. \(\sum _kc_{jk}>0\). The likelihood function for the query-document pair \(x_j\) then takes the form:

$$\begin{aligned} \begin{aligned} \mathcal {L}_j(p_i,z_j|D_j)&=P\left( D_j|\sum _k c_{jk}>0\right) \\&=\frac{P(D_j\cap \sum _k c_{jk}>0)}{P(\sum _k c_{jk}>0)}=\frac{P(D_j)}{P(\sum _k c_{jk}>0)}\\&=\frac{\prod _{k=1}^{m_j}\left[ c_{jk}p_{r_{jk}}z_j+(1-c_{jk})(1-p_{r_{jk}}z_j)\right] }{1-\prod _{k=1}^{m_j}(1-p_{r_{jk}}z_j)}\,. \end{aligned} \end{aligned}$$
(4)

Here \(\mathcal {L}_j\) denotes the likelihood function for the query-document pair \(x_j\), \(D_j=\{c_{jk}\}\) denotes the click data for query-document pair \(j\), and \(P\) denotes probability; \(\sum _k c_{jk} > 0\) simply means that at least one click was received. In the first line above we have replaced the probability of the data \(D_j\) by a conditional probability, and the second line uses the definition of conditional probability. The joint probability of \(D_j\) and of at least one click equals the probability of \(D_j\), since we only keep query-document pairs that received at least one click; this gives the second equality of the second line. Finally, in the last line we have written out \(P(D_j)\) in the numerator as in (2), and the probability of at least one click in the denominator (the probability of no click is \(\prod _{k=1}^{m_j}(1-p_{r_{jk}}z_j)\), so the probability of at least one click is 1 minus that).

The full likelihood is then the product of \(\mathcal {L}_j\) for all query-document pairs:

$$\begin{aligned} \mathcal {L}(p_i,z_j|D)=\prod _{\begin{array}{c} j=1\\ \sum _k c_{jk} > 0 \end{array}}^N\frac{\prod _{k=1}^{m_j}\left[ c_{jk}p_{r_{jk}}z_j+(1-c_{jk})(1-p_{r_{jk}}z_j)\right] }{1-\prod _{k=1}^{m_j}(1-p_{r_{jk}}z_j)}\,. \end{aligned}$$
(5)

From now on we will assume by default that our dataset contains only query-document pairs that received at least one click and will omit the subscript \(\sum _k c_{jk} > 0\).

Our last step is to simplify the likelihood function (5). Typically the click probabilities \(p_iz_j\), i.e. the probability that query-document pair \(j\) receives a click when displayed at rank \(i\), are not very large (not close to 1). To simplify the likelihood for each query-document pair we keep only terms linear in \(p_iz_j\) and drop higher-order terms like \(p_{i_1}z_{j_1}p_{i_2}z_{j_2}\). We have verified this simplifying assumption for our data in Sect. 4, and in general we expect it to hold for most search engines. It is certainly valid for lower ranks, since click-through rates are typically much smaller there. Since we are dropping product terms, the largest ones would be between ranks 1 and 2. For most search engines the click-through rates at rank 2 are around 10% or below, which we believe is small enough to safely ignore the product terms (they are at least 10 times smaller than the linear terms). We show empirically using simulations in Appendix B that this assumption works very well for data similar to eBay data. If for another search engine the click-through rates at the topmost ranks are much larger, we suggest keeping only those query-document pairs that appeared at least once at a low enough rank. Using the simulation methodology of Appendix B, one can also verify how well this assumption holds for a particular dataset; a quick numerical check of the size of the dropped terms is sketched below.
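The following is a small numerical check (not from the paper) of the approximation just described, for two impressions of the same pair with click probabilities around 10%; the specific numbers are illustrative.

```python
import numpy as np

# Two impressions of the same pair with click probabilities p_1 z ~ 0.10 and p_2 z ~ 0.08.
click_probs = np.array([0.10, 0.08])
exact = 1 - np.prod(1 - click_probs)   # exact probability of at least one click: 0.172
linear = click_probs.sum()             # linear approximation z * sum(p_i): 0.180
print(exact, linear, abs(exact - linear) / exact)  # relative error under 5%
```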

Under the simplifying assumption we get for the denominator in (5):

$$\begin{aligned} 1-\prod _{k=1}^{m_j}(1-p_{r_{jk}}z_j)\simeq 1-\left( 1-\sum _{k=1}^{m_j}p_{r_{jk}}z_j\right) =z_j\sum _{k=1}^{m_j}p_{r_{jk}}\,. \end{aligned}$$
(6)

Let us now simplify the numerator of (5). Firstly, since the click probabilities are not large and each query-document pair appears only a few times, we can assume there is only one click per query-document pair (see Note 3), i.e. \(c_{jl_j}=1\) and \(c_{jk}=0\) for \(k\ne l_j\). The numerator then simplifies to

$$\begin{aligned} \prod _{k=1}^{m_j}\left[ c_{jk}p_{r_{jk}}z_j+(1-c_{jk})(1-p_{r_{jk}}z_j)\right] =p_{r_{jl_j}}z_j\prod _{\begin{array}{c} k=1\\ k\ne l_j \end{array}}^{m_j}(1-p_{r_{jk}}z_j) \simeq p_{r_{jl_j}}z_j\,. \end{aligned}$$
(7)

Using (6) and (7) the likelihood function (5) simplifies to

$$\begin{aligned} \mathcal {L}(p_i,z_j|D)=\prod _{j=1}^N\frac{p_{r_{jl_j}}z_j}{z_j\sum _{k=1}^{m_j}p_{r_{jk}}}=\prod _{j=1}^N\frac{p_{r_{jl_j}}}{\sum _{k=1}^{m_j}p_{r_{jk}}}\,. \end{aligned}$$
(8)

In the last step \(z_j\) cancels out between the numerator and the denominator. Our assumption of small click probabilities, together with keeping only query-document pairs that received at least one click, has allowed us to simplify the likelihood function into a function of the propensities only. Now we can simply maximize the likelihood (8) to estimate the propensities.

Equation (8) makes it clear why we require each query-document pair to appear more than once at different ranks. For a query-document pair that appeared only once (or multiple times but always at the same rank), the numerator and the denominator in (8) cancel up to a constant factor, so the pair does not affect the maximization. For that reason we keep only query-document pairs that appeared at at least two different ranks.

It is numerically better to maximize the log-likelihood, which takes the form:

$$\begin{aligned} \log \mathcal {L}(p_i|D)=\sum _{j=1}^N\left( \log (p_{r_{jl_j}})-\log \sum _{k=1}^{m_j}p_{r_{jk}}\right) \,. \end{aligned}$$
(9)
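For concreteness, here is a minimal sketch (not the authors' implementation) of maximizing the log-likelihood (9) numerically with SciPy. The data layout, the normalization \(p_1=1\), and the log parameterization are assumptions made for illustration; since (9) depends only on ratios of propensities, the overall scale must be fixed by some convention.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(log_p_rest, observations):
    # Full propensity vector: p[0] = 1 for rank 1 (normalization), the rest are free.
    p = np.concatenate(([1.0], np.exp(log_p_rest)))
    nll = 0.0
    for ranks, clicked_rank in observations:
        # Eq. (9): log L_j = log p_{clicked rank} - log sum_k p_{rank_k}
        nll -= np.log(p[clicked_rank - 1]) - np.log(p[np.asarray(ranks) - 1].sum())
    return nll

def estimate_propensities(observations, max_rank):
    x0 = np.zeros(max_rank - 1)  # start from p_i = 1 for every rank
    res = minimize(neg_log_likelihood, x0, args=(observations,), method="L-BFGS-B")
    return np.concatenate(([1.0], np.exp(res.x)))

# Toy usage: each observation is (ranks the pair appeared at, rank that was clicked).
obs = [([1, 3], 1), ([2, 5], 2), ([1, 4], 4), ([3, 6], 3), ([2, 4], 2)]
print(estimate_propensities(obs, max_rank=6))
```

The exponential parameterization keeps every propensity positive; because only ratios enter (9), the resulting estimates are reported relative to rank 1.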
Fig. 2. Propensity estimated from simulated data. The green solid curve shows the “true” propensity (10). The blue solid curve is the estimated propensity using the direct estimation method. The red dashed curve is the estimation using interpolation. (Color figure online)

B Results on Simulations

In this Appendix we use simulated data to verify that the method of estimating propensities developed in Sect. 3 works well. For our simulations we choose the following propensity function as truth:

$$\begin{aligned} p_i^{\mathrm {sim}}=\min \left( \frac{1}{\log {i}},1\right) \end{aligned}$$
(10)

which assigns a propensity of 1 to ranks 1 and 2 and then decreases as the inverse of the logarithm of the rank.

Apart from choosing our own propensities, we simulate the data to be as similar to the eBay dataset as possible. We generate a large number of query-document pairs and randomly choose a mean rank \(rank_{mean}\) for each query-document pair uniformly between 1 and 500. We then randomly generate a click probability \(z\) for that query-document pair depending on \(rank_{mean}\), choosing the distribution from which the click probabilities are drawn such that the click-through rates at each rank closely match those of the real data, taking into account the “true” propensities (10). Next we generate two different ranks drawn from \(\mathcal {N}(rank_{mean}, (rank_{mean} / 5)^2)\), and for each rank \(i\) we compute the probability of a click as \(zp_i^{\mathrm {sim}}\). We then keep only those query-document pairs which appeared at two different ranks and received at least one click, in agreement with the procedure used for the real eBay data. Finally, we keep about 40,000 query-document pairs so that the simulated data is similar to the eBay web search data in size. This becomes the simulated data; a sketch of the generation procedure is given below.
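A minimal sketch of this generation procedure, under stated assumptions: the relevance distribution below is a simple placeholder (the paper tunes it to match real click-through rates), and the function names, random seed, and toy run size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_propensity(rank):
    # Eq. (10): p_i = min(1 / log i, 1); log 1 = 0 and 1 / log 2 > 1, so p_1 = p_2 = 1.
    return 1.0 if rank <= 2 else min(1.0 / np.log(rank), 1.0)

def simulate(n_pairs=40_000, max_rank=500):
    data = []  # each entry: (the two ranks, the clicked rank)
    while len(data) < n_pairs:
        rank_mean = rng.uniform(1, max_rank)
        # Placeholder relevance distribution, decreasing with mean rank.
        z = min(1.0, rng.exponential(0.2) / np.sqrt(rank_mean))
        ranks = np.clip(np.round(rng.normal(rank_mean, rank_mean / 5, size=2)),
                        1, max_rank).astype(int)
        if ranks[0] == ranks[1]:
            continue  # keep only pairs that appeared at two different ranks
        clicks = [rng.random() < z * true_propensity(int(r)) for r in ranks]
        if not any(clicks):
            continue  # keep only pairs with at least one click
        # Single-click assumption (Note 3): if both impressions were clicked, keep the first.
        clicked_rank = int(ranks[0]) if clicks[0] else int(ranks[1])
        data.append((ranks.tolist(), clicked_rank))
    return data

sim_data = simulate(n_pairs=1_000)  # smaller toy run
```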

The estimated propensities for the simulated dataset are shown in Fig. 2. The green solid curve shows the true propensity (10), the blue solid curve shows the estimated propensity using the direct estimation method, and the red dashed curve is the estimated propensity using interpolation. As we can see, the estimates closely match the truth. Furthermore, the interpolation method gives a better result by reducing the noise in the estimate. These results show that the propensity estimation method developed in this paper works well.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Aslanyan, G., Porwal, U. (2019). Position Bias Estimation for Unbiased Learning-to-Rank in eCommerce Search. In: Brisaboa, N., Puglisi, S. (eds.) String Processing and Information Retrieval. SPIRE 2019. Lecture Notes in Computer Science, vol. 11811. Springer, Cham. https://doi.org/10.1007/978-3-030-32686-9_4


  • DOI: https://doi.org/10.1007/978-3-030-32686-9_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32685-2

  • Online ISBN: 978-3-030-32686-9
