Learning to rank using multiple loss functions


Learning to rank has attracted much attention in information retrieval and machine learning. Prior studies have mainly focused on three types of methods: pointwise, pairwise, and listwise. Each paradigm focuses on a different aspect of the input instances sampled from the training dataset. This paper explores how to combine them to improve ranking performance. The basic idea is to incorporate the different loss functions into a single, enriched objective loss function. We present a flexible framework for incorporating multiple loss functions and, based on it, give three loss-weighting schemes. To obtain good performance, we also define several candidate loss functions and select among them experimentally. We compare the three weighting schemes on the LETOR3.0 dataset. The results demonstrate that, with a good weighting scheme, our method significantly outperforms baselines that use a single loss function and is at least comparable to state-of-the-art algorithms in most cases.
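As an illustration of the basic idea only, the following minimal sketch combines one pointwise, one pairwise, and one listwise loss into a single weighted objective for one query. Everything in it is an assumption made for illustration: the abstract does not specify the candidate losses or the three weighting schemes, so the squared-error, pairwise logistic, and top-one cross-entropy losses, the fixed `weights` tuple, and all function names are hypothetical stand-ins rather than the paper's method.

```python
import math


def pointwise_loss(scores, labels):
    """Mean squared error between each score and its relevance label."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_loss(scores, labels):
    """Mean logistic loss over pairs where document i is more relevant than j."""
    losses = [
        math.log1p(math.exp(-(scores[i] - scores[j])))
        for i in range(len(scores))
        for j in range(len(scores))
        if labels[i] > labels[j]
    ]
    return sum(losses) / len(losses) if losses else 0.0


def softmax(values):
    """Numerically stabilised softmax over a list of numbers."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(scores, labels):
    """Cross entropy between the softmax of the labels and of the scores."""
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


def combined_loss(scores, labels, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three losses; the weights here are fixed by hand."""
    w_point, w_pair, w_list = weights
    return (
        w_point * pointwise_loss(scores, labels)
        + w_pair * pairwise_loss(scores, labels)
        + w_list * listwise_loss(scores, labels)
    )


# Hypothetical relevance labels and model scores for one query's documents.
labels = [2, 1, 0]
good_scores = [2.0, 1.0, 0.0]  # ordering agrees with the labels
bad_scores = [0.0, 1.0, 2.0]   # ordering is reversed
print(combined_loss(good_scores, labels), combined_loss(bad_scores, labels))
```

Setting one weight to 1 and the others to 0 recovers a single-loss objective of the kind the baselines use, which makes the effect of a given weighting easy to inspect in this sketch.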






This work is partially supported by grants from the Natural Science Foundation of China (Nos. 61402075, 61602078, 61572102, 61572098), the Natural Science Foundation of Liaoning Province, China (Nos. 201202031, 2014020003), the Ministry of Education Humanities and Social Science Project (No. 16YJCZH12), and the Fundamental Research Funds for the Central Universities.

Author information



Corresponding author

Correspondence to Hongfei Lin.


About this article


Cite this article

Lin, Y., Wu, J., Xu, B. et al. Learning to rank using multiple loss functions. Int. J. Mach. Learn. & Cyber. 10, 485–494 (2019). https://doi.org/10.1007/s13042-017-0730-4



  • Learning to rank
  • Loss function
  • Gradient descent
  • Incorporation
  • Weighting scheme