Investigating Result Usefulness in Mobile Search

  • Jiaxin Mao (email author)
  • Yiqun Liu
  • Noriko Kando
  • Cheng Luo
  • Min Zhang
  • Shaoping Ma
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10772)


Existing evaluation approaches for search engines usually measure or estimate the utility (usefulness) of search results through either explicit relevance annotations from external assessors or implicit behavioral signals from users. Because mobile search differs from desktop search in both the search tasks and the presentation styles of SERPs, whether approaches that originated in desktop settings remain valid in the mobile scenario requires further investigation. To address this problem, we conduct a laboratory user study that records users’ search behaviors and collects their usefulness feedback on search results when using mobile devices. By analyzing the collected data, we investigate and characterize how the relevance of a result, as well as its ranking position and presentation style, affects its user-perceived usefulness. We identify a moderating effect of presentation style on the correlation between relevance and usefulness, as well as a position bias that affects usefulness in the initial viewport. By correlating result-level usefulness feedback and relevance annotations with query-level satisfaction, we confirm that usefulness feedback reflects user satisfaction better than relevance annotations do in mobile search. We also study the relationship between users’ usefulness feedback and their implicit search behavior, showing that viewport features can be used to estimate usefulness when click signals are absent. Our study highlights the differences between desktop and mobile search and sheds light on developing a more user-centric evaluation method for mobile search.


Keywords: Mobile search · Evaluation · User behavior analysis



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Jiaxin Mao (1, email author)
  • Yiqun Liu (1)
  • Noriko Kando (2)
  • Cheng Luo (1)
  • Min Zhang (1)
  • Shaoping Ma (1)

  1. Tsinghua University, Beijing, China
  2. National Institute of Informatics, Tokyo, Japan
