Information Retrieval Journal, Volume 22, Issue 1–2, pp 159–187

Those were the days: learning to rank social media posts for reminiscence

  • Kaweh Djafari Naini
  • Ricardo Kawase
  • Nattiya Kanhabua
  • Claudia Niederée
  • Ismail Sengor Altingovde
Social Media for Personalization and Search


Social media posts are a rich source for life summaries that aggregate the activities, events, interactions and thoughts of the last months or years. Such summaries can be used for personal reminiscence as well as for keeping track of developments in the lives of not-so-close friends. One of the core challenges in creating them automatically is to decide which posts are memorable, i.e., which should be considered for retention and which can be forgotten. To address this challenge, we design and conduct user evaluation studies and construct a corpus that captures human expectations towards content retention. We analyze this corpus to identify a small set of seed features that are most likely to characterize memorable posts. Next, we compile a broader set of features and leverage them to build general and personalized machine-learning models that rank posts for retention. By applying feature selection, we identify a compact yet effective subset of these features. The models trained with the presented feature sets outperform baseline models that exploit an intuitive set of temporal and social features.


Keywords: Learning to rank, Letor, Social media, Personalization, Personalized ranking, Content retention, Social features, Feature selection, Facebook



I.S. Altingovde is supported by the Turkish Academy of Sciences Distinguished Young Scientist Award (TUBA-GEBIP 2016). This work was partially funded by the DFG Project “Managed Forgetting” (Contract Number NI-1760/1-1).



Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  • Kaweh Djafari Naini (1)
  • Ricardo Kawase (1)
  • Nattiya Kanhabua (2)
  • Claudia Niederée (1)
  • Ismail Sengor Altingovde (3, corresponding author)

  1. L3S Research Center, Hannover, Germany
  2. NTENT Inc., Barcelona, Spain
  3. Computer Engineering Department, Middle East Technical University, Ankara, Turkey
