You Are What You Search: Attribute Inference Attacks Through Web Search Queries

  • Tianyu Du
  • Tao Tao
  • Bijing Liu
  • Xueqi Jin
  • Jinfeng Li
  • Shouling Ji
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 895)


Most, if not all, existing attribute inference attacks leverage users’ social friendship information and/or social behavioral information to infer the attributes of a target user. In this paper, we study the problem from a novel angle. Specifically, we study whether a user’s private attributes (e.g., age, gender, and education) can be inferred from his or her query history, which is, to the best of our knowledge, the first such attempt. We present a thorough description of our query-based attribute inference attack and experimentally evaluate it on a real-world dataset provided by Sogou. Experimental results show that our method achieves, on average, 70.21% precision, 68.82% recall, and a 69.50% F1-score. When predicting users’ gender, the proposed method attains a precision of 84.56%. This suggests that query records do disclose a significant amount of information about users.
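For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, recall, and F1-score can be computed per class and then averaged. It is not the authors' evaluation code: the abstract does not state which averaging scheme was used, so macro-averaging over classes is an assumption here, and the function names and the toy gender labels are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, F1) with `positive` treated as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(y_true, y_pred):
    """Average per-class precision, recall, and F1 with equal class weights (assumed scheme)."""
    classes = sorted(set(y_true))
    scores = [precision_recall_f1(y_true, y_pred, c) for c in classes]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


# Hypothetical toy labels, for illustration only (not data from the paper).
y_true = ["F", "F", "M", "M", "M", "F"]
y_pred = ["F", "M", "M", "M", "F", "F"]

avg_precision, avg_recall, avg_f1 = macro_average(y_true, y_pred)
print(f"precision={avg_precision:.4f} recall={avg_recall:.4f} f1={avg_f1:.4f}")
```

Because an averaged F1-score is the mean of per-class (or per-attribute) F1 values, it need not equal the harmonic mean of the averaged precision and averaged recall, although the two are usually close.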


Keywords: Attribute inference, Privacy, Query classification, Ensemble learning



This work was partly supported by NSFC under No. 61772466, the Zhejiang Provincial Natural Science Foundation for Distinguished Young Scholars under No. R19F020013, the Technology Project of State Grid Zhejiang Electric Power Co., LTD. under No. 5211HZ17000J, the Provincial Key Research and Development Program of Zhejiang, China under No. 2017C01055, the Fundamental Research Funds for the Central Universities, and the Alibaba-ZJU Joint Research Institute of Frontier Technologies.



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Tianyu Du (1)
  • Tao Tao (2)
  • Bijing Liu (3)
  • Xueqi Jin (4)
  • Jinfeng Li (1)
  • Shouling Ji (1, 5)

  1. Zhejiang University, Hangzhou, China
  2. State Grid Hangzhou Power Supply Company, Hangzhou, China
  3. NARI Group Corporation, Beijing, China
  4. State Grid Zhejiang Electric Power Co., LTD., Hangzhou, China
  5. Alibaba-Zhejiang University Joint Research Institute of Frontier Technologies, Hangzhou, China
