
PrivacySearch: An End-User and Query Generalization Tool for Privacy Enhancement in Web Search

  • Francisco-Javier Rodrigo-Ginés
  • Javier Parra-Arnau
  • Weizhi Meng
  • Yu Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11058)

Abstract

Web search engines capitalize on, or lend themselves to, the construction of user interest profiles to provide personalized search results. The lack of transparency about what information is stored, how it is used, and with whom it is shared limits the perception of privacy that users have of the search service. In this paper, we investigate a technology that allows users to replace specific queries with more general but semantically similar search terms. By generalizing queries, a user's profile becomes less precise and therefore more private, although at the expense of less accurate search results. In this work, we design and develop PrivacySearch, a tool that implements this principle in practice. Our tool, developed as a browser plug-in for Google Chrome, enables users to generalize the queries sent to a search engine automatically and in real time, without the need for any kind of infrastructure or external databases, according to simple and intuitive privacy criteria. Experimental results demonstrate the technical feasibility and suitability of our solution.
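As a rough illustration of the query-generalization principle described in the abstract, the following minimal sketch replaces each known query term with a more general ancestor in a small hand-written hierarchy. The hierarchy `TOY_HYPERNYMS`, the function names, and the `level` parameter are all assumptions made for this example; they are not taken from PrivacySearch, and the abstract does not state how the tool obtains its generalizations.

```python
# Toy hypernym map (specific term -> more general term).
# Hypothetical data for illustration only; not the knowledge source of PrivacySearch.
TOY_HYPERNYMS = {
    "chemotherapy": "therapy",
    "therapy": "medical care",
    "poodle": "dog",
    "dog": "animal",
}


def generalize_term(term: str, level: int) -> str:
    """Climb up to `level` steps in the toy hierarchy, stopping at the top."""
    current = term
    for _ in range(level):
        parent = TOY_HYPERNYMS.get(current)
        if parent is None:
            break  # no more general term is known
        current = parent
    return current


def generalize_query(query: str, level: int = 1) -> str:
    """Generalize every known word in the query; unknown words are kept as-is."""
    words = query.lower().split()
    return " ".join(generalize_term(word, level) for word in words)


if __name__ == "__main__":
    print(generalize_query("poodle grooming", level=1))
    print(generalize_query("chemotherapy side effects", level=2))
```

A higher `level` yields a more general (more private) query at the cost of less accurate results, which mirrors the privacy and accuracy trade-off stated in the abstract.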

Notes

Acknowledgments

This work was partially supported by the European Commission (projects H2020-644024 “CLARUS” and H2020-700540 “CANVAS”) and the Spanish Government (projects TIN2014-57364-C2-1-R “Smart-Glacis” and TIN2016-80250-R “Sec-MCloud”). J. Parra-Arnau is the recipient of a Juan de la Cierva postdoctoral fellowship, IJCI-2016-28239, from the Spanish Ministry of Economy and Competitiveness.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Francisco-Javier Rodrigo-Ginés (1)
  • Javier Parra-Arnau (1)
  • Weizhi Meng (2)
  • Yu Wang (3, corresponding author)
  1. Department of Computer Science and Mathematics, CYBERCAT-Center for Cybersecurity Research of Catalonia, Universitat Rovira i Virgili, Tarragona, Spain
  2. Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
  3. Department of Computer Science, Guangzhou University, Guangzhou, China
