Automated and Personalized Privacy Policy Extraction Under GDPR Consideration

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11604)

Abstract

With the growing popularity of mobile devices, people share an increasing amount of personal data with a variety of mobile applications in exchange for personalized services. In most cases, users can learn how their data is used from the privacy policy that accompanies the application. However, current privacy policies are generally too long and obscure to be readable and comprehensible to users. To address this issue, we propose an automated privacy policy extraction system that takes into account users’ personal privacy concerns under different contexts. The system is implemented on Android smartphones and evaluated in a field study using feedback from a group of users (\(n=96\)). Experiments are conducted on both our dataset, which is, to the best of our knowledge, the first user privacy concern profile dataset, and a public dataset containing 115 privacy policies with 23K data practices. We achieve 0.94 precision for privacy category classification and 0.81 accuracy for policy segment extraction, which attests to the significance of our work as a step towards meeting the transparency requirement of the General Data Protection Regulation (GDPR).

Notes

  1. https://youtu.be/-0x-HQRnYwQ

  2. https://www.wjx.top/jq/33235531.aspx

  3. https://keras.io

  4. https://scikit-learn.org/

Acknowledgments

This work was supported in part by the National Science Foundation of China under Grants 71671114 and 61672350, and in part by the China Scholarship Council (201806230109).

Author information

Corresponding author

Correspondence to Haojin Zhu.

A An Example of the System Output

Suppose a user queries the privacy policy of the WeChat app, and her privacy concern profile indicates that she cares about whether her personal data is secure. The system then returns the following text segment from the WeChat privacy policy:

We use a variety of security technologies and procedures for the purpose of preventing loss, misuse, unauthorised access, or disclosure of Information – for example... But no data security measures can guarantee 100% security at all times. We do not warrant or guarantee the security of WeChat or any information you provide to us through WeChat.

The related regulation items in the GDPR are also presented to the user:

...the controller and the processor shall implement appropriate technical and organisational measures to ensure a level of security appropriate to the risk, including inter alia as appropriate:...
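The lookup in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system described in the paper: the `Segment` type, the category names, the score threshold, and the `GDPR_ITEMS` table are all hypothetical, and the category labels are assumed to come from an upstream classifier that is not shown. The one grounded detail is that the regulation text quoted above is from Article 32 of the GDPR.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """A policy text segment with a category assigned by an upstream classifier."""
    text: str
    category: str  # hypothetical category label, e.g. "data_security"


# Hypothetical mapping from privacy category to a related GDPR item.
GDPR_ITEMS: Dict[str, str] = {
    "data_security": "Art. 32 GDPR (Security of processing)",
}


def extract_for_user(
    segments: List[Segment],
    concern_profile: Dict[str, float],
    threshold: float = 0.5,  # assumed cut-off for "the user cares about this"
) -> List[Tuple[Segment, Optional[str]]]:
    """Return the segments whose category the user is concerned about,
    each paired with the related GDPR item (None if no mapping exists)."""
    wanted = {cat for cat, score in concern_profile.items() if score >= threshold}
    return [(s, GDPR_ITEMS.get(s.category)) for s in segments if s.category in wanted]


if __name__ == "__main__":
    policy = [
        Segment("We use a variety of security technologies and procedures ...",
                "data_security"),
        Segment("We may share your information with our affiliates ...",
                "third_party_sharing"),
    ]
    profile = {"data_security": 0.9, "third_party_sharing": 0.1}
    for segment, item in extract_for_user(policy, profile):
        print(segment.text)
        print(item)
```

With the sample profile above, only the security-related segment is returned, together with its associated GDPR item. A real system would also need to rank segments within a category and handle contexts, which this sketch leaves out.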

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Chang, C., Li, H., Zhang, Y., Du, S., Cao, H., Zhu, H. (2019). Automated and Personalized Privacy Policy Extraction Under GDPR Consideration. In: Biagioni, E., Zheng, Y., Cheng, S. (eds) Wireless Algorithms, Systems, and Applications. WASA 2019. Lecture Notes in Computer Science, vol 11604. Springer, Cham. https://doi.org/10.1007/978-3-030-23597-0_4

  • DOI: https://doi.org/10.1007/978-3-030-23597-0_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23596-3

  • Online ISBN: 978-3-030-23597-0

  • eBook Packages: Computer Science, Computer Science (R0)
