Abstract
With the growing popularity of mobile devices, people share an increasing amount of personal data with a variety of mobile applications in exchange for personalized services. In most cases, users can learn how their data is used from the privacy policy that accompanies the application. However, current privacy policies are often too long and obscure to be readable and comprehensible to users. To address this issue, we propose an automated privacy policy extraction system that takes into account users’ personal privacy concerns under different contexts. The system is implemented on Android smartphones and evaluated in a field study using feedback from a group of users (\(n=96\)). Experiments are conducted on both our own dataset, which is, to the best of our knowledge, the first user privacy concern profile dataset, and a public dataset containing 115 privacy policies with 23K data practices. We achieve 0.94 precision for privacy category classification and 0.81 accuracy for policy segment extraction, which supports our work as a step towards meeting the transparency requirement of the General Data Protection Regulation (GDPR).
Acknowledgments
This work was supported in part by the National Science Foundation of China under Grant 71671114 and Grant 61672350, and in part by the China Scholarship Council (201806230109).
A An Example of the System Output
Suppose a user queries the privacy policy of the WeChat app, and her privacy concern profile shows that she cares about whether her personal data is secure. The system then returns the following text segment from the WeChat privacy policy:
We use a variety of security technologies and procedures for the purpose of preventing loss, misuse, unauthorised access, or disclosure of Information – for example... But no data security measures can guarantee 100% security at all times. We do not warrant or guarantee the security of WeChat or any information you provide to us through WeChat.
The related GDPR provision is also presented to the user:
...the controller and the processor shall implement appropriate technical and organisational measures to ensure a level of security appropriate to the risk, including inter alia as appropriate:...
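The lookup in this example can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's actual implementation: it assumes policy segments have already been labelled with a privacy category, and the names `PolicySegment`, `GDPR_ITEMS`, `extract_for_user`, and the category labels are hypothetical. The mapping of the security category to Article 32 follows from the regulation text quoted above, which is taken from that article; the second segment is placeholder text, not a quotation from any policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicySegment:
    text: str
    category: str  # hypothetical label, assumed to come from a classifier


# Hypothetical mapping from privacy category to a GDPR provision.
GDPR_ITEMS = {
    "data_security": "Art. 32 (Security of processing)",
}


def extract_for_user(segments, concern_profile):
    """Return (segment, GDPR item) pairs for categories the user cares about."""
    results = []
    for seg in segments:
        if seg.category in concern_profile:
            # GDPR item is None when no provision is mapped to the category.
            results.append((seg, GDPR_ITEMS.get(seg.category)))
    return results


segments = [
    PolicySegment(
        "We use a variety of security technologies and procedures ...",
        "data_security",
    ),
    PolicySegment("(placeholder segment about another topic)", "other"),
]
profile = {"data_security"}  # the user's privacy concern profile

matches = extract_for_user(segments, profile)
for seg, item in matches:
    print(seg.text, "|", item)
```

Keeping the concern profile as a plain set of category labels makes the filtering step independent of how segments are classified, so the classifier can be swapped without changing the lookup.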
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chang, C., Li, H., Zhang, Y., Du, S., Cao, H., Zhu, H. (2019). Automated and Personalized Privacy Policy Extraction Under GDPR Consideration. In: Biagioni, E., Zheng, Y., Cheng, S. (eds) Wireless Algorithms, Systems, and Applications. WASA 2019. Lecture Notes in Computer Science, vol 11604. Springer, Cham. https://doi.org/10.1007/978-3-030-23597-0_4
DOI: https://doi.org/10.1007/978-3-030-23597-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23596-3
Online ISBN: 978-3-030-23597-0
eBook Packages: Computer Science, Computer Science (R0)