
Static tainting extraction approach based on information flow graph for personally identifiable information


Personally identifiable information (PII) is widely used in applications such as network privacy leak detection, network forensics, and user portraits. Internet service providers (ISPs) and administrators are usually concerned with whether PII has been extracted during network transmission. However, most studies have focused on extraction occurring on the client side or the server side. This study proposes a static tainting extraction approach that automatically extracts PII from large-scale network traffic without requiring any manual work or feedback on the ISP-level network traffic. The proposed approach does not deploy any additional applications on the client side. The information flow graph is drawn via a tainting process that involves two steps: inter-domain routing and intra-domain infection, the latter of which contains a constraint function (CF) to limit "over-tainting". Compared with the existing semantic-based approach that uses network traffic from the ISP, the proposed approach performs better, with 92.37% precision and 94.04% recall. Furthermore, three methods that reduce the computing time and the memory overhead are presented herein. The number of rounds is reduced to 0.0883%, and the execution time overhead is reduced to 0.0153%, of those of the original approach.
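The abstract names the two tainting steps but does not spell them out, so the following is only an illustrative sketch of how such a process might look, not the authors' algorithm. The record format `(user, domain, key, value)`, the function names `taint` and `constraint`, and the specific constraint rule (accept a value only if at most `max_users` distinct users share it) are all assumptions made for this example.

```python
from collections import defaultdict


def constraint(value, users_by_value, max_users=1):
    """Assumed CF: accept a value only if few distinct users share it."""
    return len(users_by_value[value]) <= max_users


def taint(records, seeds, max_rounds=10):
    """Propagate taint over hypothetical (user, domain, key, value) records."""
    users_by_value = defaultdict(set)
    for user, _domain, _key, value in records:
        users_by_value[value].add(user)

    tainted_values = set(seeds)
    tainted_keys = set()  # (domain, key) pairs known to carry PII

    for _ in range(max_rounds):
        changed = False
        # Step 1 (assumed "inter-domain routing"): a tainted value seen
        # under any domain marks the (domain, key) pair that carried it.
        for _user, domain, key, value in records:
            if value in tainted_values and (domain, key) not in tainted_keys:
                tainted_keys.add((domain, key))
                changed = True
        # Step 2 (assumed "intra-domain infection"): other values carried by
        # a tainted (domain, key) become tainted, but only if the CF accepts
        # them, which limits over-tainting.
        for _user, domain, key, value in records:
            if (domain, key) in tainted_keys and value not in tainted_values:
                if constraint(value, users_by_value):
                    tainted_values.add(value)
                    changed = True
        if not changed:
            break
    return tainted_values, tainted_keys


# Toy traffic; all names and values are made up for illustration.
records = [
    ("u1", "a.example", "email", "alice@example.org"),
    ("u1", "b.example", "uid", "alice@example.org"),
    ("u2", "b.example", "uid", "bob@example.org"),
    ("u1", "b.example", "uid", "guest"),
    ("u2", "b.example", "uid", "guest"),
    ("u1", "b.example", "lang", "en"),
    ("u2", "b.example", "lang", "en"),
]
values, keys = taint(records, seeds={"alice@example.org"})
```

In this toy run, the seed value taints the `uid` key on `b.example`, which in turn taints `bob@example.org`. The value `guest` travels under the same key but is shared by two users, so the assumed constraint rejects it. A real CF would presumably use richer criteria than a per-value user count.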





This work was supported by the National Natural Science Foundation of China (Grant Nos. 61672101, U1636119, 61866038, 61962059).

Author information

Correspondence to Tian Song.


Cite this article

Liu, Y., Liao, L. & Song, T. Static tainting extraction approach based on information flow graph for personally identifiable information. Sci. China Inf. Sci. 63, 132104 (2020).



  • personally identifiable information
  • network privacy leak detection
  • static tainting
  • network traffic analysis
  • information flow graph
  • inter-domain routing
  • intra-domain infection
  • constraint function