Abstract
The healthcare services industry has been transformed by the rise of the Internet of Things (IoT). IoT in healthcare services comprises a large number of interconnected sensors and medical devices that generate and exchange sensitive information. As a result, an enormous amount of data is transmitted through the network, which raises serious concerns about the privacy of patient information. Privacy-preserving data collection (PPDC) is therefore in demand to protect patient data. Several PPDC schemes have been proposed recently; however, existing schemes fall short of privacy requirements and are prone to various privacy attacks. In this paper, we propose a novel privacy-preserving data collection scheme for IoT-based healthcare services systems. A clustering-based anonymity model is used to develop an efficient privacy-preserving scheme that meets privacy requirements and protects healthcare IoT against various privacy attacks. We formulate the threat model as client-server-to-user to ensure privacy on both ends. On the client side, a modified clustering-based k-anonymity model with α-deassociation is used to anonymize the data generated by the IoT nodes. Base-level privacy is then ensured through a bottom-up clustering method that generates clusters of records according to the privacy requirements. On the server side, the UPGMA cluster-combination method is used to reduce communication costs and to achieve a better level of privacy. The proposed scheme is efficient in tackling privacy attacks such as attribute disclosure, identity disclosure, membership disclosure, sensitivity attacks, similarity attacks, and skewness attacks. The effectiveness and efficiency of the proposed scheme are demonstrated through theoretical and experimental analyses.
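As a rough illustration of the client-side privacy requirement, the sketch below checks whether a set of clusters satisfies the standard (α, k)-anonymity condition: every cluster holds at least k records, and no sensitive value accounts for more than a fraction α of any cluster. This is the textbook definition, not the paper's modified model, whose details the abstract does not give. The function name `satisfies_alpha_k` and the sample records are hypothetical.

```python
from collections import Counter


def satisfies_alpha_k(clusters, k, alpha):
    """Check the standard (alpha, k)-anonymity condition on a clustering.

    clusters: list of clusters; each cluster is a list of
              (quasi_identifiers, sensitive_value) pairs.
    Returns True only if every cluster has at least k records and no
    sensitive value exceeds a fraction alpha of its cluster.
    """
    for cluster in clusters:
        # k-anonymity: each record must hide among at least k records.
        if len(cluster) < k:
            return False
        # alpha-deassociation: cap the share of any one sensitive value.
        counts = Counter(sensitive for _, sensitive in cluster)
        if max(counts.values()) / len(cluster) > alpha:
            return False
    return True


# Hypothetical records: ((age, sex), diagnosis)
records_ok = [
    [((34, "F"), "flu"), ((36, "F"), "asthma"),
     ((35, "M"), "diabetes"), ((33, "M"), "flu")],
    [((61, "M"), "asthma"), ((64, "F"), "flu"), ((60, "F"), "diabetes")],
]
records_skewed = [
    [((34, "F"), "flu"), ((36, "F"), "flu"), ((35, "M"), "flu")],
]

print(satisfies_alpha_k(records_ok, k=3, alpha=0.5))
print(satisfies_alpha_k(records_skewed, k=3, alpha=0.5))
```

The second clustering is large enough for k = 3 but fails the α bound because every record shares one diagnosis, which is the kind of attribute disclosure that k-anonymity alone does not prevent.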
This article is part of the Topical Collection: Special Issue on Privacy-Preserving Computing
Guest Editors: Kaiping Xue, Zhe Liu, Haojin Zhu, Miao Pan and David S.L. Wei
Cite this article
Onesimu, J.A., Karthikeyan, J. & Sei, Y. An efficient clustering-based anonymization scheme for privacy-preserving data collection in IoT based healthcare services. Peer-to-Peer Netw. Appl. 14, 1629–1649 (2021). https://doi.org/10.1007/s12083-021-01077-7
DOI: https://doi.org/10.1007/s12083-021-01077-7