Soft Computing, Volume 22, Issue 9, pp 3111–3121

Improving file locality in multi-keyword top-k search based on clustering

  • Lanxiang Chen
  • Nan Zhang
  • Kuan-Ching Li
  • Shuibing He
  • Linbing Qiu
Methodologies and Application

Abstract

A fast-growing number of users and businesses are motivated to outsource their private data to public cloud servers. For security reasons, private data should be encrypted before being outsourced to remote servers, but encryption makes traditional plaintext keyword search difficult. There is therefore an urgent need for efficient and secure searchable encryption. In this paper, a clustering method that combines affinity propagation (AP) with K-means (CAK-means) is proposed to realize fast searchable encryption in Big Data environments. CAK-means uses affinity propagation to initialize K-means, which makes the clustering process faster and more stable and improves the quality of the initial K-means cluster centers. Because the AP algorithm identifies cluster centers with much lower error than other methods, it significantly improves search accuracy. In addition, the related files of one cluster are stored contiguously on disk, which substantially improves file locality and speeds up disk read and write I/O. The coordinate matching measure is used to support accurate ranking of search results. Experimental results show that the proposed CAK-means-based multi-keyword ranked searchable encryption scheme (MRSE-CAK) achieves higher search efficiency and accuracy while providing equivalent security.
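As a rough illustration of two ideas named in the abstract, the sketch below ranks files by the matching measure (commonly called coordinate matching: the number of query keywords a file contains) and orders files so that members of the same cluster are adjacent. It is a plaintext toy under stated assumptions, not the MRSE-CAK scheme itself: the real scheme operates on encrypted indexes, the cluster labels here are hand-assigned rather than produced by CAK-means, and all function names, file ids and keywords are hypothetical.

```python
def coordinate_matching_score(doc_keywords, query_keywords):
    """Count how many query keywords also occur in the document."""
    return len(set(doc_keywords) & set(query_keywords))


def top_k(index, query_keywords, k):
    """Return the k highest-scoring (file_id, score) pairs.

    Ties are broken by file id so the result is deterministic.
    """
    scored = [(fid, coordinate_matching_score(kws, query_keywords))
              for fid, kws in index.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


def contiguous_layout(cluster_of):
    """Order file ids so that files of the same cluster sit next to each other."""
    return sorted(cluster_of, key=lambda fid: (cluster_of[fid], fid))


# Hypothetical plaintext keyword index: file id -> set of keywords.
index = {
    "f1": {"cloud", "encryption", "search"},
    "f2": {"cloud", "storage"},
    "f3": {"clustering", "k-means"},
    "f4": {"encryption", "search", "ranking", "cloud"},
}
query = {"cloud", "encryption", "ranking"}

# Hypothetical cluster labels; in the paper these would come from CAK-means.
cluster_of = {"f1": 0, "f2": 1, "f3": 1, "f4": 0}

print(top_k(index, query, 2))         # [('f4', 3), ('f1', 2)]
print(contiguous_layout(cluster_of))  # ['f1', 'f4', 'f2', 'f3']
```

In this toy layout the two best matches, f4 and f1, belong to the same cluster and end up adjacent, which is the locality effect the abstract attributes to storing clustered files contiguously.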

Keywords

  • Searchable symmetric encryption
  • CAK-means clustering
  • File locality
  • Multi-keyword
  • Ranked search

Notes

Acknowledgements

This work was supported by the Natural Science Foundation of China (Nos. 61602118, 61572010 and 61472074), Fujian Normal University Innovative Research Team (No. IRTL1207), Natural Science Foundation of Fujian Province (Nos. 2015J01240, 2017J01738), Science and Technology Projects of Educational Office of Fujian Province (No. JK2014009), and Fuzhou Science and Technology Plan Project (No. 2014-G-80).

Compliance with ethical standards

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.


Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Lanxiang Chen (1)
  • Nan Zhang (1)
  • Kuan-Ching Li (2)
  • Shuibing He (3)
  • Linbing Qiu (1)

  1. Fujian Provincial Key Laboratory of Network Security and Cryptology, College of Mathematics and Informatics, Fujian Normal University, Fuzhou, China
  2. Department of Computer Science and Information Engineering (CSIE), Providence University, Taichung, Taiwan
  3. Computer School, Wuhan University, Wuhan, China
