Research on Key Technology of Distributed Indexing and Retrieval System Based on Lucene

  • Conference paper
Computational Intelligence and Intelligent Systems (ISICA 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 874)

Abstract

Taking Chinese as the target language, this paper analyzes current Chinese word segmentation algorithms and the Lucene relevance ranking algorithm, and proposes an improved word segmentation algorithm and an improved relevance ranking algorithm built on the Lucene full-text search toolkit. It also applies distributed storage, parallel computing, inverted indexing, and retrieval techniques to analyze and design a search engine for networked digital information, giving users fast and accurate search over massive volumes of digital information. The experiments compare several word segmentation algorithms by segmentation speed, and by the response time, number of hits, accuracy, and recall of their keyword search results. The experimental results show that the system greatly improves search speed while preserving the accuracy of search results.
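As a rough illustration of two ideas the abstract names, inverted indexing and the evaluation of search results by accuracy and recall, the following minimal Python sketch may help. It is not the paper's implementation and does not use Lucene: the function names, the toy documents, and the plain TF-IDF score are all assumptions made for illustration, documents are assumed to be already segmented into tokens, and the abstract's "accuracy" is read here as precision.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency}.

    `docs` is a hypothetical dict of doc_id -> list of pre-segmented tokens.
    """
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index


def search(index, num_docs, query_terms):
    """Rank documents by a simple TF-IDF sum (illustrative, not Lucene's formula)."""
    scores = defaultdict(float)
    for term in query_terms:
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any document
        idf = math.log(num_docs / len(postings)) + 1.0
        for doc_id, tf in postings.items():
            scores[doc_id] += math.sqrt(tf) * idf
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy corpus (made up for this sketch).
docs = {
    1: ["distributed", "index", "lucene"],
    2: ["word", "segmentation", "chinese"],
    3: ["lucene", "ranking", "lucene"],
}
index = build_inverted_index(docs)
results = search(index, len(docs), ["lucene", "ranking"])
precision, recall = precision_recall(results, relevant={3})
```

With this toy corpus, document 3 ranks ahead of document 1 because it matches both query terms; since only document 3 is marked relevant, precision is 0.5 and recall is 1.0.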

Author information

Corresponding author

Correspondence to Rongrong Li.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, R. (2018). Research on Key Technology of Distributed Indexing and Retrieval System Based on Lucene. In: Li, K., Li, W., Chen, Z., Liu, Y. (eds) Computational Intelligence and Intelligent Systems. ISICA 2017. Communications in Computer and Information Science, vol 874. Springer, Singapore. https://doi.org/10.1007/978-981-13-1651-7_23

  • DOI: https://doi.org/10.1007/978-981-13-1651-7_23

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1650-0

  • Online ISBN: 978-981-13-1651-7

  • eBook Packages: Computer Science, Computer Science (R0)
