A Multi-pattern Matching Algorithm for Chinese-Hmong Mixed Strings

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

To support fast retrieval in Chinese-Hmong mixed text, a multi-pattern matching algorithm for Chinese-Hmong mixed strings is proposed. It operates on double-byte units and combines the idea of the Aho-Corasick (AC) algorithm with the mismatch-handling strategy of the Horspool algorithm. A deterministic finite automaton is constructed from the pattern set, following the AC algorithm, and the shift distance of the patterns is computed with the bad-character rule of the Horspool algorithm. Using this automaton, the text is traversed only once to find all patterns. Experimental results show that the proposed algorithm performs well in multi-pattern matching on Chinese-Hmong mixed texts of different scales; even for mixed texts containing more than 100,000 characters, its matching efficiency is significantly higher than that of the AC algorithm.
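The abstract does not give the construction in detail, so the following is only a rough illustration of the general idea of pairing a pattern-set automaton with a Horspool-style bad-character shift. It is a minimal Python sketch of the textbook Set Horspool technique, not the authors' algorithm. The function names (`build_tables`, `find_all`) are hypothetical. The sketch also assumes that each character is a single element of a Python `str`, whereas the paper works on double-byte units, and it uses ordinary Chinese characters as stand-ins for the mixed text. Unlike the single traversal described in the abstract, this simplified variant may re-read some characters.

```python
def build_tables(patterns):
    """Build a trie of the reversed patterns and a bad-character shift table."""
    if not patterns or any(len(p) == 0 for p in patterns):
        raise ValueError("patterns must be a non-empty list of non-empty strings")
    lmin = min(len(p) for p in patterns)

    # Trie over reversed patterns; the key None marks the end of a pattern.
    root = {}
    for p in patterns:
        node = root
        for ch in reversed(p):
            node = node.setdefault(ch, {})
        node[None] = p

    # Horspool bad-character rule generalised to a pattern set: for every
    # character that is not the last one of its pattern, keep the smallest
    # distance to that pattern's end, never exceeding lmin.
    shift = {}
    for p in patterns:
        for j, ch in enumerate(p[:-1]):
            shift[ch] = min(shift.get(ch, lmin), len(p) - 1 - j)
    return root, shift, lmin


def find_all(text, patterns):
    """Return (start_index, pattern) for every occurrence, ordered by end position."""
    root, shift, lmin = build_tables(patterns)
    hits = []
    pos = lmin - 1  # index of the last character of the current window
    while pos < len(text):
        # Walk backwards from pos through the reversed-pattern trie.
        node, j = root, pos
        while j >= 0 and text[j] in node:
            node = node[text[j]]
            if None in node:
                hits.append((j, node[None]))
            j -= 1
        # Shift the window according to the character under its last position.
        pos += shift.get(text[pos], lmin)
    return hits


if __name__ == "__main__":
    for start, pat in find_all("我爱苗文和汉字苗文", ["苗文", "汉字", "和汉"]):
        print(start, pat)
```

In this sketch the shift is capped by the length of the shortest pattern, so a pattern set that contains one very short pattern loses most of the skipping benefit.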

Acknowledgments

This work was supported by the Natural Science Foundation of Hunan Province (No. 2019JJ40234), the National Natural Science Foundation of China (No. 61462029), the Research Study and Innovative Experimental Project for College Students in Hunan Province (No. 20180599), and the Research Study and Innovative Experimental Project for College Students in Jishou University (No. JDCX20180122).

Author information

Corresponding author

Correspondence to Li-Ping Mo.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

He, SP., Mo, LP., Kang, DW. (2019). A Multi-pattern Matching Algorithm for Chinese-Hmong Mixed Strings. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_37

  • DOI: https://doi.org/10.1007/978-3-030-32236-6_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32235-9

  • Online ISBN: 978-3-030-32236-6

  • eBook Packages: Computer Science, Computer Science (R0)
