Research on the Text Length’s Effect of the Text Similarity Measurement

  • Conference paper
Information and Automation (ISIA 2010)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 86)


Abstract

Similarity measurement plays a fundamental role in the classification of information resources and the transmission of network information. Building on research into a text similarity algorithm based on a three-layer structure, this paper adds word difference factors to the original text similarity measure, thereby reducing the measurement error caused by differences in semantics and wording. The results demonstrate that, compared with the similarity measurement method based on the original three-layer structure, the improved algorithm can achieve higher measurement accuracy.
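The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea of scaling a base similarity score by a word difference factor. Everything in it is an assumption made for illustration: the base score is plain bag-of-words cosine similarity (not the paper's three-layer method), and `word_difference_factor` and `adjusted_similarity` are hypothetical names with a hypothetical penalty based on the difference in word counts.

```python
import math
from collections import Counter


def cosine_similarity(words_a, words_b):
    """Cosine similarity between two bag-of-words count vectors."""
    counts_a, counts_b = Counter(words_a), Counter(words_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def word_difference_factor(words_a, words_b):
    """Hypothetical factor in [0, 1]: 1.0 for equal word counts,
    shrinking as the two texts diverge in length (an assumption,
    not the factor defined in the paper)."""
    longest = max(len(words_a), len(words_b))
    if longest == 0:
        return 1.0
    return 1.0 - abs(len(words_a) - len(words_b)) / longest


def adjusted_similarity(text_a, text_b):
    """Base similarity scaled by the hypothetical word difference factor."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    base = cosine_similarity(words_a, words_b)
    return base * word_difference_factor(words_a, words_b)
```

In this sketch, two texts with the same word distribution but very different lengths score as identical under cosine similarity alone, while the adjusted score drops in proportion to the length gap. How the paper actually defines and weights its factors cannot be recovered from the abstract.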

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Niu, Y., Chen, Y. (2011). Research on the Text Length’s Effect of the Text Similarity Measurement. In: Qi, L. (eds) Information and Automation. ISIA 2010. Communications in Computer and Information Science, vol 86. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19853-3_16

  • DOI: https://doi.org/10.1007/978-3-642-19853-3_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19852-6

  • Online ISBN: 978-3-642-19853-3

  • eBook Packages: Computer Science, Computer Science (R0)