Text Extraction for Spam-Mail Image Filtering Using a Text Color Estimation Technique

  • Conference paper
New Trends in Applied Artificial Intelligence (IEA/AIE 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4570)

Abstract

In this paper, we propose an algorithm for extracting text regions from images in spam-mail. The Color Layer-Based Text Extraction (CLTE) algorithm divides the input image into eight planes, each treated as a color layer. It extracts connected components on each of the eight planes and then classifies them as either text or non-text regions. We also propose an algorithm to recover damaged text strokes in Korean text images. Two types of damage are considered: (1) middle strokes such as ‘⌉’ or ‘—’ are deleted, and (2) the first and last strokes such as ‘∘’ or ‘□’ are filled with black pixels. An experiment with 200 spam-mail images shows that the proposed approach is more than 10% more accurate than conventional methods.
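The layer-splitting and connected-component steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the eight planes are formed, so this example assumes one plausible reading: each of the R, G, and B channels is thresholded to one bit, which yields 2^3 = 8 color layers. The function names, the threshold of 128, the 4-connectivity, and the size and aspect-ratio rule in `looks_like_text` are all hypothetical choices for illustration, not the authors' CLTE implementation.

```python
from collections import deque


def split_into_color_layers(image, threshold=128):
    """Assign every pixel to one of eight binary layers.

    `image` is a list of rows; each row is a list of (r, g, b) tuples.
    ASSUMPTION: a layer index is built from one thresholded bit per channel.
    """
    height, width = len(image), len(image[0])
    layers = [[[0] * width for _ in range(height)] for _ in range(8)]
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            index = ((r >= threshold) << 2) | ((g >= threshold) << 1) | (b >= threshold)
            layers[index][y][x] = 1
    return layers


def connected_components(mask):
    """Find 4-connected components in a binary mask with a breadth-first search.

    Returns one dict per component with its bounding box (x0, y0, x1, y1)
    and pixel count, in raster-scan order of each component's first pixel.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            count = 0
            while queue:
                cx, cy = queue.popleft()
                count += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < width and 0 <= ny < height and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            components.append({"bbox": (x0, y0, x1, y1), "pixels": count})
    return components


def looks_like_text(component, min_pixels=2, max_aspect=10.0):
    """Placeholder text/non-text rule; both thresholds are hypothetical."""
    x0, y0, x1, y1 = component["bbox"]
    width, height = x1 - x0 + 1, y1 - y0 + 1
    aspect = max(width, height) / min(width, height)
    return component["pixels"] >= min_pixels and aspect <= max_aspect
```

In this sketch, a caller would run `connected_components` on each of the eight masks returned by `split_into_color_layers` and keep the components for which `looks_like_text` is true. A real classifier would need features tuned on spam-mail images, which the abstract does not specify.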

Editor information

Hiroshi G. Okuno, Moonis Ali

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Kim, JS., Kim, S.H., Yang, H.J., Son, H.J., Kim, W.P. (2007). Text Extraction for Spam-Mail Image Filtering Using a Text Color Estimation Technique. In: Okuno, H.G., Ali, M. (eds) New Trends in Applied Artificial Intelligence. IEA/AIE 2007. Lecture Notes in Computer Science, vol 4570. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73325-6_11

  • DOI: https://doi.org/10.1007/978-3-540-73325-6_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73322-5

  • Online ISBN: 978-3-540-73325-6

  • eBook Packages: Computer Science, Computer Science (R0)
