Image Based Mail Piece Identification Using Unsupervised Learning

  • Conference paper

Abstract

Next-generation postal sorting machines extract a mail piece address once and reuse it in later sorting steps, using the mail piece image as the key. Because each mail piece is unique, characteristics derived from its image make it possible to assign the stored address to the right piece. During the first sorting step, the mail piece characteristics are extracted and stored in a database together with the target address. In subsequent sorting steps, the address is retrieved by finding the corresponding mail piece characteristics in the database. Appropriate mail piece image characteristics and procedures for measuring their distance were presented in previous work.
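The store-then-look-up scheme can be illustrated with a minimal sketch. It assumes that each mail piece is reduced to a fixed-length numeric feature vector and that Euclidean distance is an adequate measure; both are placeholders, since the actual characteristics and distance procedures come from the authors' previous work and are not given here. The names `FingerprintStore`, `enroll`, and `lookup`, as well as the sample addresses, are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class FingerprintStore:
    """Hypothetical in-memory stand-in for the characteristics database."""

    def __init__(self):
        self._entries = []  # list of (features, address) pairs

    def enroll(self, features, address):
        """First sorting step: store the characteristics with the target address."""
        self._entries.append((tuple(features), address))

    def lookup(self, features):
        """Later sorting step: return (address, distance) of the closest stored piece."""
        if not self._entries:
            return None
        best = min(self._entries, key=lambda entry: euclidean(entry[0], features))
        return best[1], euclidean(best[0], features)


store = FingerprintStore()
store.enroll([0.10, 0.80, 0.30], "12 Example Street")
store.enroll([0.90, 0.20, 0.50], "7 Sample Road")

# A slightly perturbed re-scan of the first mail piece.
address, distance = store.lookup([0.12, 0.79, 0.31])
```

A plain nearest-neighbour lookup of this kind always returns some address, even for a mail piece that was never enrolled, which is why a rejection criterion for unknown mail pieces is needed.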

Image-based mail piece identification is challenging because the mail spectrum changes constantly and non-deterministically, and because nearly identical bulk mail pieces must be told apart.

In particular, rejecting unknown mail pieces requires carefully chosen rejection classes that depend on the current mail spectrum. In this paper we present an approach to distance-based mail piece identification that uses a two-stage classification process. Bulk and private mail are handled individually by an unsupervised learning process that clusters similar mail piece characteristics. From these clusters, specific rejection classes can be estimated within each cluster. The first step in the identification process determines the corresponding cluster for a given mail piece. Using the cluster-specific rejection classes, the mail piece is then either identified or rejected. Experimental results obtained on real-world data sets demonstrate the applicability of the proposed method.
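The two-stage process can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: a greedy one-pass "leader" clustering stands in for the unsupervised learning step, a per-cluster distance threshold (half the smallest pairwise distance inside the cluster) stands in for the cluster-specific rejection classes, and two-dimensional feature vectors with Euclidean distance stand in for the real characteristics. All function names and values are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(cluster):
    """Mean feature vector of a cluster of (features, address) pairs."""
    vectors = [features for features, _ in cluster]
    return tuple(sum(v[i] for v in vectors) / len(vectors)
                 for i in range(len(vectors[0])))


def leader_clustering(items, radius):
    """Greedy one-pass clustering: join the nearest cluster whose centroid
    lies within `radius`, otherwise start a new cluster (assumed stand-in)."""
    clusters = []
    for features, address in items:
        best = None
        for cluster in clusters:
            d = dist(centroid(cluster), features)
            if d <= radius and (best is None or d < best[0]):
                best = (d, cluster)
        if best is None:
            clusters.append([(features, address)])
        else:
            best[1].append((features, address))
    return clusters


def rejection_threshold(cluster, default):
    """Assumed heuristic: half the smallest pairwise distance in the cluster,
    so tightly packed bulk mail gets a strict threshold. Singleton clusters
    fall back to `default`."""
    distances = [dist(a[0], b[0])
                 for i, a in enumerate(cluster) for b in cluster[i + 1:]]
    return min(distances) / 2 if distances else default


def identify(query, clusters, default_threshold):
    """Stage 1: pick the cluster with the nearest centroid.
    Stage 2: accept the nearest member only if it is within that
    cluster's rejection threshold; otherwise reject (return None)."""
    cluster = min(clusters, key=lambda c: dist(centroid(c), query))
    features, address = min(cluster, key=lambda m: dist(m[0], query))
    if dist(features, query) <= rejection_threshold(cluster, default_threshold):
        return address
    return None


enrolled = [
    ((0.50, 0.50), "A"), ((0.50, 0.54), "B"), ((0.54, 0.50), "C"),  # bulk-like
    ((0.10, 0.90), "D"), ((0.90, 0.10), "E"),                       # private-like
]
clusters = leader_clustering(enrolled, radius=0.1)

known_bulk = identify((0.50, 0.535), clusters, default_threshold=0.05)
unknown_bulk = identify((0.52, 0.52), clusters, default_threshold=0.05)
known_private = identify((0.11, 0.89), clusters, default_threshold=0.05)
```

In this toy data the three near-identical pieces form one cluster with a strict threshold, so a query lying between them is rejected rather than assigned to an arbitrary neighbour, while the isolated private-like pieces use the looser default threshold.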

Correspondence to Katja Worm.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Worm, K., Meffert, B. (2009). Image Based Mail Piece Identification Using Unsupervised Learning. In: Fink, A., Lausen, B., Seidel, W., Ultsch, A. (eds) Advances in Data Analysis, Data Handling and Business Intelligence. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01044-6_35
