Segmentation of Digital Images

Pavlidis, George

doi:10.1007/978-981-10-2830-4_3

Part of the book series: Signals and Communication Technology (SCT)

Abstract

Most of the world’s cultural and scientific heritage is still available in printed form. In past decades, a number of actions and projects have targeted the digitization of printed material and the disclosure of this wealth of content through Web technologies. One of the obstacles to this distribution is the lack of an efficient method for managing the large volume of data involved in archives of scanned documents. Since digital image segmentation is of paramount importance in mixed raster content compression, this chapter introduces the field of segmentation and provides a brief description of known segmentation techniques and their applications.

—We bring the interpretation process into awareness through tricks. First, we degrade the image, making interpretation difficult. Second, we provide competing organizations, making possible several conflicting interpretations of the same image. Third, we provide organization without meaning to see how past experience affects the process.

Peter Lindsay and Donald Norman

Notes

1.
The term ‘requires’ denotes the requirements relating to the modeling of the human visual system (HVS).
2.
The Fisher linear discriminant (Brown 1999) is a clustering method in which high-dimensional data are projected onto a line and clustering is applied in the resulting one-dimensional space. The projection maximizes the distance between the mean values of the classes while minimizing the variance within each class. The Fisher criterion, which is maximized over all line projections, is defined for two classes as \(J(w)=\frac{|m_1-m_2|^2}{\sigma _1^2+\sigma _2^2}\), where \(m_i\) is the mean and \(\sigma _i^2\) the variance of class i after projection onto w.
3.
Dynamic programming is a problem-solving method in which the solution path is constructed in reverse, from the required outcome back to the starting point.
4.
Region growing methods are bottom-up approaches, since they start from a single pixel and scale up to the whole image, whereas region splitting methods are top-down approaches, since they start by examining the whole image and scale down to the single pixel.
5.
The term feature images denotes images that consist only of detected features and not typical pixel values.
6.
In essence, Monte Carlo statistics support the construction of a probability density function for studying the effect of noise on the data. Monte Carlo processes can be useful for studying the characteristics of a distribution that are affected by noise, as well as the characteristics that are crucial for interpreting the data.
7.
This corresponds to a family of methodologies based on the analysis of repeated texture structures. In general, a number of matrices are calculated for a texture, and various other features are derived from them. The co-occurrence matrix is defined by a distance and an angle, and its mathematical expression is: \(C_{\theta ,d}(x,y)=|\{(m,n) \in (M \times N) \times (M \times N): d(m,n)=d,\; \tan ^{-1}(m-n)=\theta \mathrm {\;or\;} \pi - \theta \mathrm {\;and\;} f(m)=x,\; f(n)=y\}|\), where \(d(\cdot ,\cdot )\) is a distance measure, usually \(d[(a,b),(c,d)]=\max (|a-c|,|b-d|)\), and \(f:(M \times N)\mapsto \{0,\ldots ,255\}\) is the image under analysis. This definition guarantees that the matrix is symmetric.
8.
Gabor stated, but did not prove, that among all real-valued functions the Hermite functions have the smallest uncertainty product. It holds that: \(g_n(x)=H_n(x)e^{-\frac{1}{2}x^2}\), where \(H_n(x)=(-1)^n e^{x^2} \frac{\mathrm {d}^n}{\mathrm {d}x^n} e^{-x^2}\) are the Hermite polynomials.
9.
The multilayer perceptron (MLP) is the most common type of neural network. It is simple, yet it rests on a powerful mathematical basis. The input data pass through layers of neurons, with the input layer consisting of as many neurons as there are variables in the problem. The output layer has as many neurons as there are desired output variables (often just one). The intermediate layers are called hidden layers.
10.
Let S be a closed subspace and, for any \(\epsilon >0\), let \(N(\epsilon )\) be the minimum number of spheres of radius \(\le \epsilon \) necessary to cover all of S. If there exists a \({\delta }\) such that \({\delta } = -\lim _{\epsilon \rightarrow 0^{+}} \frac{\log N(\epsilon )}{\log \epsilon }\), then \({\delta }\) is called the fractal dimension of S. In other words, the fractal dimension can be calculated as the limit of the ratio of the logarithm of the change in the size of an object to the logarithm of the change in the measurement scale, as this scale tends to zero. In practice the following relation is used: \({\delta }=\log (\mathrm {number\;of\;self-similar\;pieces})/\log (\mathrm {magnification\;factor})\).
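
As an illustration of the Fisher criterion in note 2, the following is a minimal sketch, not the chapter's own implementation. It assumes two labeled sample sets are available as NumPy arrays; the function names `fisher_criterion` and `fisher_direction` are hypothetical. The direction is obtained from the standard closed form, in which w is proportional to the inverse of the summed class covariances applied to the difference of the class means.

```python
import numpy as np

def fisher_criterion(w, x1, x2):
    """J(w) = |m1 - m2|^2 / (s1^2 + s2^2), evaluated on the 1-D projections."""
    w = np.asarray(w, dtype=float)
    p1, p2 = x1 @ w, x2 @ w          # project both classes onto the line w
    return (p1.mean() - p2.mean()) ** 2 / (p1.var() + p2.var())

def fisher_direction(x1, x2):
    """Unit vector maximizing J(w) for two classes (rows are samples)."""
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    # Sum of the class covariances (population form, matching ndarray.var()).
    s_w = np.cov(x1, rowvar=False, bias=True) + np.cov(x2, rowvar=False, bias=True)
    w = np.linalg.solve(s_w, m1 - m2)
    return w / np.linalg.norm(w)
```

Because J(w) does not change when w is rescaled, the normalization in `fisher_direction` is only a convenience.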
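
The backward construction described in note 3 can be illustrated with a small sketch. The problem chosen here, a minimum-cost top-to-bottom path through a cost grid in which each step may move at most one column sideways, is an assumption made for illustration and is not taken from the chapter; `min_cost_path` is a hypothetical name.

```python
def min_cost_path(cost):
    """Return (total cost, column per row) of the cheapest top-to-bottom path."""
    rows, cols = len(cost), len(cost[0])
    # best[r][c] = cheapest cost of reaching the last row starting from (r, c).
    best = [row[:] for row in cost]
    # Work in reverse: from the row before the last one back to the first row.
    for r in range(rows - 2, -1, -1):
        for c in range(cols):
            lo, hi = max(0, c - 1), min(cols, c + 2)
            best[r][c] = cost[r][c] + min(best[r + 1][lo:hi])
    # Read the path forwards by following the cheapest continuation.
    c = min(range(cols), key=lambda j: best[0][j])
    total, path = best[0][c], [c]
    for r in range(rows - 1):
        lo, hi = max(0, c - 1), min(cols, c + 2)
        c = min(range(lo, hi), key=lambda j: best[r + 1][j])
        path.append(c)
    return total, path
```

The table is filled starting from the required outcome (the last row), and only afterwards is the path read out from the beginning.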
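
To make the bottom-up idea in note 4 concrete, here is a minimal region growing sketch. It is one simple variant under stated assumptions (4-connectivity, and a fixed tolerance `tol` around the seed's gray value), not the specific algorithm of any cited paper; `grow_region` is a hypothetical name.

```python
from collections import deque

import numpy as np

def grow_region(image, seed, tol):
    """Grow a region from `seed` (row, col); return a boolean mask."""
    image = np.asarray(image, dtype=float)
    mask = np.zeros(image.shape, dtype=bool)
    ref = image[seed]                 # gray value of the seed pixel
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours of the current pixel.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < image.shape[0] and 0 <= nc < image.shape[1]
            if inside and not mask[nr, nc] and abs(image[nr, nc] - ref) <= tol:
                mask[nr, nc] = True
                queue.append((nr, nc))
    return mask
```

A region splitting method would proceed in the opposite direction, starting from the whole image and subdividing it until each part is homogeneous.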
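
The use of Monte Carlo processes mentioned in note 6 can be sketched as repeated simulation: noise is added to the data many times, and the resulting values of a statistic form an empirical distribution. The additive Gaussian noise model and the name `monte_carlo_statistic` are assumptions made for this illustration.

```python
import numpy as np

def monte_carlo_statistic(signal, noise_sigma, statistic, trials=1000, seed=0):
    """Empirical distribution of `statistic` under additive Gaussian noise."""
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)
    samples = np.empty(trials)
    for t in range(trials):
        noisy = signal + rng.normal(0.0, noise_sigma, size=signal.shape)
        samples[t] = statistic(noisy)
    return samples
```

A histogram of the returned samples approximates the probability density of the statistic, and its spread indicates how sensitive that statistic is to noise.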
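
The co-occurrence matrix of note 7 can be sketched as follows. For simplicity, the distance and angle are replaced here by a single pixel displacement `offset = (d_row, d_col)`, which is an assumption of this sketch, as is the name `cooccurrence`. Each pixel pair is counted in both orders, which is what makes the result symmetric.

```python
import numpy as np

def cooccurrence(image, offset, levels):
    """Symmetric gray-level co-occurrence counts for one displacement."""
    image = np.asarray(image)
    d_row, d_col = offset
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=np.int64)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r, c], image[r2, c2]
                counts[a, b] += 1     # pair taken as (m, n)
                counts[b, a] += 1     # same pair taken as (n, m)
    return counts
```

Texture features such as contrast or energy are typically derived from this matrix after normalizing it to sum to one.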
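
The Hermite functions of note 8 can be evaluated numerically with NumPy, whose `numpy.polynomial.hermite` module uses the same (physicists') definition of the Hermite polynomials as the formula in the note. The following is a small sketch; `hermite_function` is a hypothetical helper name.

```python
import numpy as np
from numpy.polynomial.hermite import hermval

def hermite_function(n, x):
    """g_n(x) = H_n(x) * exp(-x^2 / 2), with H_n the Hermite polynomial."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                   # select H_n alone in the Hermite series
    x = np.asarray(x, dtype=float)
    return hermval(x, coeffs) * np.exp(-0.5 * x ** 2)
```

For example, H_1(x) = 2x and H_2(x) = 4x^2 - 2, so g_1(1) = 2e^{-1/2} and g_2(0) = -2.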
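
The layer structure described in note 9 corresponds to a simple forward pass. The sketch below assumes a tanh activation in the hidden layers and a linear output layer; both choices, as well as the name `mlp_forward`, are illustrative assumptions and are not specified in the note.

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Propagate input x through an MLP given per-layer weights and biases."""
    a = np.asarray(x, dtype=float)    # one value per input variable
    last = len(weights) - 1
    for i, (w, b) in enumerate(zip(weights, biases)):
        z = a @ w + b                 # weighted sum entering layer i
        a = z if i == last else np.tanh(z)   # hidden layers are non-linear
    return a
```

With three input variables, one hidden layer of four neurons and a single output, `weights` would hold a 3x4 and a 4x1 matrix.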
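
The two expressions in note 10 can be sketched numerically. `similarity_dimension` applies the practical relation directly, while `box_counting_dimension` approximates the limit by counting occupied square boxes (instead of spheres) at a few scales and fitting a line in log-log space. The box sizes, the function names and the requirement of a non-empty binary mask are assumptions of this sketch.

```python
import numpy as np

def similarity_dimension(pieces, magnification):
    """log(number of self-similar pieces) / log(magnification factor)."""
    return np.log(pieces) / np.log(magnification)

def box_counting_dimension(mask, sizes=(1, 2, 4, 8, 16)):
    """Slope of log N(s) against log(1/s) for a non-empty binary mask."""
    mask = np.asarray(mask, dtype=bool)
    counts = []
    for s in sizes:
        rows = -(-mask.shape[0] // s)         # ceiling division
        cols = -(-mask.shape[1] // s)
        padded = np.zeros((rows * s, cols * s), dtype=bool)
        padded[:mask.shape[0], :mask.shape[1]] = mask
        # A box is occupied if any pixel inside it belongs to the set.
        occupied = padded.reshape(rows, s, cols, s).any(axis=(1, 3))
        counts.append(occupied.sum())
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope
```

For the Koch curve, which consists of 4 self-similar pieces at a magnification factor of 3, `similarity_dimension(4, 3)` gives log 4 / log 3, roughly 1.26.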

References

Adams, R., & Bischof, L. (1994). Seeded region growing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(6), 641–647.
Ahuja, N., Rosenfeld, A., & Haralick, R. M. (1980). Neighbor gray levels as features in pixel classification. Pattern Recognition, 12(4), 251–260.
Atsalakis, A., Papamarkos, N., & Andreadis, I. (2002a). On estimation of the number of image principal colors and color reduction through self-organized neural networks. International Journal of Imaging Systems and Technology, 12(3), 117–127.
Atsalakis, A., Kroupis, N., Soudris, D., & Papamarkos, N. (2002b). A window-based color quantization technique and its embedded implementation. IEEE International Conference on Image Processing ICIP 2002, Rochester, USA.
Basu, S. (1987). Image segmentation by semantic method. Pattern Recognition, 20(5), 497–511.
Beaulieu, J. M., & Goldberg, M. (1989). Hierarchy in picture segmentation: A stepwise optimization approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(2), 150–163.
Bhanu, B., & Faugeras, O. D. (1982). Segmentation of images having unimodal distributions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 4(4), 408–419.
Bhanu, B., & Parvin, B. A. (1987). Segmentation of natural scenes. Pattern Recognition, 20(5), 487–496.
Bottou, L., Haffner, P., Howard, P., Simard, P., Bengio, Y., & LeCun, Y. (1998). High quality document image compression with DjVu. Journal of Electronic Imaging, 7(3), 410–425.
Braquelaire, J. P., & Brun, L. (1998). Image segmentation with topological maps and inter-pixel representation. Journal of Visual Communication and Image Representation, 9(1), 62–79.
Brodatz, P. (1966). Textures: A photographic album for artists and designers. 1 edn. Dover Publications, Inc., ASIN: B000ZGO6XQ.
Brown, M. (1999). Fisher’s Linear Discriminant.
Buckley, R., Venable, D., & McIntyre, L. (1997) (November 17–20). New developments in color facsimile and Internet fax. Proceedings of the Fifth Color Imaging Conference: Color Science, Systems, and Applications (pp. 296–300).
Campbell, N. W., Thomas, B. T., & Troscianko, T. (1997). Automatic segmentation and classification of outdoor images using neural networks. International Journal of Neural Systems, 8(1), 137–144.
Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6), 679–698.
Chan, F. H. Y., Lam, F. K., & Zhu, H. (1998). Adaptive thresholding by variational method. IEEE Transactions on Image Processing, 2(3), 168–174.
Chang, Y. L., & Li, X. (1994). Adaptive image region-growing. IEEE Transactions on Image Processing, 3(6), 868–873.
Chen, W., & Chen, S. (1998). Adaptive page segmentation for color technical journals’ cover images. Elsevier Image and Vision Computing, 16, 855–877.
Cheng, H., & Bouman, C. A. (1998). Trainable context model for multiscale segmentation. Proceedings of IEEE International Conference on Image Processing (ICIP 98) (vol. 1(October 4–7), pp. 610–614).
Cheng, H., Bouman, C. A., & Allebach, J. (1997) (May 18–23). Multiscale document segmentation. Proceedings of IS&T’s 50th Annual Conference (pp. 417–425).
Cheng, H., & Bouman, C. A. (2001). Document compression using rate-distortion optimized segmentation. Journal of Electronic Imaging, 10(2), 460–474.
Cheriet, M., Said, J. N., & Suen, C. Y. (1998). Recursive thresholding technique for image segmentation. IEEE Transactions on Image Processing, 7(6), 918–920.
Cho, K., & Meer, P. (1997). Image segmentation from consensus information. Computer Vision and Image Understanding, 68(1), 72–89.
Christopoulos, C., Ebrahimi, T., & Lee, S. U. (2002). JPEG2000 Special Issue. Elsevier signal processing: Image communication (vol. 17). Elsevier.
Comer, M. L., & Delp, E. J. (1999). Segmentation of textured images using a multi-resolution Gaussian autoregressive model. IEEE Transactions on Image Processing, 8(3), 408–420.
Davies, E. R. (1990). Machine vision: Theory, algorithms, practicalities. Academic Press. ISBN 978-0122060908.
DeQueiroz, R. L., Buckley, R., & Xu, M. (1999) (February). Mixed raster content (MRC) model for compound image compression. Proceedings IS&T/SPIE Symposium on Electronic Imaging, Visual Communications and Image Processing (vol. 3653, pp. 1106–1117).
Dubes, R. C. (1987). How many clusters are best?-an experiment. Pattern Recognition, 20(6), 645–663.
Dubes, R. C., & Jain, A. K. (1976). Clustering techniques: The user’s dilemma. Pattern Recognition, 8, 247–260.
Frigui, H., & Krishnapuram, R. (1999). A robust competitive clustering algorithm with applications in computer vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(5), 450–465.
Gambotto, J. P. (1993). A new approach to combining region growing and edge detection. Pattern Recognition Letters, 14(11), 869–875.
Gonzalez, R. C., & Woods, R. E. (1992). Digital image processing. 3 edn. Prentice Hall, ISBN: 978-0201508031.
Haddon, J. F., & Boyce, J. F. (1990). Image segmentation by unifying region and boundary information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(10), 929–948.
Haddon, J. F., & Boyce, J. F. (1993). Co-occurrence matrices for image analysis. Electronics and Communication Engineering Journal, 71–83.
Haddon, J. F., & Boyce, J. F. (1994). Texture classification of segmented regions of FLIR images using neural networks. Proceedings of the International Conference on Image Processing.
Haddon, J. F., & Boyce, J. F. (1998). Integrating spatio-temporal information in image sequence analysis for the enforcement of consistency of interpretation. Digital Signal Processing, special issue on image analysis and information fusion.
Haddon, J. F., Schneebeli, M., & Buser, O. (1997) (May 25–30). Automatic segmentation and classification using a co-occurrence based approach. Proceedings of the 2nd International Conference on Imaging Technologies: Techniques and Applications in Civil Engineering.
Hansen, M. W., & Higgins, W. E. (1997). Relaxation methods for supervised image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(9), 949–961.
Haralick, R., & Shapiro, L. (1991). Computer and robot vision (vol. 1). Addison-Wesley, ISBN: 978-0201108774.
Harrington, S. J., & Klassen, R. V. (1997) (October). Method of encoding an image at full resolution for storing in a reduced image buffer. Technical report. US Patent 5,682,249.
Hase, H., Shinokawa, T., Yoneda, M., & Suen, C. (2001). Character string extraction from color documents. Elsevier Pattern Recognition, 34(7), 1349–1365.
Hase, H., Yoneda, M., Tokai, S., Kato, J., & Suen, C. (2003). Color segmentation for text extraction. International Journal on Document Analysis and Recognition, 6(4), 271–284.
He, H., & Chen, Y. Q. (2000). Unsupervised texture segmentation using resonance algorithm for natural scenes. Pattern Recognition Letters, 21, 741–757.
Hojjatoleslami, S. A., & Kittler, J. (1998). Region growing: A new approach. IEEE Transactions on Image Processing, 7(7), 1079–1084.
Hough, P. V. C. (1959). Machine analysis of bubble chamber pictures. International Conference on High Energy Accelerators and Instrumentation.
Huang, J., Wang, Y., & Wong, E. K. (1998). Check image compression using a layered coding method. Journal of Electronic Imaging, 7(3), 426–442.
ISO-IEC. (2000a) (December). Information technology—JPEG 2000 image coding system—Part 1: Core coding system, ISO/IEC International Standard 15444–1. Technical report. ISO/IEC.
ISO-IEC-CCITT. (1993a). Information Technology—Digital Compression and Coding of Continuous-Tone Still Images—Requirements and Guidelines, ISO/IEC International Standard 10918-1, CCITT Recommendation T.81. Technical report ISO/IEC/CCITT.
ISO-IEC-ITU. (1993). JBIG, Progressive bi-level image compression, ISO/IEC International Standard 11544 and ITU Recommendation T.82. Technical report. ISO/IEC/ITU.
ISO-IEC-ITU. (1996). JPEG-3, Information Technology-Digital Compression and Coding of Continuous-Tone Still Images: Extensions, ISO/IEC 10918-3, ITU-T Recommendation T.84. Technical report. ISO/IEC/ITU.
ISO-IEC-ITU. (2000). JBIG2, ISO/CEI International Standard 14492 and ITU-T Recommendation T.88. Technical report. ISO/IEC/ITU.
Jain, A. K., & Dubes, R. C. (1988). Algorithms for clustering data. Englewood Cliffs, N.J.: Prentice Hall.
Jung, K., & Seiler, R. (2003). Segmentation and compression of documents with JPEG2000. IEEE Transactions on Consumer Electronics, 49(4), 802–807.
Kalman, M., Keslassy, I., Wang, D., & Girod, B. (2001) (October 7–10). Classification of compound images based on transform coefficient likelihood. Proceedings of the IEEE International Conference on Image Processing (vol. 1, pp. 750–753).
Konstantinides, K., & Tretter, D. (1998) (October 4–7). A method for variable quantization in JPEG for improved text quality in compound documents. Proceedings of IEEE International Conference on Image Processing (ICIP 98) (vol. 2, pp. 565–568).
Konstantinides, K., & Tretter, D. (2000). A JPEG variable quantization method for compound documents. IEEE Transactions on Image Processing, Correspondence, 9(7), 1282–1287.
Kurita, T. (1991). An efficient agglomerative clustering algorithm using a heap. Pattern Recognition, 24(3), 205–209.
Laws, K. I. (1980) (January). Textured image segmentation. Ph.D. thesis, University of Southern California.
Li, L., Gong, J., & Chen, W. (1997). Gray-level image thresholding based on Fisher linear projection of two-dimensional histogram. Pattern Recognition, 30(5), 743–749.
Lu, S. W., & Xu, H. (1995). Textured image segmentation using autoregressive model and artificial neural network. Pattern Recognition, 28(12), 1807–1817.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. Henry Holt and Co., ISBN: 0716715678.
Medioni, G. G., & Yasumoto, Y. (1984). A note on using the fractal dimension for segmentation. Proceedings of 2nd IEEE Computer Vision Workshop (pp. 25–30).
Mehnert, A., & Jackway, P. (1997). An improved seeded region growing algorithm. Pattern Recognition Letters, 18, 1065–1071.
Memon, N., & Tretter, D. (2000) (February). A method for variable quantization in JPEG for improved perceptual quality. International Conference on Visual Communications and Image Processing.
Murata, K. (1996) (July). Image data compression and expansion apparatus, and image area discrimination processing apparatus therefore. Technical report. US Patent 5,535,013.
Ng, M. K. (2000). A note on constrained k-means algorithm. Pattern Recognition, 33, 515–519.
Ohlander, R. B. (1975). Analysis of natural scenes. Ph.D. thesis, Carnegie Institute of Technology, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA, USA.
Ohm, J. R., & Ma, P. (1997). Feature-based cluster segmentation of image sequences. Proceedings of the IEEE International Conference on Image Processing (pp. 178–181).
Ojala, T., & Pietikäinen, M. (1999). Unsupervised texture segmentation using feature distributions. Pattern Recognition, 32, 477–486.
Otsu, N. (1979). A threshold selection method from grey level histograms. IEEE Transactions on Systems, Man and Cybernetics, 9(1), 62–66.
Papamarkos, N. (1999). Color reduction using local features and a Kohonen Self-Organized Feature Map neural network. International Journal of Imaging Systems and Technology, 10(5), 404–409.
Papamarkos, N., & Atsalakis, A. (2000). Gray-level reduction using local spatial features. Computer Vision and Image Understanding, 78(3), 336–350.
Papamarkos, N., & Gatos, B. (1994). A new approach for multilevel threshold selection. Computer Vision, Graphics, and Image Processing-Graphical Models and Image Processing, 56(5), 357–370.
Papamarkos, N., & Strouthopoulos, C. (2000). Multithresholding of mixed type documents. Engineering Applications of Artificial Intelligence, 13, 323–343.
Papamarkos, N., Strouthopoulos, C., & Andreadis, I. (2000). Multithresholding of color and gray-level images through a neural network technique. Image and Vision Computing, 18, 213–222.
Papamarkos, N., Atsalakis, A., & Strouthopoulos, C. (2002). Adaptive color reduction. IEEE Transactions on Systems, Man, and Cybernetics-Part B, 32(1), 44–56.
Pauwels, J., & Frederix, G. (1999). Finding salient regions in images: Nonparametric clustering for image segmentation and grouping. Computer Vision and Image Understanding, 75, 73–85.
Pavlidis, T. (1982). Algorithms for graphics and image processing. Berlin Heidelberg: Springer. ISBN 978-3642932106.
Pennebaker, W. B., & Mitchell, J. L. (1993). JPEG Still Image Compression Standard. New York: Springer.
Pentland, A. (1984). Fractal-based description of natural scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6), 661–674.
Perkins, W. A. (1980). Area segmentation of images using edge points. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2(1), 8–15.
Perlmutter, K., Chaddha, N., Buckheit, J., Gray, R., & Olshen, R. (1996) (May 7–10). Text segmentation in mixed-mode images using classification trees and transform tree-structured vector quantization. ICASSP 1996 (vol. 4, pp. 2231–2234).
Perner, P. (1999). An architecture for a CBR image segmentation system. Engineering Applications of Artificial Intelligence, 12, 749–759.
Pietikäinen, M., & Okun, O. (2001) (September 10–13). Edge-based method for text detection from complex document images. Proceedings of the 6th IEEE International Conference on Document Analysis and Recognition (pp. 286–291).
Prager, J. M. (1980). Extracting and labeling boundary segments in natural scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2(1), 16–27.
Ramos, M., & DeQueiroz, R. L. (1999) (October). Adaptive rate-distortion-based thresholding: Application in JPEG compression of mixed images for printing. Proceedings of IEEE International Conference on Image Processing (ICIP 99) (pp. 25–28).
Rosenfeld, A., Hummel, R., & Zucker, S. (1976). Scene labeling by relaxation operations. IEEE Transactions on Systems, Man and Cybernetics, 6(6), 420–433.
Rosenholtz, R., & Watson, A. (1996) (September 16–19). Perceptual adaptive JPEG coding. IEEE International Conference on Image Processing (vol. 1, pp. 901–904).
Said, A., & Pearlman, W. A. (1996). A new fast and efficient image codec based on set partitioning in hierarchical trees. IEEE Transactions on Circuits and Systems for Video Technology, 6(3), 243–250.
Singh, S., & Al-Mansoori, R. (2000). Identification of regions of interest in digital mammograms. Journal of Intelligent Systems, 10(2), 183–217.
Sobottka, K., Kronenberg, H., Perroud, T., & Bunke, H. (2000). Text extraction from colored book and journal covers. International Journal on Document Analysis and Recognition, 2, 163–176.
Taubman, D. S., & Marcellin, M. W. (2002). JPEG2000 Image Compression Fundamentals, Standards and Practice. Kluwer Academic Publishers, ASIN: B011DB6NGY.
Todoran, L., & Worring, M. (1999). Segmentation of color documents by line oriented clustering using spatial information. International Conference on Document Analysis and Recognition ICDAR’99 (pp. 67–70).
Tversky, A. (1977). Features of similarity. Psychological Review, 84(4), 327–350.
Vernon, D. (1991). Machine vision: Automated visual inspection and robot vision. Prentice-Hall, ISBN: 978-0135433980.
Wallace, G. (1991). The JPEG still picture compression standard. Communications of the ACM, 34(4), 30–44.
Xu, Y., Olman, V., & Uberbacher, E. C. (1998). A segmentation algorithm for noisy images: Design and evaluation. Pattern Recognition Letters, 19, 1213–1224.
Yarman-Vural, F., & Ataman, E. (1987). Noise, histogram and cluster validity for Gaussian-mixtured data. Pattern Recognition, 20(4), 385–401.
Yeung, M., Yeo, B. L., & Liu, B. (1998). Segmentation of video by clustering and graph analysis. Computer Vision and Image Understanding, 71(1), 94–109.
Yoshimura, M., & Oe, S. (1999). Evolutionary segmentation of texture image using genetic algorithms towards automatic decision of optimum number of segmentation areas. Pattern Recognition, 32, 2041–2054.
Zahid, N., Limouri, M., & Essaid, A. (1999). A new cluster validity for fuzzy clustering. Pattern Recognition, 32, 1089–1097.

Author information

Authors and Affiliations

Athena Research Center, University Campus at Kimmeria, GR-67100, Xanthi, Greece
George Pavlidis

Corresponding author

Correspondence to George Pavlidis.

About this chapter

Cite this chapter

Pavlidis, G. (2017). Segmentation of Digital Images. In: Mixed Raster Content. Signals and Communication Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-2830-4_3

DOI: https://doi.org/10.1007/978-981-10-2830-4_3
Published: 03 November 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-2829-8
Online ISBN: 978-981-10-2830-4
eBook Packages: Engineering, Engineering (R0)
