Abstract
Most of the world’s cultural and scientific heritage is still available in printed form. Over the past decades, a number of actions and projects have targeted the digitization of printed material and the disclosure of this wealth of content through Web technologies. One obstacle to this distribution is the lack of an efficient method for managing the large volume of data involved in archives of scanned documents. Since digital image segmentation is of paramount importance in mixed raster content compression, this chapter introduces the field of segmentation and briefly describes known segmentation techniques and their applications.
—We bring the interpretation process into awareness through tricks. First, we degrade the image, making interpretation difficult. Second, we provide competing organizations, making possible several conflicting interpretations of the same image. Third, we provide organization without meaning to see how past experience affects the process.
Peter Lindsay and Donald Norman
Notes
- 1.
The term ‘requires’ is used to denote the requirements relating to the HVS modeling.
- 2.
The Fisher linear discriminant (Brown 1999) is a clustering method in which high-dimensional data are projected onto a line and clustering is applied in the resulting one-dimensional space. The projection maximizes the distance between the class means while, at the same time, minimizing the variance within each class. The Fisher criterion, which is maximized over all line projections w, is defined for two classes as \(J(w)=\frac{|m_1-m_2|^2}{\sigma _1^2+\sigma _2^2}\), where \(m_i\) is the mean and \(\sigma _i^2\) the variance of class i after projection.
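As a rough illustration of the criterion in this note, the sketch below evaluates J(w) for a given projection direction on two small synthetic classes. The function name `fisher_criterion`, the sample points, and the use of NumPy are assumptions made for this example; they do not come from the chapter.

```python
import numpy as np

def fisher_criterion(class1, class2, w):
    """Evaluate J(w) = |m1 - m2|^2 / (s1^2 + s2^2) for the projection onto w."""
    w = np.asarray(w, dtype=float)
    w = w / np.linalg.norm(w)                    # only the direction of w matters
    p1 = np.asarray(class1, dtype=float) @ w     # 1-D projections of class 1
    p2 = np.asarray(class2, dtype=float) @ w     # 1-D projections of class 2
    m1, m2 = p1.mean(), p2.mean()                # projected class means
    s1, s2 = p1.var(), p2.var()                  # projected class variances
    return (m1 - m2) ** 2 / (s1 + s2)

# Hypothetical data: two classes that differ only along the second axis.
a = np.array([[0.0, 0.0], [1.0, 0.1], [2.0, -0.1], [3.0, 0.0]])
b = a + np.array([0.0, 5.0])

j_vertical = fisher_criterion(a, b, [0.0, 1.0])    # projects onto the separating axis
j_horizontal = fisher_criterion(a, b, [1.0, 0.0])  # projects onto the shared axis
```

With this data the projection onto the second axis gives a much larger J than the projection onto the first, where both class means coincide. A full discriminant would search over w rather than compare two fixed directions.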
- 3.
Dynamic Programming is a method for problem solving, in which the solution path is designed in reverse (from the required outcome backwards to the beginning).
- 4.
Region growing methods are bottom-up approaches, since they start from a single pixel and scale up to the overall image, whereas region splitting methods are top-down approaches, since they start by examining the whole image and scale down to the single pixel.
- 5.
The term feature images denotes images that consist only of detected features and not typical pixel values.
- 6.
In essence, Monte-Carlo statistics support the creation of a probability density function for the study of the effect of noise in the data. Monte-Carlo processes may prove useful in the study of the characteristics of a distribution, which are affected by noise, along with the study of characteristics that are crucial for the interpretation of the data.
- 7.
This corresponds to a family of methodologies based on the analysis of repeated texture structures. In general, a number of matrices are calculated for a texture, and various other features are derived from them. The co-occurrence matrix is defined by a distance d and an angle \(\theta \), and its mathematical expression is: \(C_{\theta ,d}(x,y)=|\{(m,n) \in (M \times N) \times (M \times N): d(m,n)=d,\; \tan ^{-1}(m-n) \in \{\theta , \pi - \theta \},\; f(m)=x,\; f(n)=y\}|\), where d(m,n) is a distance measure, usually \(d[(a,b),(c,d)]=\max (|a-c|,|b-d|)\), and \(f: M \times N \mapsto \{0,\ldots ,255\}\) is the image under analysis. This definition guarantees that the matrix is symmetric.
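A minimal sketch of how such a matrix might be computed is given here, under simplifying assumptions: the distance and angle are replaced by a single pixel displacement `offset = (dr, dc)`, and each pair is counted in both orders so that the result comes out symmetric. The function name `cooccurrence` and the tiny three-level image are hypothetical.

```python
import numpy as np

def cooccurrence(image, offset, levels):
    """Count grey-level pairs (x, y) found at pixels separated by +/- offset."""
    img = np.asarray(image)
    dr, dc = offset
    rows, cols = img.shape
    C = np.zeros((levels, levels), dtype=np.int64)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                x, y = img[r, c], img[r2, c2]
                C[x, y] += 1   # pair counted as (x, y)
                C[y, x] += 1   # same pair counted in the reverse order
    return C

# Hypothetical 3x3 image with grey levels 0..2, horizontal neighbours.
img = np.array([[0, 0, 1],
                [1, 2, 2],
                [0, 1, 2]])
C = cooccurrence(img, offset=(0, 1), levels=3)
```

Texture features such as contrast or energy are then typically derived from the normalized matrix; those are left out of this sketch.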
- 8.
Gabor stated, but did not prove, that among all real-valued functions the Hermite functions have the smallest uncertainty product. They are given by \(g_n(x)=H_n(x)e^{-\frac{1}{2}x^2}\), where \(H_n(x)=(-1)^n e^{x^2} \frac{\mathrm {d}^n}{\mathrm {d}x^n} e^{-x^2}\) are the Hermite polynomials.
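The functions in this note can be evaluated numerically as a quick check of the formula. The sketch below assumes NumPy, whose `numpy.polynomial.hermite` module uses the same (physicists') convention for \(H_n\); the helper name `hermite_function` is introduced only for illustration and no normalization is applied.

```python
import numpy as np
from numpy.polynomial.hermite import hermval

def hermite_function(n, x):
    """Evaluate g_n(x) = H_n(x) * exp(-x^2 / 2), unnormalized."""
    x = np.asarray(x, dtype=float)
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                      # select the single polynomial H_n
    return hermval(x, coeffs) * np.exp(-0.5 * x ** 2)

g0_at_0 = hermite_function(0, 0.0)       # H_0 = 1
g1_at_1 = hermite_function(1, 1.0)       # H_1(x) = 2x
g2_at_1 = hermite_function(2, 1.0)       # H_2(x) = 4x^2 - 2
```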
- 9.
The multilayer perceptron (MLP) is the most common type of neural network. It is simple, yet rests on a powerful mathematical basis. The input data pass through layers of neurons, with the input layer consisting of as many neurons as there are variables in the problem. The output layer has as many neurons as there are desired output variables (often just one). The intermediate layers are called hidden layers.
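The layered structure described in this note can be sketched as a plain forward pass. Everything specific here is an assumption for the sake of the example: the layer sizes (3 inputs, 4 hidden neurons, 1 output), the tanh activation on the hidden layer, the linear output, and the random untrained weights.

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Propagate x through the layers: tanh on hidden layers, linear output."""
    a = np.asarray(x, dtype=float)
    last = len(weights) - 1
    for k, (W, b) in enumerate(zip(weights, biases)):
        z = W @ a + b                     # weighted sum entering layer k
        a = z if k == last else np.tanh(z)
    return a

# Hypothetical network: 3 input variables, one hidden layer of 4, 1 output.
sizes = [3, 4, 1]
rng = np.random.default_rng(0)
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

y = mlp_forward([0.5, -1.0, 2.0], weights, biases)
```

Training (for instance by backpropagation) is outside the scope of this sketch; it only shows how the number of input and output neurons follows from the problem variables.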
- 10.
Let S be a closed subspace and, for any \(\epsilon >0\), let \(N(\epsilon )\) be the minimum number of spheres of radius \(\le \epsilon \) necessary to cover all of S. If there exists a \({\delta }\) such that \({\delta } = -\lim _{\epsilon \rightarrow 0^{+}} \frac{\log N(\epsilon )}{\log \epsilon }\), then \({\delta }\) is called the fractal dimension of S. In other words, the fractal dimension can be calculated as the limit of the ratio of the logarithm of the change in the size of an object to the logarithm of the change in the measurement scale, as this scale tends to zero. In practice the following relation is used: \({\delta }=\log (\mathrm {number\;of\;self-similar\;pieces})/\log (\mathrm {magnification\;factor})\).
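The limit in this note can be approximated on a binary image by counting occupied square boxes at several scales and fitting the slope of \(\log N(\epsilon )\) against \(\log \epsilon \). The sketch below is one such approximation, using boxes instead of spheres; the function names and the Sierpinski carpet test pattern are assumptions chosen because the expected value, \(\log 8/\log 3\), follows directly from the practical relation (8 self-similar pieces at magnification 3).

```python
import numpy as np

def sierpinski_carpet(k):
    """Boolean 3^k x 3^k mask of a Sierpinski carpet after k subdivisions."""
    n = 3 ** k
    idx = np.arange(n)
    i, j = idx[:, None], idx[None, :]
    mask = np.ones((n, n), dtype=bool)
    for _ in range(k):
        # Remove pixels whose base-3 digits are both 1 at this level.
        mask &= ~((i % 3 == 1) & (j % 3 == 1))
        i, j = i // 3, j // 3
    return mask

def box_count(mask, box):
    """Number of box x box cells containing at least one set pixel.

    Assumes a square mask whose side is a multiple of box.
    """
    n = mask.shape[0]
    blocks = mask.reshape(n // box, box, n // box, box)
    return int(blocks.any(axis=(1, 3)).sum())

def box_counting_dimension(mask, boxes):
    """Estimate delta as minus the slope of log N(eps) versus log eps."""
    sizes = np.array(boxes, dtype=float)
    counts = np.array([box_count(mask, b) for b in boxes], dtype=float)
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope

carpet = sierpinski_carpet(4)                        # 81 x 81 pixels
delta = box_counting_dimension(carpet, boxes=[1, 3, 9, 27])
```

On real textures the points rarely fall on an exact line, so the estimate depends on the range of box sizes chosen.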
References
Adams, R., & Bischof, L. (1994). Seeded region growing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(6), 641–647.
Ahuja, N., Rosenfeld, A., & Haralick, R. M. (1980). Neighbor gray levels as features in pixel classification. Pattern Recognition, 12(4), 251–260.
Atsalakis, A., Papamarkos, N., & Andreadis, I. (2002a). On estimation of the number of image principal colors and color reduction through self-organized neural networks. International Journal of Imaging Systems and Technology, 12(3), 117–127.
Atsalakis, A., Kroupis, N., Soudris, D., & Papamarkos, N. (2002b). A window-based color quantization technique and its embedded implementation. IEEE International Conference on Image Processing ICIP 2002, Rochester, USA.
Basu, S. (1987). Image segmentation by semantic method. Pattern Recognition, 20(5), 497–511.
Beaulieu, J. M., & Goldberg, M. (1989). Hierarchy in picture segmentation: A stepwise optimization approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(2), 150–163.
Bhanu, B., & Faugeras, O. D. (1982). Segmentation of images having unimodal distributions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 4(4), 408–419.
Bhanu, B., & Parvin, B. A. (1987). Segmentation of natural scenes. Pattern Recognition, 20(5), 487–496.
Bottou, L., Haffner, P., Howard, P., Simard, P., Bengio, Y., & LeCun, Y. (1998). High quality document image compression with DjVu. Journal of Electronic Imaging, 7(3), 410–425.
Braquelaire, J. P., & Brun, L. (1998). Image segmentation with topological maps and inter-pixel representation. Journal of Visual Communication and Image Representation, 9(1), 62–79.
Brodatz, P. (1966). Textures: A photographic album for artists and designers. 1 edn. Dover Publications, Inc., ASIN: B000ZGO6XQ.
Brown, M. (1999). Fisher’s Linear Discriminant.
Buckley, R., Venable, D., & McIntyre, L. (1997) (November 17–20). New developments in color facsimile and Internet fax. Proceedings of the Fifth Color Imaging Conference: Color Science, Systems, and Applications (pp. 296–300).
Campbell, N. W., Thomas, B. T., & Troscianko, T. (1997). Automatic segmentation and classification of outdoor images using neural networks. International Journal of Neural Systems, 8(1), 137–144.
Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6), 679–698.
Chan, F. H. Y., Lam, F. K., & Zhu, H. (1998). Adaptive thresholding by variational method. IEEE Transactions on Image Processing, 2(3), 168–174.
Chang, Y. L., & Li, X. (1994). Adaptive image region-growing. IEEE Transactions on Image Processing, 3(6), 868–873.
Chen, W., & Chen, S. (1998). Adaptive page segmentation for color technical journals’ cover images. Elsevier Image and Vision Computing, 16, 855–877.
Cheng, H., & Bouman, C. A. (1998). Trainable context model for multiscale segmentation. Proceedings of IEEE International Conference on Image Processing (ICIP 98) (vol. 1(October 4–7), pp. 610–614).
Cheng, H., Bouman, C. A., & Allebach, J. (1997) (May 18–23). Multiscale document segmentation. Proceedings of IS&T’s 50th Annual Conference (pp. 417–425).
Cheng, H., & Bouman, C. A. (2001). Document compression using rate-distortion optimized segmentation. Journal of Electronic Imaging, 10(2), 460–474.
Cheriet, M., Said, J. N., & Suen, C. Y. (1998). Recursive thresholding technique for image segmentation. IEEE Transactions on Image Processing, 7(6), 918–920.
Cho, K., & Meer, P. (1997). Image segmentation from consensus information. Computer Vision and Image Understanding, 68(1), 72–89.
Christopoulos, C., Ebrahimi, T., & Lee, S. U. (2002). JPEG2000 Special Issue. Elsevier signal processing: Image communication (vol. 17). Elsevier.
Comer, M. L., & Delp, E. J. (1999). Segmentation of textured images using a multi-resolution Gaussian autoregressive model. IEEE Transactions on Image Processing, 8(3), 408–420.
Davies, E. R. (1990). Machine vision: Theory, algorithms, practicalities. Academic Press. ISBN: 978-0122060908.
DeQueiroz, R. L., Buckley, R., & Xu, M. (1999) (February). Mixed raster content (MRC) model for compound image compression. Proceedings IS&T/SPIE Symposium on Electronic Imaging, Visual Communications and Image Processing (vol. 3653, pp. 1106–1117).
Dubes, R. C. (1987). How many clusters are best?-an experiment. Pattern Recognition, 20(6), 645–663.
Dubes, R. C., & Jain, A. K. (1976). Clustering techniques: The user’s dilemma. Pattern Recognition, 8, 247–260.
Frigui, H., & Krishnapuram, R. (1999). A robust competitive clustering algorithm with applications in computer vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(5), 450–465.
Gambotto, J. P. (1993). A new approach to combining region growing and edge detection. Pattern Recognition Letters, 14(11), 869–875.
Gonzalez, R. C., & Woods, R. E. (1992). Digital image processing. 3 edn. Prentice Hall, ISBN: 978-0201508031.
Haddon, J. F., & Boyce, J. F. (1990). Image segmentation by unifying region and boundary information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(10), 929–948.
Haddon, J. F., & Boyce, J. F. (1993). Co-occurrence matrices for image analysis. Electronics and Communication Engineering Journal, 71–83.
Haddon, J. F., & Boyce, J. F. (1994). Texture classification of segmented regions of FLIR images using neural networks. Proceedings of the International Conference on Image Processing.
Haddon, J. F., & Boyce, J. F. (1998). Integrating spatio-temporal information in image sequence analysis for the enforcement of consistency of interpretation. Digital Signal Processing, special issue on image analysis and information fusion.
Haddon, J. F., Schneebeli, M., & Buser, O. (1997) (May 25–30). Automatic segmentation and classification using a co-occurrence based approach. Proceedings of the 2nd International Conference on Imaging Technologies: Techniques and Applications in Civil Engineering.
Hansen, M. W., & Higgins, W. E. (1997). Relaxation methods for supervised image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(9), 949–961.
Haralick, R., & Shapiro, L. (1991). Computer and robot vision (vol. 1). Addison-Wesley, ISBN: 978-0201108774.
Harrington, S. J., & Klassen, R. V. (1997) (October). Method of encoding an image at full resolution for storing in a reduced image buffer. Technical report. US Patent 5,682,249.
Hase, H., Shinokawa, T., Yoneda, M., & Suen, C. (2001). Character string extraction from color documents. Elsevier Pattern Recognition, 34(7), 1349–1365.
Hase, H., Yoneda, M., Tokai, S., Kato, J., & Suen, C. (2003). Color segmentation for text extraction. International Journal on Document Analysis and Recognition, 6(4), 271–284.
He, H., & Chen, Y. Q. (2000). Unsupervised texture segmentation using resonance algorithm for natural scenes. Pattern Recognition Letters, 21, 741–757.
Hojjatoleslami, S. A., & Kittler, J. (1998). Region growing: A new approach. IEEE Transactions on Image Processing, 7(7), 1079–1084.
Hough, P. V. C. (1959). Machine analysis of bubble chamber pictures. International Conference on High Energy Accelerators and Instrumentation.
Huang, J., Wang, Y., & Wong, E. K. (1998). Check image compression using a layered coding method. Journal of Electronic Imaging, 7(3), 426–442.
ISO-IEC. (2000a) (December). Information technology - JPEG 2000 image coding system - Part 1: Core coding system, ISO/IEC International Standard 15444-1. Technical report. ISO/IEC.
ISO-IEC-CCITT. (1993a). Information Technology—Digital Compression and Coding of Continuous-Tone Still Images—Requirements and Guidelines, ISO/IEC International Standard 10918-1, CCITT Recommendation T.81. Technical report ISO/IEC/CCITT.
ISO-IEC-ITU. (1993). JBIG, Progressive bi-level image compression, ISO/IEC International Standard 11544 and ITU Recommendation T.82. Technical report. ISO/IEC/ITU.
ISO-IEC-ITU. (1996). JPEG-3, Information Technology-Digital Compression and Coding of Continuous-Tone Still Images: Extensions, ISO/IEC 10918-3, ITU-T Recommendation T.84. Technical report. ISO/IEC/ITU.
ISO-IEC-ITU. (2000). JBIG2, ISO/IEC International Standard 14492 and ITU-T Recommendation T.88. Technical report. ISO/IEC/ITU.
Jain, A. K., & Dubes, R. C. (1988). Algorithms for clustering data. Englewood Cliffs, N.J.: Prentice Hall.
Jung, K., & Seiler, R. (2003). Segmentation and compression of documents with JPEG2000. IEEE Transactions on Consumer Electronics, 49(4), 802–807.
Kalman, M., Keslassy, I., Wang, D., & Girod, B. (2001) (October 7–10). Classification of compound images based on transform coefficient likelihood. Proceedings of the IEEE International Conference on Image Processing (vol. 1, pp. 750–753).
Konstantinides, K., & Tretter, D. (1998) (October 4–7). A method for variable quantization in JPEG for improved text quality in compound documents. Proceedings of IEEE International Conference on Image Processing (ICIP 98) (vol. 2, pp. 565–568).
Konstantinides, K., & Tretter, D. (2000). A JPEG variable quantization method for compound documents. IEEE Transactions on Image Processing, Correspondence, 9(7), 1282–1287.
Kurita, T. (1991). An efficient agglomerative clustering algorithm using a heap. Pattern Recognition, 24(3), 205–209.
Laws, K. I. (1980) (January). Textured image segmentation. Ph.D. thesis, University of Southern California.
Li, L., Gong, J., & Chen, W. (1997). Gray-level image thresholding based on Fisher linear projection of two-dimensional histogram. Pattern Recognition, 30(5), 743–749.
Lu, S. W., & Xu, H. (1995). Textured image segmentation using autoregressive model and artificial neural network. Pattern Recognition, 28(12), 1807–1817.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. Henry Holt and Co., ISBN: 0716715678.
Medioni, G. G., & Yasumoto, Y. (1984). A note on using the fractal dimension for segmentation. Proceedings of 2nd IEEE Computer Vision Workshop (pp. 25–30).
Mehnert, A., & Jackway, P. (1997). An improved seeded region growing algorithm. Pattern Recognition Letters, 18, 1065–1071.
Memon, N., & Tretter, D. (2000) (February). A method for variable quantization in JPEG for improved perceptual quality. International Conference on Visual Communications and Image Processing.
Murata, K. (1996) (July). Image data compression and expansion apparatus, and image area discrimination processing apparatus therefore. Technical report. US Patent 5,535,013.
Ng, M. K. (2000). A note on constrained k-means algorithm. Pattern Recognition, 33, 515–519.
Ohlander, R. B. (1975). Analysis of natural scenes. Ph.D. thesis, Carnegie Institute of Technology, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA, USA.
Ohm, J. R., & Ma, P. (1997). Feature-based cluster segmentation of image sequences. Proceedings of the IEEE International Conference on Image Processing (pp. 178–181).
Ojala, T., & Pietikäinen, M. (1999). Unsupervised texture segmentation using feature distributions. Pattern Recognition, 32, 477–486.
Otsu, N. (1979). A threshold selection method from grey level histograms. IEEE Transactions on Systems, Man and Cybernetics, 9(1), 62–66.
Papamarkos, N. (1999). Color reduction using local features and a Kohonen Self-Organized Feature Map neural network. International Journal of Imaging Systems and Technology, 10(5), 404–409.
Papamarkos, N., & Atsalakis, A. (2000). Gray-level reduction using local spatial features. Computer Vision and Image Understanding, 78(3), 336–350.
Papamarkos, N., & Gatos, B. (1994). A new approach for multilevel threshold selection. Computer Vision, Graphics, and Image Processing-Graphical Models and Image Processing, 56(5), 357–370.
Papamarkos, N., & Strouthopoulos, C. (2000). Multithresholding of mixed type documents. Engineering Applications of Artificial Intelligence, 13, 323–343.
Papamarkos, N., Strouthopoulos, C., & Andreadis, I. (2000). Multithresholding of color and gray-level images through a neural network technique. Image and Vision Computing, 18, 213–222.
Papamarkos, N., Atsalakis, A., & Strouthopoulos, C. (2002). Adaptive color reduction. IEEE Transactions on Systems, Man, and Cybernetics-Part B, 32(1), 44–56.
Pauwels, J., & Frederix, G. (1999). Finding salient regions in images: Nonparametric clustering for image segmentation and grouping. Computer Vision and Image Understanding, 75, 73–85.
Pavlidis, T. (1982). Algorithms for graphics and image processing. Berlin Heidelberg: Springer. ISBN 978-3642932106.
Pennebaker, W. B., & Mitchell, J. L. (1993). JPEG Still Image Compression Standard. New York: Springer.
Pentland, A. (1984). Fractal-based description of natural scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6), 661–674.
Perkins, W. A. (1980). Area segmentation of images using edge points. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2(1), 8–15.
Perlmutter, K., Chaddha, N., Buckheit, J., Gray, R., & Olshen, R. (1996) (May 7–10). Text segmentation in mixed-mode images using classification trees and transform tree-structured vector quantization. ICASSP 1996 (vol. 4, pp. 2231–2234).
Perner, P. (1999). An architecture for a CBR image segmentation system. Engineering Applications of Artificial Intelligence, 12, 749–759.
Pietikäinen, M., & Okun, O. (2001) (September 10–13). Edge-based method for text detection from complex document images. Proceedings of the 6th IEEE International Conference on Document Analysis and Recognition (pp. 286–291).
Prager, J. M. (1980). Extracting and labeling boundary segments in natural scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2(1), 16–27.
Ramos, M., & DeQueiroz, R. L. (1999) (October). Adaptive rate-distortion-based thresholding: Application in JPEG compression of mixed images for printing. Proceedings of IEEE International Conference on Image Processing (ICIP 99) (pp. 25–28).
Rosenfeld, A., Hummel, R., & Zucker, S. (1976). Scene labeling by relaxation operations. IEEE Transactions on Systems, Man and Cybernetics, 6(6), 420–433.
Rosenholtz, R., & Watson, A. (1996) (September 16–19). Perceptual adaptive JPEG coding. IEEE International Conference on Image Processing (vol. 1, pp. 901–904).
Said, A., & Pearlman, W. A. (1996). A new fast and efficient image codec based on set partitioning in hierarchical trees. IEEE Transactions on Circuits and Systems for Video Technology, 6(3), 243–250.
Singh, S., & Al-Mansoori, R. (2000). Identification of regions of interest in digital mammograms. Journal of Intelligent Systems, 10(2), 183–217.
Sobottka, K., Kronenberg, H., Perroud, T., & Bunke, H. (2000). Text extraction from colored book and journal covers. International Journal on Document Analysis and Recognition, 2, 163–176.
Taubman, D. S., & Marcellin, M. W. (2002). JPEG2000 Image Compression Fundamentals, Standards and Practice. Kluwer Academic Publishers, ASIN: B011DB6NGY.
Todoran, L., & Worring, M. (1999). Segmentation of color documents by line oriented clustering using spatial information. International Conference on Document Analysis and Recognition ICDAR’99 (pp. 67–70).
Tversky, A. (1977). Features of similarity. Psychological Review, 84(4), 327–350.
Vernon, D. (1991). Machine vision: Automated visual inspection and robot vision. Prentice-Hall, ISBN: 978-0135433980.
Wallace, G. (1991). The JPEG still picture compression standard. Communications of the ACM, 34(4), 30–44.
Xu, Y., Olman, V., & Uberbacher, E. C. (1998). A segmentation algorithm for noisy images: Design and evaluation. Pattern Recognition Letters, 19, 1213–1224.
Yarman-Vural, F., & Ataman, E. (1987). Noise, histogram and cluster validity for Gaussian-mixtured data. Pattern Recognition, 20(4), 385–401.
Yeung, M., Yeo, B. L., & Liu, B. (1998). Segmentation of video by clustering and graph analysis. Computer Vision and Image Understanding, 71(1), 94–109.
Yoshimura, M., & Oe, S. (1999). Evolutionary segmentation of texture image using genetic algorithms towards automatic decision of optimum number of segmentation areas. Pattern Recognition, 32, 2041–2054.
Zahid, N., Limouri, M., & Essaid, A. (1999). A new cluster validity for fuzzy clustering. Pattern Recognition, 32, 1089–1097.
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Pavlidis, G. (2017). Segmentation of Digital Images. In: Mixed Raster Content. Signals and Communication Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-2830-4_3
DOI: https://doi.org/10.1007/978-981-10-2830-4_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-2829-8
Online ISBN: 978-981-10-2830-4
eBook Packages: Engineering, Engineering (R0)