Semantic Understanding and Task-Oriented for Image Assessment

Tsai, Cheng-Min; Guan, Shin-Shen; Tsai, Wang-Chin; Zhang, Zhi-Hua

doi:10.1007/978-3-319-92034-4_30

Conference paper
First Online: 01 June 2018

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10926)

Abstract

This study focuses on image perception issues based on human visual assessment. Semantic understanding and task orientation are also considered in image assessment. The brightness and colorfulness attributes are selected as the image assessment tasks in the study. Linear Regression (LR) and Non-Linear Regression (NLR) analyses are used to establish the image assessment models, and their prediction abilities are compared. Ninety participants took part in the visual assessment experiment. Four images were selected from the ISO standard by a focus group. The results showed that “brightness” and “colorfulness” remained stable in the predictive models of both the LR and NLR methods. The results also demonstrated very high prediction ability for brightness and colorfulness in both the linear and non-linear models. The brightness attribute directly relates to the image’s lightness, and the colorfulness attribute directly relates to the image’s saturation. A simple semantic understanding and a single task orientation, as in assessing the brightness and colorfulness of images, are also very important in the image assessment experiment.

1 Introduction

According to Newell’s information processing system (IPS), three levels should be considered in information processing: physical implementation, algorithmic manipulation, and semantic understanding [1]. Maeder and Eckert [2] also identified three levels (mathematical, psychovisual, and task-oriented) as part of the cognition processing system (CPS), which is oriented toward human cognition. The first level (physical implementation in the IPS and mathematical in the CPS) primarily considers physical attributes, such as the fidelity of processed images relative to the original images. The second level, which includes algorithmic manipulation in the IPS and psychovisual in the CPS, can be considered the base of the human visual system. The most important level is the top level, which covers the cognition processes (semantic understanding in the IPS and task-oriented in the CPS). The top level is widely studied empirically through human factors research. Many studies have started to examine image information and visual assessment through the human visual system from the perspective of human factors engineering and perception psychology.

As shown in Table 1, three levels, namely image compression, human visual system, and image assessment/visual quality, are used to address image quality issues. According to Tsai’s research framework, perceptual image quality corresponds to the three levels of the IPS and CPS models based on a user-centered approach (see Fig. 1) [3].

Table 1. The three levels of image assessment.

The top level shows that image assessment items involve cognition issues. It is also directly related to semantic understanding and the task-oriented concept. Tsai et al. (2016) [3] reported that the important items for image assessment at the cognition level include brightness, colorfulness, naturalness, preference, and total image quality. Although these attribute items correspond to cognition issues, it was sometimes difficult to assess total image quality by visual assessment. The previous study identified two key points: the results for “perceptual image quality” and “perceptual color quality” were directly correlated, and “perceptual color quality” was easier to assess than “perceptual image quality.” In addition, the time required to assess perceptual image quality was significantly longer than the time needed to assess perceptual color quality, and the results showed that the concept of perceptual image quality is broader and vaguer in the cognitive processes of subjects than the concept of perceptual color quality. The two phrases, “color quality” and “image quality,” correspond to different semantic understandings on the part of the observer. Furthermore, the tasks for evaluating “perceptual color quality” were significantly clearer and more specific than the tasks for evaluating “overall image quality.” Table 2 shows the top eight attributes corresponding to image quality. These attributes (brightness, colorfulness, naturalness, preference, sharpness, contrast, fidelity, and total image quality) were selected and discussed by 33 master’s program students in the department of design. Brightness and colorfulness are the top two attributes, and they correspond directly to the image assessment concept.

Table 2. The attributes related to the image quality concept.

Many studies in the past decade have started to assess image information and visual quality through the human visual system based on human factors engineering and perception psychology [10, 12, 16, 17]. Because the model should be easy to apply to image industries and practical issues, most studies use the simple linear regression analysis method to construct the model. However, many assessment items used to evaluate an image are semantically complex, such as preference [18, 19], naturalness [20], and total image quality [6, 10]. Thus, this study asks whether the linear analysis method is sufficiently accurate and easy to apply when the assessment items are simple. Semantic understanding and task orientation are the most important issues in image assessment. The aim of this study is therefore to discuss clear semantic understanding and single task orientation in image assessment based on a cognition-level approach. Brightness and colorfulness are selected for discussion in this study because they represent a clear semantic understanding and a single task-oriented concept. In terms of the relation between psychological and physical attributes, brightness relates to the lightness of the image, and colorfulness relates to the chroma and hue angle.

2 Research Method

2.1 The Scale of Physical Attributes

A previous study examined the scale of physical attributes appropriate for an image quality experiment. A series of psychophysical experiments was conducted to examine the differences in each physical attribute by visual assessment. Based on the results of that study, the range of image adjustment in lightness was set from 60% to 140%, and the best value was 100%. The range of image adjustment in chroma was set from 80% to 155%, and the best value was 115%. The range of image adjustment in hue angle was set from −20° to 20°, and the best value was 0°. The range of image adjustment in contrast was set from −5 to +3, and the original value was zero [4].
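The adjustment ranges listed above can be written down as a small lookup table. The following is only a minimal sketch: the names `AdjustmentRange`, `RANGES`, and `out_of_range` are hypothetical and do not come from the original experiment software, and the unit label for contrast is an assumption because the text gives no unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdjustmentRange:
    """Range of one physical attribute as reported in the text."""
    low: float
    high: float
    reference: float  # "best" value; for contrast, the original value
    unit: str

    def contains(self, value: float) -> bool:
        # Inclusive bounds on both ends.
        return self.low <= value <= self.high


# Values copied from the paragraph above; the "level" unit is an assumption.
RANGES = {
    "lightness": AdjustmentRange(60.0, 140.0, 100.0, "%"),
    "chroma": AdjustmentRange(80.0, 155.0, 115.0, "%"),
    "hue_angle": AdjustmentRange(-20.0, 20.0, 0.0, "deg"),
    "contrast": AdjustmentRange(-5.0, 3.0, 0.0, "level"),
}


def out_of_range(setting):
    """Return the attribute names whose values fall outside the reported range."""
    return [name for name, value in setting.items()
            if not RANGES[name].contains(value)]
```

For example, `out_of_range({"lightness": 100, "chroma": 160})` flags only the chroma value, since 160% exceeds the reported upper bound of 155%.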

2.2 The Image Stimuli of Previous Study

Four images were used in the previous study, and each image was modulated in four physical attributes: lightness, chroma, hue angle, and contrast (see Fig. 2). All images were processed for color conversion and physical adjustment based on the CIECAM02 model using a Borland C++ program. For the visual assessment tasks, the modulated images were displayed on the screen by software written in Visual Basic 6.0 [3, 4].

2.3 Experiment Design and Participants

Each observer was seated facing a calibrated 30-inch Sharp LCD-TV with the luminance fixed at 120 cd/m² and the color temperature at about 6500 K. The laboratory lighting was fixed at an illuminance of 233 lx and a color temperature of about 6500 K. The screen resolution was 1360 × 768 pixels. Each trial showed one image at random, and the background was set to a mid-grey color with L* of about 60. The order in which the images were presented was randomized by the computer so that it changed on every trial. In total, the experiment took about 40 min to finish. The experimental environment was set up to match that of the previous study [4]. The scale of physical attributes and the image stimuli of the experiment followed the previous studies [3, 4]. Ninety observers participated in the visual assessment experiment. All the observers possessed normal color vision according to the Ishihara color vision test.

2.4 Multi-collinearity Testing

Multi-collinearity was tested with Pearson correlation coefficient analysis to check the correlation between brightness and colorfulness. Based on the data from the 90 observers, the results show a very low correlation coefficient between brightness and colorfulness (r = .154, p < .001). In other words, there is at most a weak linear relationship between these two attributes.
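To make the size of this correlation concrete, the shared variance implied by the reported coefficient can be worked out directly. This is a simple arithmetic illustration based on the reported r, not an additional result of the study:

$$ r^{2} = 0.154^{2} \approx 0.024 $$

That is, only about 2.4% of the variance in one attribute is linearly associated with the other.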

3 Data Analysis and Results

The stepwise regression method in linear analysis can achieve the greatest predictive power for the dependent variable with a minimum number of variables. It also uses the relative strength of the relationship between every explanatory variable and the dependent variable to determine which independent variables can be incorporated into the regression equation [21]. Thus, this study used the stepwise regression method to establish a regression prediction model with the SPSS 17 statistical software. The data used to establish the regression model at this stage were obtained by randomly selecting 81 participants (about 90%) from the original 90 participants. The remaining 9 participants (about 10%) were reserved for model verification [21]. The sample size conforms to the minimum sample size criterion for holdout data, which is 2K + 25. Since there are four independent variables, K equals 4, so the minimum is 33 participants [21]. The total number of observations from the experiment was 26,244 (4 images × 81 scale adjustment variations × 81 participants). Finally, 81 averaged samples were used for the linear and non-linear regression analyses.
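The split and the sample-size arithmetic described above can be sketched as follows. This is a minimal illustration, not the procedure run in SPSS: the function names, the participant numbering from 1 to 90, and the random seed are all hypothetical.

```python
import random


def split_participants(ids, n_holdout, seed=0):
    """Randomly split participant IDs into a modelling set and a holdout set."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    shuffled = list(ids)
    rng.shuffle(shuffled)
    holdout = sorted(shuffled[:n_holdout])
    modelling = sorted(shuffled[n_holdout:])
    return modelling, holdout


def minimum_sample_size(k):
    """Minimum sample size criterion quoted in the text: 2K + 25."""
    return 2 * k + 25


# Hypothetical participant IDs 1..90; 9 are held out for verification.
modelling, holdout = split_participants(range(1, 91), n_holdout=9)

N_IMAGES = 4
N_VARIATIONS = 81  # scale adjustment variations per image
total_observations = N_IMAGES * N_VARIATIONS * len(modelling)

print(len(modelling), len(holdout))   # sizes of the two groups
print(minimum_sample_size(4))         # criterion for K = 4
print(total_observations)             # 4 x 81 x 81
```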

3.1 Linear Regression Result for Brightness

The Lightness variable in Model I explains Brightness at a level of 94.0% (F_(1,79) = 598.340, p < .001), with an explanatory power of 88.3% in terms of adjusted R². The Chroma variable was added in Model II and individually contributes only 2.5% to the explanation of Brightness (F_(2,78) = 395.649, p < .001). The Contrast variable was added in Model III and individually contributes only 0.4% (F_(3,77) = 351.579, p < .001). The Hue Angle variable was added in Model IV. It individually contributes the smallest share, 0.4% (F_(4,76) = 281.019, p < .001). All four independent variables reached the significance level for the regression model (p < .001). Therefore, the four independent variables in Model IV together explain Brightness at a level of 96.8%, with an adjusted R² of 93.3% (see Table 3). The regression equation for Brightness is shown as Eq. (1).

Table 3. Explanatory power and cross-validation of two prediction models with linear regression by stepwise method.

3.2 Linear Regression Result for Colorfulness

The Chroma variable in Model I explains colorfulness at a level of 89.4% (F_(1,79) = 314.086, p < .001), with an explanatory power of 79.6% in terms of adjusted R². The Hue Angle variable was added in Model II and individually contributes only 1.4% to the explanation of colorfulness (F_(2,78) = 171.616, p < .001). Both independent variables reached the significance level for the regression model (p < .001). Therefore, the two independent variables in Model II together explain colorfulness at a level of 90.3%, with an adjusted R² of 81.0%. The regression equation for colorfulness is shown as Eq. (2).

$$ \text{Brightness} = 5.736 \times \text{Lightness} + 1.067 \times \text{Chroma} + 2.210 \times \text{Contrast} + 0.919 \times \text{HueAngle} - 0.706 $$

(1)

$$ \text{Colorfulness} = 4.060 \times \text{Chroma} + 1.175 \times \text{HueAngle} + 0.706 $$

(2)
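Eqs. (1) and (2) are simple enough to apply directly. The sketch below transcribes their coefficients into two functions. The function names are hypothetical, and because this section does not state how the physical attributes are scaled before they enter the model, the inputs are assumed to be on whatever scale was used when the models were fitted.

```python
def predict_brightness(lightness, chroma, contrast, hue_angle):
    """Linear brightness model, coefficients transcribed from Eq. (1)."""
    return (5.736 * lightness
            + 1.067 * chroma
            + 2.210 * contrast
            + 0.919 * hue_angle
            - 0.706)


def predict_colorfulness(chroma, hue_angle):
    """Linear colorfulness model, coefficients transcribed from Eq. (2)."""
    return 4.060 * chroma + 1.175 * hue_angle + 0.706


# With every input at zero, each model returns its intercept.
print(predict_brightness(0, 0, 0, 0))
print(predict_colorfulness(0, 0))
```

Lightness carries by far the largest coefficient in Eq. (1) and chroma in Eq. (2), which matches the order in which the stepwise procedure entered the variables.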

3.3 Cross-validation of the Linear Regression Prediction Model

Cross-validation was also carried out to examine the correlation between the participants’ actual evaluation values and the values predicted by the model. Analysis with Pearson’s correlation coefficient showed that the brightness prediction model had a 94.6% correlation with the participants’ evaluations (prediction ability), while the colorfulness prediction model had 82.9% prediction ability.
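The cross-validation step amounts to correlating held-out ratings with model predictions. The following is a minimal sketch of that computation. The `observed` and `predicted` lists are made-up placeholder numbers, not data from the experiment, and `pearson_r` is a hypothetical helper written here rather than the SPSS routine used in the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Placeholder values for illustration only (not experimental data).
observed = [2.0, 3.5, 5.0, 6.5, 8.0]   # mean ratings from holdout observers
predicted = [2.2, 3.1, 5.3, 6.4, 7.9]  # model outputs for the same stimuli

r = pearson_r(observed, predicted)
print(f"prediction ability (Pearson r): {r:.3f}")
```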

3.4 Results of Non-linear Regression Analysis

The stepwise regression method was also applied to the non-linear regression analysis. As the results show (see Table 4), the six independent variables in the brightness model explain Brightness at a level of 98.5% (F_(1,79) = 402.972, p < .001), with an adjusted R² of 96.8%. The seven independent variables in the colorfulness model explain colorfulness at a level of 97.6% (F_(1,79) = 207.606, p < .001), with an adjusted R² of about 94.8%. Both models thus show very high prediction ability, higher than that of the linear regression models.
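The relation between the reported explanatory percentages and the adjusted R² values can be checked with the standard adjustment formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below rests on two assumptions that the text does not state explicitly: that the larger percentage reported for each model is the multiple correlation coefficient R (so that R² is its square), and that n = 81 averaged samples enter every model. The names used are hypothetical.

```python
def adjusted_r2(r, n, k):
    """Adjusted R^2 from a multiple correlation coefficient r,
    sample size n, and number of predictors k."""
    r2 = r * r
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


N_SAMPLES = 81  # assumption: the 81 averaged samples

# (model, reported percentage read as R, predictors k, reported adjusted R^2)
REPORTED = [
    ("brightness, linear Model I", 0.940, 1, 0.883),
    ("brightness, linear Model IV", 0.968, 4, 0.933),
    ("colorfulness, linear Model I", 0.894, 1, 0.796),
    ("colorfulness, linear Model II", 0.903, 2, 0.810),
    ("brightness, non-linear", 0.985, 6, 0.968),
    ("colorfulness, non-linear", 0.976, 7, 0.948),
]

results = [(label, adjusted_r2(r, N_SAMPLES, k), reported)
           for label, r, k, reported in REPORTED]
max_gap = max(abs(computed - reported) for _, computed, reported in results)

for label, computed, reported in results:
    print(f"{label}: computed {computed:.3f}, reported {reported:.3f}")
```

Under these assumptions, the computed values agree with the reported adjusted R² values to within rounding, which suggests the two sets of figures are mutually consistent.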

Table 4. Explanatory power and cross-validation of two prediction models with non-linear regression by stepwise method.

4 Discussion

In image assessment tasks with diverse observers, it is difficult to collect stable data and results. Both semantic understanding in the IPS model and task orientation in the CPS model are considered high-level processing at the cognition level, which makes them an important issue for image assessment research. Clear semantics allow observers to understand the assessment task when they assess a complex image. For example, the brightness attribute directly relates to the image’s lightness, tone, highlight/shadow, grayscale levels, and so on. A single task orientation, such as assessing only the brightness of the images, is therefore very important in the image assessment experiment. Similarly, the colorfulness attribute directly relates to the image’s saturation, color balance, memory color for objects, skin color, color scale levels in shadow, chroma, hue angle, and realistic color. All of these items were considered by the observers when they assessed the images. Tsai et al. (2016) [3] point out that naturalness, preference, and total image quality also correspond to diverse factors involving cognition issues, but total image quality was more complex to assess by visual assessment. In contrast to brightness and colorfulness, “semantic understanding” should be more difficult for observers who are assessing total image quality.

The aim of this study is to focus on clear semantic understanding and single task orientation in image assessment. Thus, brightness and colorfulness were selected for discussion. This study also derives simple functions in the form of the brightness and colorfulness prediction models, which are easy to apply and to reproduce. The results demonstrated very high prediction ability for brightness and colorfulness with both the linear models (adjusted R² = .933 for the brightness prediction model, adjusted R² = .810 for the colorfulness prediction model) and the non-linear models (adjusted R² = .968 for the brightness prediction model, adjusted R² = .948 for the colorfulness prediction model). The main purpose is that the model should be easy to apply to image industries and practical issues. The results show that a clear semantic understanding and a single task orientation are important in the experiment, because they help observers go through a simple image assessment process. They can also help researchers obtain clear data and good results.

References

1. Newell, A.: Unified Theories of Cognition. Harvard University Press, Cambridge (1990)
2. Maeder, A.J., Eckert, M.: Medical image compression: quality and performance issues. In: Pham, B., Braun, M., Maeder, A.J., Eckert, M.P. (eds.) New Approaches in Medical Image Analysis, Proceedings of SPIE, vol. 3747, pp. 93–101 (1999)
3. Tsai, C.-M., Guan, S.-S., Tsai, W.-C.: Eye movements on assessing perceptual image quality. In: Zhou, J., Salvendy, G. (eds.) ITAP 2016. LNCS, vol. 9754, pp. 378–388. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-39943-0_37
4. Tsai, C.M., Guan, S.S., Juan, L.Y.G., Lai, Y.Y.: The scale on different physical attributes of images. In: 11th Congress of the International Colour Association on Proceedings, Sydney, Australia (2009)
5. Sheikh, H.R., Bovik, A.C.: Image information and visual quality. IEEE Trans. Image Process. 15(2), 430–444 (2006)
6. Ginesu, G., Massidda, F., Giusto, D.D.: A multi-factors approach for image quality assessment based on a human visual system model. Sig. Process. Image Commun. 21, 316–333 (2006)
7. Fedorovskaya, E.A., Ridder, H., Blommaert, F.J.: Chroma variants and perceived quality of color images of natural scenes. Color Res. Appl. 22(2), 96–110 (1996)
8. Kurita, T., Saito, A.: A characteristic of the temporal integrator in the eye-tracing integration model of the visual system on the perception of displayed moving images. In: IDW 2002 Conference VHF2-1 on Proceedings, pp. 1279–128 (2002)
9. Chalmers, A.N.: Colour difference and colour preference in video imaging. In: 8th Congress of the International Colour Association on Proceedings, Kyoto, Japan, pp. 634–637 (1997)
10. Maeder, A.J.: The image importance approach to human vision based image quality characterization. Pattern Recogn. Lett. 26, 347–354 (2004)
11. Watson, A.B., Malo, J.: Video quality measures based on the standard spatial observer. In: IEEE ICIP, pp. 41–44 (2002)
12. Janssen, T.J.W.M., Blommaert, F.J.J.: A computational approach to image quality. Displays 21, 129–142 (2000)
13. Nguyen, A., Chandran, V., Sridharan, S.: Gaze tracking for region of interest coding in JPEG 2000. Sig. Process. Image Commun. 21, 356–377 (2006)
14. Civanlar, M.R.: Content adaptive video coding and transport. In: Proceedings of the IEEE 12th Signal Processing and Communications Applications Conference, pp. 28–30 (2004)
15. Oda, K., Yuuki, A., Teragaki, T.: Evaluation of moving picture quality using the pursuit camera system. Euro Display 6(3), 115–118 (2002)
16. Egmont-Petersen, M., de Ridder, D., Handels, H.: Image processing with neural networks - a review. Pattern Recogn. Lett. 35, 2279–2301 (2002)
17. Sheedy, J.E., Smith, R., Hayes, J.: Visual effects of the luminance surrounding a computer display. Ergonomics 48(9), 1114–1128 (2005)
18. Choi, S.Y., Luo, M.R., Pointer, M.R., Rhodes, P.A.: Investigation of large display color image appearance I: important factors affecting perceived quality. J. Imaging Sci. Technol. 52(4), 040904-1–040904-11 (2008)
19. Choi, S.Y., Luo, M.R., Pointer, M.R., Rhodes, P.A.: Investigation of large display color image appearance II: the influence of surround conditions. J. Imaging Sci. Technol. 52(4), 040905-1–040905-9 (2008)
20. Choi, S.Y., Luo, M.R., Pointer, M.R., Rhodes, P.A.: Investigation of large display color image appearance III: modeling image naturalness. J. Imaging Sci. Technol. 53(3), 301104-1–301104-12 (2009)
21. Hair, J.F., Black, W.C., Babin, B.J., Anderson, R.E., Tatham, R.L.: Multivariate Data Analysis, 6th edn. Prentice Hall, Englewood Cliffs (2006)

Acknowledgements

The authors would like to thank TTLA (Taiwan TFT LCD Association) for supporting this research and providing insightful comments. The authors would also like to thank all the observers for their help in the experiments.

Author information

Authors and Affiliations

Department of Visual Arts and Design, Nanhua University, Chiayi, Taiwan, R.O.C.
Cheng-Min Tsai & Zhi-Hua Zhang
School of Design, Fujian University of Technology, Fuzhou, China
Shin-Shen Guan
Department and Graduate School of Product and Media Design, Fo Guang University, Yilan, Taiwan, R.O.C.
Wang-Chin Tsai

Corresponding author

Correspondence to Cheng-Min Tsai.

Editor information

Editors and Affiliations

Chongqing University, Chongqing, China
Jia Zhou
Purdue University, West Lafayette, Indiana, USA
Gavriel Salvendy

About this paper

Cite this paper

Tsai, CM., Guan, SS., Tsai, WC., Zhang, ZH. (2018). Semantic Understanding and Task-Oriented for Image Assessment. In: Zhou, J., Salvendy, G. (eds) Human Aspects of IT for the Aged Population. Acceptance, Communication and Participation. ITAP 2018. Lecture Notes in Computer Science(), vol 10926. Springer, Cham. https://doi.org/10.1007/978-3-319-92034-4_30

DOI: https://doi.org/10.1007/978-3-319-92034-4_30
Published: 01 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92033-7
Online ISBN: 978-3-319-92034-4
eBook Packages: Computer Science, Computer Science (R0)
