1 Introduction

With the advancement of technology, electronic devices such as mobile phones, laptops and iPads have become an essential part of our daily life. As a result, the design of interfaces has become a vital issue for researchers in the Human-Computer Interaction (HCI) community. Good interface design requires knowledge about people: how they see, understand and think [5]. Another important aspect of good design is how information is visually presented to the users, that is, the aesthetics of the interface. According to the Oxford dictionary [17], aesthetics is “concerned with beauty and art and the understanding of beautiful things”. It is argued that aesthetically pleasing interfaces increase user efficiency and decrease perceived interface complexity, which in turn helps to increase usability, productivity and acceptability [15]. It has also been observed [2] that a single redesigned graphical window can save a company $20,000 during the first year of use. As a general rule of thumb, every dollar invested in usability returns $10 to $100 [16]. The design of aesthetically pleasing interfaces is therefore a current requirement of interface design.

Thus, in general, we can say that interface aesthetics has an important role to play in determining usability. This is particularly true in webpage design. A webpage is composed of three basic types of elements, or any combination of them: text, image and short video/animation. Other webpage elements such as icons, tables and links can also be approximated by text and image: an icon by an image, and a table or a link by text. The shapes of all webpage elements can be approximated by rectangles. For example, a circular image can be treated as the content of a rectangle (wireframe).

Since aesthetics is important in webpage design, it is necessary to measure it. The aesthetic measurement of a webpage is considered to be subjective. Hence, the measurement is primarily done through empirical means. A parallel research effort has attempted to develop computational models to evaluate the aesthetics of the whole interface [12]. The advantage of a computational model is its ability to evaluate interface aesthetics automatically, which in turn makes it possible to automate the design process itself, as demonstrated in [14].

Although a number of works [1, 4, 6, 7, 8, 9, 10, 11] report on the contents of a webpage, Ngo et al. [12] found that the impact of the contents on aesthetics is not important. Rather, the positional geometry of the webpage elements (wireframe) is strongly related to aesthetics. They proposed a computational model for webpage aesthetics based on 13 positional geometry features of webpage elements. The average of these 13 feature values (termed order) was used to judge the aesthetics of a webpage. However, Lai et al. [7] claimed that the symmetry feature had no relation to aesthetics. Moreover, Ngo et al. [12] themselves noted that their model may not be appropriate for real webpages. We therefore felt that there is a need to review the geometry-related features and to develop a computational model for webpage aesthetics.

In this work, we re-examined the 13 best-known features of webpage aesthetics, as reported in [12]. We created wireframes of 52 webpages and had them rated by 100 participants. An ANOVA study of the users’ ratings revealed that only 9 of the 13 features are statistically significant for aesthetics measurement. Based on those 9 features, we developed a computational model to predict webpage aesthetics. Our model is based on the linear kernel of the Support Vector Machine (SVM) [3]. To judge the efficiency of our model, we considered the wireframes of 10 real webpages and had them rated by 80 users. Experimental results show that our model can predict aesthetics with 90% accuracy. The details of the 13 positional geometry features, the empirical data collection, the proposed model and the analysis are described in this paper.

2 Empirical Study for Feature Identification

In order to find the impact of the 13 independent features associated with aesthetics [12], we performed an empirical study. We created 4 webpages for each feature by systematically varying the feature values. These webpages were rated by 100 participants. One-way ANOVA was used to find the impact of each feature on webpage aesthetics.

For each feature, we created 4 webpages by varying the feature value systematically across 4 levels: significantly low (SL), low (LO), average (AV) and high (HI). The range of feature values for each level is shown in Table 1. All these features are independent of each other, as reported in [12]. While varying each feature (across the 4 feature classes), we did not observe any significant variation in the other features that might affect aesthetics.

Table 1. The four feature classes with their ranges

Altogether, we designed 52 webpage wireframe models using Adobe Photoshop CS6™. The size of each model was 700 × 700 pixels. Figure 1 shows a set of 4 such models, in which the unity feature varies. All 52 models were shown to the participants on PCs with a 2.6 GHz AMD Phenom II X3 710 processor running Windows 8. Each PC had a 23 in. wide-viewing-angle colour display.

Fig. 1. Wireframe models of webpages designed by varying the unity feature

One hundred participants took part in our study. Of these, 20 were school students (average age 16 years), 60 were undergraduate and postgraduate students (average age 22 years), and the remaining 20 were teachers (average age 36 years). All of them had normal or corrected-to-normal vision and none of them was colour blind (self-reported). All of them were regular computer users. However, none was familiar with screen design concepts.

We created the models of the 52 webpages without considering the contents of the webpage elements. All 52 models were rated by the 100 users on a five-point rating scale (1–5), where five denoted an aesthetically pleasing webpage and one denoted the aesthetically least pleasing webpage. A browser-based viewer was created for the users, with facilities to view the previous/next sample and to rate a webpage model. After viewing each webpage model, participants rated it according to its aesthetic appeal. Each participant rated the 52 models assigned to him/her in two sessions (26 models each) on a single day. They were allowed to take breaks in each session. These measures were taken to avoid any discomfort to the participants that might have arisen from the large number of webpages to be rated. To avoid a learning effect, we randomly varied the sequence of the webpage models shown to the users. Before data collection, we conducted short training sessions for the participants. In these sessions, participants were familiarized with the 5-point scale and the web interface through which they had to rate the webpage models.

Results and Analysis

We computed the feature values of the 52 webpages using the analytical expressions proposed by Ngo et al. [12].

Based on the empirical results, we performed 13 independent one-way ANOVAs (one for each feature) using the ANOVA1 command of MATLAB 2014. We observed that the p values are above the significance level (p > 0.05) for four features: density (0.11), economy (0.108), rhythm (0.174) and simplicity (0.0536). This implies that variations in these feature values did not have a statistically significant impact on webpage aesthetics. In contrast, the remaining 9 features (balance, cohesion, equilibrium, homogeneity, proportion, regularity, sequence, symmetry and unity) were found to be statistically significant for webpage aesthetics. We therefore consider these 9 geometry features for webpage aesthetics, and propose a computational model based on them, as discussed below.
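The F statistic behind each one-way ANOVA can be sketched as follows. This is a minimal illustration in Python, not the MATLAB routine used in the study; the function name `one_way_anova_f` and the sample ratings are hypothetical. In the study, each feature would contribute 4 groups (SL, LO, AV, HI) of 100 ratings each, giving 3 and 396 degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists of ratings, one inner list per level.
    Assumes at least two groups and non-zero within-group variance.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual ratings around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ratings for three levels of one feature (illustration only).
example_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value, df_b, df_w = one_way_anova_f(example_groups)
print(f_value, df_b, df_w)
```

The p value is obtained by comparing F against the F distribution with the two degrees of freedom; a library routine such as `scipy.stats.f_oneway` returns both values directly.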

3 Proposed Computational Model

Our model is based on the linear kernel of the Support Vector Machine (SVM) [3]. SVMs are widely used for solving binary classification problems. In the following section, we discuss the training procedure of our model.

3.1 Model Training

For model training, we considered a subset of the data (9 of the 13 features). Note that we have 5200 (100 × 4 × 13) data points, which are the ratings of 100 users on the 52 webpages, with 4 webpages for each of the 13 features. Of these 13 features, we considered 9 for the training of our model. For these 9 features we have 36 webpages (4 for each feature) and the corresponding ratings of 100 users, giving 3600 (9 × 4 × 100) training data points altogether. Our model uses 9 SVMs to predict the 9 features independently. Each of these 9 SVMs was trained on 400 data points (100 × 4), which are the ratings of 100 users for the 4 different webpages of a particular feature. As an SVM works on labeled data, we converted all 3600 unlabeled data points to labeled training data using the following logic.

figure a

A particular feature of a webpage was labeled as good (+1) only when a user gave a rating of more than 2 on the 5-point scale and the feature value (computed by the analytical expression of Ngo et al. [12]) was greater than or equal to 0.5 on a 0 to 1 scale. This was done so that roughly half of each scale corresponds to an aesthetically pleasing (good) feature and the rest to an aesthetically unpleasing feature. Based on these labeled data, we trained our model using the SVMTRAIN function of MATLAB 2014.
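The labeling rule described above can be written as a short function. This is a sketch in Python rather than the original MATLAB code; the function name `label_feature` is hypothetical, and the use of −1 for the bad class is an assumption (the text states only the +1 label for training data).

```python
def label_feature(rating: int, feature_value: float) -> int:
    """Label one (user rating, feature value) pair.

    Returns +1 (good) only when the rating is above 2 on the 1-5 scale
    AND the feature value is at least 0.5 on the 0-1 scale.
    Returns -1 (bad) otherwise; the -1 encoding is an assumption.
    """
    if not 1 <= rating <= 5:
        raise ValueError("rating must be on the 1-5 scale")
    if not 0.0 <= feature_value <= 1.0:
        raise ValueError("feature value must lie in [0, 1]")
    return 1 if rating > 2 and feature_value >= 0.5 else -1


# Both conditions must hold for a 'good' label.
print(label_feature(4, 0.62))  # high rating, high feature value
print(label_feature(4, 0.31))  # high rating, low feature value
print(label_feature(2, 0.80))  # low rating, high feature value
```

Note that a rating of exactly 2 or a feature value just below 0.5 falls into the bad class under this rule.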

3.2 Empirical Study for Model Validation

The main objective of our model is to predict aesthetics for real webpages. For this purpose, we performed another empirical study on 10 real webpages. The details of this empirical study and the performance of our model are discussed below.

For model validation, we considered wireframes of 10 real webpages [18,19,20,21,22,23,24,25,26,27] (the home pages of 10 websites). These webpages represent some popular domains, such as education, banking, e-commerce, social networking, entertainment, news and the corporate sector. For all these webpages, we created the webpage models (constructed without considering the content of the webpage elements) using Adobe Photoshop CS6™. Figure 2(b) shows the model of the SoundCloud page (Fig. 2(a)). These models were rated by the users on PCs with a 2.6 GHz AMD Phenom II X3 710 processor running Windows 8. Each PC had a 23 in. wide-viewing-angle colour display.

Fig. 2. Webpages and their model

All 10 webpage models were rated by 80 new users, of whom 40 were male and the rest were female. Fifty participants were undergraduate students (average age 21 years), 20 were postgraduate students (average age 27 years) and the remaining 10 were faculty members (average age 40 years). All the participants were regular computer users, but none of them had any knowledge of website design principles. All the users had normal or corrected-to-normal vision and none of them was colour blind (self-reported).

Each participant rated the 10 webpage models using the same browser-based viewer as in our previous study. After viewing each webpage model, they rated it according to its aesthetic appeal on the same 5-point rating scale as in our previous study, where 5 denoted an aesthetically pleasing webpage and 1 denoted the aesthetically least pleasing webpage. To avoid a learning effect, we randomly varied the sequence of the webpage models shown to the users. Each participant rated the 10 webpages assigned to her/him in one session on a single day. Before data collection, we conducted short training sessions for the participants. In these sessions, participants were familiarized with the 5-point scale and the web interface through which they had to rate the webpages.

3.3 Results and Discussion

The 13 feature values of the 10 webpage models, and their average, were computed using the formulas of Ngo et al. [12]. Although we selected only 9 features, computing all 13 feature values helps us to demonstrate the efficiency of our model. Table 2 shows the order (the average value of all 13 features) of the wireframes of the 10 webpages. Using these wireframes, we performed another empirical study, the results of which are also shown in Table 2. The mode column of Table 2 denotes the rating given by most of the users for a particular webpage. Based on the mode value, we classified the webpages into two classes: good or bad. For the binary classification we used the following logic.

figure b
Table 2. Empirical study result vs. predicted result
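The mode-based classification of user ratings can be sketched as below. The exact threshold on the mode is not stated in the text, so it is left as an explicit parameter (`good_above`, a hypothetical name); the value 2 used in the example is only an assumption that mirrors the "rating more than 2" rule used for the training labels. The tie-breaking rule (lowest rating wins) is likewise an assumption.

```python
from collections import Counter


def rating_mode(ratings):
    """Return the most frequent rating.

    Ties are broken in favour of the lowest rating (an assumption;
    the text does not say how ties were handled).
    """
    counts = Counter(ratings)
    top = max(counts.values())
    return min(r for r, c in counts.items() if c == top)


def classify_by_mode(ratings, good_above):
    """Return +1 (good) if the mode exceeds `good_above`, else -1 (bad)."""
    return 1 if rating_mode(ratings) > good_above else -1


# Hypothetical ratings from a handful of users (illustration only).
sample_ratings = [4, 4, 3, 2, 4, 5]
print(rating_mode(sample_ratings))
print(classify_by_mode(sample_ratings, good_above=2))
```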

We used our independently trained SVMs to predict the feature class (good or bad) of each of the 9 features for the 10 webpages, as reported in Table 2. Feature class prediction was done using the SVMPREDICT function of MATLAB 2014. Finally, to predict the aesthetics of the whole webpage, we used the following algorithm.

figure c

According to the prediction algorithm, if most of the predicted features (at least 5 out of 9) are aesthetically pleasing (good), the webpage is predicted as good (aesthetically pleasing); otherwise it is treated as bad (aesthetically unpleasing).
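The majority rule over the 9 predicted feature classes can be sketched as follows. This is an illustrative Python version, not the original implementation; the function name `predict_webpage` and the +1/−1 encoding of the per-feature predictions are assumptions.

```python
def predict_webpage(feature_classes, min_good=5):
    """Combine 9 per-feature predictions into one webpage-level prediction.

    `feature_classes` holds +1 (good) or -1 (bad) for each of the 9 features.
    The webpage is good (+1) when at least `min_good` features are good.
    """
    if len(feature_classes) != 9:
        raise ValueError("expected predictions for exactly 9 features")
    if any(c not in (1, -1) for c in feature_classes):
        raise ValueError("each feature class must be +1 or -1")
    good_count = sum(1 for c in feature_classes if c == 1)
    return 1 if good_count >= min_good else -1


# Five good features out of nine is enough for a 'good' webpage.
print(predict_webpage([1, 1, 1, 1, 1, -1, -1, -1, -1]))
# Four good features out of nine is not.
print(predict_webpage([1, 1, 1, 1, -1, -1, -1, -1, -1]))
```

Because the number of features is odd, the vote can never end in a tie.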

We compared the predictions of our model with the results obtained from the empirical study. Using our model, we predicted the feature types of the 10 webpages, as shown in Table 2. Then, based on our webpage prediction algorithm, we predicted each webpage as aesthetically pleasing (+1) or not (−1). The second column from the right in Table 2 shows the webpage type (aesthetically pleasing or not) according to the users’ ratings.

Out of the 10 webpages, our model correctly predicted 9. Thus, our model predicted webpage aesthetics with an accuracy of 90% (9 out of 10).

Ngo et al. [12] claimed that the order value may be a measure for aesthetics computation. However, in our study we observed that the order may not be relevant for real webpages. The order values of the 10 real webpages lay in the range from 0.43 to 0.52, as shown in Table 2, which is sorted by order value. Note that Facebook, with an order value of 0.45, was treated by the users as an aesthetically pleasing webpage. In contrast, Flipkart, with a higher order value (0.47) than Facebook, was treated as aesthetically unpleasing. Conversely, TCS and CIT Kokrajhar, with lower order values (0.4532 and 0.4581) than Flipkart, were marked as aesthetically pleasing. We can even observe that the difference in order value between NewsLive and Facebook is only 0.1%, yet users found Facebook aesthetically pleasing and NewsLive aesthetically unpleasing. Similarly, the difference in order value between CIT Kokrajhar and Flipkart is only 0.7%, yet CIT Kokrajhar was treated as aesthetically pleasing while Flipkart was treated as aesthetically unpleasing.

Based on the above observations, we can claim that order is not a suitable metric for aesthetics computation of real webpages. In contrast, our model is based on SVM, which is capable of solving binary classification problems with high accuracy.

In general, an SVM maps the input data into a higher-dimensional feature space and constructs an optimal separating hyperplane in that space (with a linear kernel, the hyperplane is constructed directly in the input space). As a result, two classes (good or bad) are created on either side of the separating hyperplane. For a particular input, the SVM then predicts the corresponding class. We used this notion to predict the feature types of the 10 webpages. However, selecting the SVM kernel is a tricky task. The general convention is to try the linear kernel first, as it is simpler and faster than other kernels such as the polynomial, RBF and sigmoid kernels. If the prediction result is satisfactory, the linear kernel is the preferred option; otherwise, other kernels have to be tried. Following this convention, we used the linear kernel of SVM in our model. We observed an accuracy of 90% in aesthetics prediction, which we consider good enough, so we did not consider the other kernels. However, the performance of the other kernels in this context may be explored. In our model, we trained each SVM on 400 data points; a larger training set may improve the performance of our model.
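To make the idea of a linear separating hyperplane concrete, the sketch below trains a linear SVM by full-batch subgradient descent on the regularised hinge loss. It is an illustration only and is not the MATLAB routine used in our model; the function names, the hyperparameters (`lam`, `lr`, `epochs`) and the toy data are all hypothetical, and the form of the input vectors fed to each of the 9 SVMs is not specified in the text.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Fit w and b so that sign(w . x + b) separates the two classes.

    Minimises (lam / 2) * |w|^2 + mean(hinge loss) by subgradient descent.
    `labels` must contain +1 or -1.
    """
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # point lies inside the margin or is misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy, linearly separable data (illustration only, not study data).
toy_points = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -3.0)]
toy_labels = [1, 1, -1, -1]
w, b = train_linear_svm(toy_points, toy_labels)
print(predict(w, b, (2.5, 2.5)), predict(w, b, (-2.5, -2.5)))
```

A linear kernel keeps both training and prediction cheap, since prediction reduces to a single dot product per input.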

Our model predicts webpage aesthetics in terms of two classes: aesthetically pleasing (good) or unpleasing (bad). This is the simplest type of classification and has less chance of misclassification than multiclass classification. Moreover, multiclass classifiers are more complex and take more time for classification than binary classifiers. So, for devices where time and computational power are the deciding factors, such as mobile phones and PDAs, a binary classifier is the preferable choice. In our study we collected the users’ ratings on a 5-point rating scale and predicted webpage aesthetics in two classes (good or bad). However, the performance of a model that predicts aesthetics on the same scale (1–5) may be explored in further analysis.

Our computational model can help designers to improve a design. If our model predicts a particular webpage as aesthetically unpleasing, then there is likely to be some problem with the design, which can reduce its aesthetics and, consequently, its usability. Hence, the designer should take corrective measures to improve the design. A designer can correct his/her design by considering the predicted feature_class values produced by our model. We can integrate our model with webpage design guidelines, or apply a genetic algorithm based approach as reported in [14], to redesign the webpage geometry. This redesign can improve aesthetic appeal, which in turn can increase usability.

In this work we have only considered the size and geometrical positions of the webpage elements for aesthetics. We have not considered the content of the webpage elements. However, some studies report that the contents of webpage elements have an impact on aesthetics. Our model may be combined with models for short animations [14], text [10] and images [9] to develop a complete computational model for evaluating the aesthetic quality of a webpage. In our work, we considered 4 variations of each feature. Making more variations in each feature value may help us to develop a better model.

4 Conclusion

In this work, we reassessed the best-known features for webpage aesthetics. We performed an empirical study and found that 9 features are important for aesthetics measurement. Using these 9 features, we developed a computational model for aesthetics prediction. Our model is based on the linear kernel of the Support Vector Machine. To judge the efficiency of our model, we performed another empirical study on real webpages, and found that our model can predict webpage aesthetics with a high accuracy of 90%. In future, we plan to refine and extend our model by collecting more empirical data for model training and testing. We also plan to combine our model with other predictive models for text, image and short animation aesthetics and to develop a fully automated design environment. More robust models based on other SVM kernels may also be investigated. In our model we predicted aesthetics as good or bad; predicting aesthetics on the 5-point rating scale may also be an interesting topic. For each feature we considered only 4 variations. More variations in the feature values may help to build a more robust model.