1 WISE 2016

The 17th International Conference on Web Information Systems Engineering (WISE 2016) was held in Shanghai, China, on November 7–10, 2016, with the support of Fudan University (China) and Victoria University (Australia). Building on the success of its predecessors, WISE 2016 continued to be a major international forum for researchers, professionals, and industrial practitioners to share their knowledge in the rapidly growing area of Web technologies, methodologies, and applications. The first WISE event took place in Hong Kong, China (2000). Subsequent editions were held in Kyoto, Japan (2001); Singapore (2002); Rome, Italy (2003); Brisbane, Australia (2004); New York, USA (2005); Wuhan, China (2006); Nancy, France (2007); Auckland, New Zealand (2008); Poznan, Poland (2009); Hong Kong, China (2010); Sydney, Australia (2011); Paphos, Cyprus (2012); Nanjing, China (2013); Thessaloniki, Greece (2014); and Miami, USA (2015).

A total of 233 research papers were submitted to the conference for consideration, and each paper was reviewed by at least three reviewers. In the end, 39 submissions were selected as full papers (an acceptance rate of approximately 16.7%), plus 31 as short papers. The research papers cover the areas of social network data analysis, recommender systems, topic modeling, data diversity, data similarity, context-aware recommendation, prediction, big data processing, cloud computing, event detection, data mining, sentiment analysis, ranking in social networks, microblog data analysis, query processing, spatial and temporal data, graph theory, and non-traditional environments. In addition to regular and short papers, the WISE 2016 program also featured a special session on Data Quality and Trust in Big Data (QUAT-16) and a Medical Big Data forum. Several of the world's leading experts in the field joined WISE 2016 as distinguished keynote speakers and invited speakers.

2 The special issues

Five top-ranked papers out of the 39 full papers at WISE 2016 have been selected for the special issue of World Wide Web Journal (WWWJ). The selected papers have been extended with at least 30% new and unpublished material, including more technical and implementation details, improved algorithms, and additional experimental results. These papers underwent a rigorous additional refereeing and revision process.

The paper by Deepak et al. addresses the problem of diversified query expansion. The authors consider the use of semantic resources and tools to arrive at improved methods for diversified query expansion. In particular, they develop two methods, one leveraging Wikipedia and the other pre-learnt distributional word embeddings. Both approaches operate on a common three-phase framework: first taking a set of informative terms from the search results of the initial query, then building a graph, and finally using a diversity-conscious node ranking to prioritize candidate terms for diversified query expansion.
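The three-phase framework described above can be illustrated with a minimal sketch. It is not the method of Deepak et al.: term frequency stands in for informativeness, snippet co-occurrence stands in for the Wikipedia or embedding-based graph, and a simple greedy penalty stands in for the diversity-conscious node ranking. All function names, the stopword list, and the `penalty` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "of", "and", "in", "to", "is", "for", "on"}


def candidate_terms(query, results, top_n=10):
    """Phase 1: pick frequent non-query terms from the result snippets."""
    query_terms = set(query.lower().split())
    counts = Counter(
        tok
        for doc in results
        for tok in doc.lower().split()
        if tok not in STOPWORDS and tok not in query_terms
    )
    return dict(counts.most_common(top_n))


def build_graph(terms, results):
    """Phase 2: weight an edge by the number of snippets containing both terms."""
    edges = Counter()
    vocabulary = set(terms)
    for doc in results:
        present = sorted(set(doc.lower().split()) & vocabulary)
        for u, v in combinations(present, 2):
            edges[(u, v)] += 1
    return edges


def diversified_rank(terms, edges, k=3, penalty=0.5):
    """Phase 3: greedy selection that penalises terms tied to chosen ones."""
    chosen = []
    remaining = dict(terms)
    while remaining and len(chosen) < k:
        def score(term):
            overlap = sum(
                edges.get(tuple(sorted((term, c))), 0) for c in chosen
            )
            return remaining[term] - penalty * overlap

        # Iterate in sorted order so ties are broken alphabetically.
        best = max(sorted(remaining), key=score)
        chosen.append(best)
        del remaining[best]
    return chosen


results = [
    "jaguar car engine",
    "jaguar car engine",
    "jaguar car engine",
    "jaguar cat jungle",
    "jaguar cat jungle",
]
terms = candidate_terms("jaguar", results)
graph = build_graph(terms, results)
expansion = diversified_rank(terms, graph, k=2)
```

In this toy input, "car" and "engine" are the two most frequent terms but always co-occur, so the penalty pushes the second pick toward "cat", which covers a different sense of the query.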

The paper by He et al. defines the task of Event Phase Oriented News Summarization (EPONS). In this approach, the authors assume that a summary contains multiple timelines, each corresponding to an event phase. They model the semantic relations of news articles via a graph model called the Temporal Content Coherence Graph. A structural clustering algorithm, EPCluster, is designed to separate news articles into several groups corresponding to event phases. The authors apply a vertex-reinforced random walk to rank news articles, and the ranking results are then used to create timelines. Extensive experiments conducted on multiple datasets show the effectiveness of this approach.

The paper by Zhao et al. studies over 10 million stock-relevant tweets and 3 million investors from Weibo. It reveals that inexperienced investors with high emotional volatility are more sensitive to market fluctuations than experienced or institutional ones, and their predominance also indicates that the Chinese market might be more emotional than its western counterparts. Both correlation analysis and causality tests then demonstrate that five attributes of the stock market in China can be competently predicted by various online emotions, such as disgust, joy, sadness, and fear. Specifically, the presented prediction model significantly outperforms the baseline models, including one that takes purely financial time series as input features, in predicting five attributes of the stock market under the K-means discretization. The authors also employ this prediction model in a realistic online application scenario, where its performance is further verified.

The paper by Liu et al. presents a novel probabilistic generative model (DTSA) to extract topics and their associated sentiments from news streams and, simultaneously, to analyse their evolution over time. In DTSA, three different timescale models are studied to account for the historical dependencies of sentiment-topic word distributions at the current epoch: continuous, skip, and multiple timescale models. Additionally, the authors consider the links among news comments to avoid errors caused by user interactions. To mine more interpretable topics, a Conditional Random Fields (CRF) model is adopted to label a set of meaningful phrases for augmenting the bag-of-words features. Finally, the authors derive distributed online inference procedures to update the model with newly arrived data, and they show the effectiveness of the proposed model on real-world data sets.

The paper by Yu et al. proposes a novel matrix factorization recommendation algorithm based on integrating social network information, such as trust relationships, rating information of users, and users' own knowledge. Specifically, the authors use a user's status in a social network to indicate the user's knowledge in a field. The user's status is inferred from the distributions of users' ratings and followers across fields or from the structure of the domain-specific social network. Then, the authors model the final rating in decision-making as a linear combination of the user's own preferences, social influence, and the user's own knowledge. Experimental results on real-world data sets show that the proposed approach generally outperforms state-of-the-art recommendation algorithms that do not consider the differences in knowledge level between users.
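As a rough illustration of the linear combination mentioned above, the sketch below blends three scores into one predicted rating. It is not the formulation of Yu et al.: the latent factors `P` and `Q`, the trust-weighted averaging, the precomputed `knowledge_score` table, and the weights are all hypothetical choices made only to show the shape of such a model.

```python
def dot(a, b):
    """Inner product of two equal-length factor vectors."""
    return sum(x * y for x, y in zip(a, b))


def own_preference(user, item, P, Q):
    """Matrix factorization estimate from user and item latent factors."""
    return dot(P[user], Q[item])


def social_influence(user, item, P, Q, trust):
    """Trust-weighted average of trusted users' estimates (an assumption)."""
    friends = trust.get(user, {})
    total = sum(friends.values())
    if total == 0:
        # No trusted users: fall back to the user's own estimate.
        return own_preference(user, item, P, Q)
    weighted = sum(
        w * own_preference(f, item, P, Q) for f, w in friends.items()
    )
    return weighted / total


def predict_rating(user, item, P, Q, trust, knowledge_score,
                   weights=(0.5, 0.3, 0.2)):
    """Linear combination of preference, social, and knowledge terms."""
    a, b, c = weights
    return (
        a * own_preference(user, item, P, Q)
        + b * social_influence(user, item, P, Q, trust)
        + c * knowledge_score[user][item]  # hypothetical precomputed term
    )


# Toy data: two users, one item, two latent dimensions.
P = {"u": [1.0, 0.0], "v": [0.0, 1.0]}
Q = {"i": [4.0, 2.0]}
trust = {"u": {"v": 1.0}}
knowledge_score = {"u": {"i": 3.0}}
rating = predict_rating("u", "i", P, Q, trust, knowledge_score)
```

With these toy values the three components are 4.0, 2.0, and 3.0, so the weights (0.5, 0.3, 0.2) yield a blended rating of about 3.2.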