Building the Profile of Web Events Based on Website Measurement
The Web's real-time, open, and dynamic nature makes it possible to study emergencies from web information. After a web event emerges, numerous websites publish webpages covering it. Measuring the temporal features of a web event as it evolves can help people recognize in a timely manner which events are emergencies, so that the harm emergencies cause to society can be reduced. In this paper, website preference is formally defined and mined by three proposed strategies, all of which are explicitly or implicitly based on a three-level network: website level, webpage level, and keyword level. An iterative algorithm is first introduced to calculate the outbreak power of web events; its variables are the increase in an event's web pages, the increase in its attributes, the distribution of attributes across web pages, and the relationships among attributes. Using prior knowledge, the membership grade of a web event with respect to each event type is calculated, and the event's type is then determined. Experiments on a real data set demonstrate that the proposed algorithm is both efficient and effective and that it discriminates event types accurately.
Keywords: Website preference, Web mining, Web events
This work was supported in part by the National Science and Technology Major Project under Grant 2013ZX01033002-003, in part by the National High Technology Research and Development Program of China (863 Program) under Grant 2013AA014603, in part by the National Science Foundation of China under Grant 61300202, in part by the China Postdoctoral Science Foundation under Grant 2014M560085, and in part by the Science Foundation of Shanghai under Grant 13ZR1452900.