Abstract
Most news portals and social media networks use RSS feeds for information distribution and content sharing. Event identification improves the service quality of feed providers in content distribution and event browsing. However, mining feed streams is challenging because the structural information of the feeds must be represented and results must be produced in real time. In this paper, we focus on the record linkage problem, which classifies stream content into known categories. To realize fast and efficient record linkage over XML feed streams, we design two classification strategies: a classifier based on an ensemble of extreme learning machines (ELMs) and an incremental classifier based on the online sequential ELM (OS-ELM). Experimental results show that our solutions provide effective and efficient record linkage for event identification applications.
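To make the incremental strategy more concrete, the following is a minimal sketch of an OS-ELM style classifier in Python with NumPy. It is not the authors' implementation. The class name `OSELM`, its parameters, the small ridge term `reg` (added only to keep the initial matrix invertible), and the toy two-cluster data are all assumptions made for illustration. Feature extraction from XML feed items is not shown, and the input is assumed to be numeric feature vectors already.

```python
import numpy as np


class OSELM:
    """Hypothetical minimal ELM with an online sequential (OS-ELM style) update."""

    def __init__(self, n_inputs, n_hidden, n_classes, reg=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Hidden-layer weights and biases are drawn once at random and never trained.
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        self.n_classes = n_classes
        self.reg = reg      # assumed ridge term, not taken from the paper
        self.P = None       # inverse of the regularized hidden-layer Gram matrix
        self.beta = None    # output weights

    def _hidden(self, X):
        # Sigmoid activation of the random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def _one_hot(self, y):
        return np.eye(self.n_classes)[y]

    def fit_initial(self, X, y):
        # Initialization phase: solve the regularized least-squares problem once.
        H, T = self._hidden(X), self._one_hot(y)
        self.P = np.linalg.inv(H.T @ H + self.reg * np.eye(H.shape[1]))
        self.beta = self.P @ H.T @ T
        return self

    def partial_fit(self, X, y):
        # Sequential phase: recursive least-squares update for a new chunk.
        H, T = self._hidden(X), self._one_hot(y)
        K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
        self.P = self.P - self.P @ H.T @ K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


# Toy stream: two well-separated clusters standing in for two event categories.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(-3, 0.5, (100, 2)), rng.normal(3, 0.5, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
order = rng.permutation(200)
X, y = X[order], y[order]

model = OSELM(n_inputs=2, n_hidden=10, n_classes=2).fit_initial(X[:50], y[:50])
for start in range(50, 200, 50):
    # Each chunk updates the output weights without revisiting earlier data.
    model.partial_fit(X[start:start + 50], y[start:start + 50])

accuracy = float(np.mean(model.predict(X) == y))
```

Under these assumptions, the sequential update yields the same output weights as solving the regularized least-squares problem on all chunks at once, which is why a stream can be processed chunk by chunk. An ensemble variant could train several such models with different seeds and combine their predictions by voting.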
Acknowledgments
This research is partially supported by the National Natural Science Foundation of China under Grant Nos. 61272181, 61173029, and 61173030; the National Basic Research Program of China under Grant No. 2011CB302200-G; the 863 Program under Grant No. 2012AA011004; and the Fundamental Research Funds for the Central Universities under Grant No. N120404006.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Bi, X., Zhao, X., Ma, W., Zhang, Z., Zhan, H. (2016). Record Linkage for Event Identification in XML Feeds Stream Using ELM. In: Cao, J., Mao, K., Wu, J., Lendasse, A. (eds) Proceedings of ELM-2015 Volume 1. Proceedings in Adaptation, Learning and Optimization, vol 6. Springer, Cham. https://doi.org/10.1007/978-3-319-28397-5_36
DOI: https://doi.org/10.1007/978-3-319-28397-5_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28396-8
Online ISBN: 978-3-319-28397-5
eBook Packages: Engineering, Engineering (R0)