Record Linkage for Event Identification in XML Feeds Stream Using ELM

  • Conference paper
Proceedings of ELM-2015 Volume 1

Part of the book series: Proceedings in Adaptation, Learning and Optimization (PALO, volume 6)

Abstract

Most news portals and social media networks use RSS feeds for information distribution and content sharing. Event identification improves the service quality of feed providers with respect to content distribution and event browsing. However, mining feed streams is challenging because structural information must be represented and results must be delivered in real time. In this paper, we focus on the record linkage problem, which classifies stream content into known categories. To realize fast and efficient record linkage over XML feed streams, we design two classification strategies: a classifier based on ensemble ELMs and an incremental classifier based on OS-ELM. Experimental results show that our solutions provide effective and efficient record linkage for event identification applications.
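The ensemble strategy named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a textbook single-hidden-layer ELM (random, untrained hidden weights and output weights solved by pseudo-inverse) combined by plain majority voting. The class `ELM`, the helper `ensemble_predict`, the hidden-layer size, and the toy two-cluster data standing in for feature vectors extracted from feed items are all hypothetical.

```python
import numpy as np


class ELM:
    """Basic ELM classifier: fixed random hidden layer, least-squares output layer."""

    def __init__(self, n_hidden, seed):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y, n_classes):
        # Hidden weights and biases are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        T = np.eye(n_classes)[y]            # one-hot targets
        H = self._hidden(X)
        # Output weights from the Moore-Penrose pseudo-inverse of H.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)


def ensemble_predict(models, X, n_classes):
    """Majority vote over the labels predicted by each ELM (assumed voting rule)."""
    votes = np.zeros((X.shape[0], n_classes), dtype=int)
    rows = np.arange(X.shape[0])
    for model in models:
        votes[rows, model.predict(X)] += 1
    return np.argmax(votes, axis=1)


# Hypothetical toy data: two well-separated clusters, one per known category.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(50, 2)),
               rng.normal(2.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Five ELMs that differ only in their random hidden layers.
models = [ELM(n_hidden=20, seed=s).fit(X, y, n_classes=2) for s in range(5)]
pred = ensemble_predict(models, X, n_classes=2)
accuracy = float(np.mean(pred == y))
```

Because each ELM is trained by a single linear solve, building several of them stays cheap, which is one plausible reason to pair ensembles with ELM in a streaming setting. The incremental OS-ELM variant is not shown here.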

Notes

  1. http://www.ibm.com/developerworks.

  2. http://abcnews.go.com.

Acknowledgments

This research is partially supported by the National Natural Science Foundation of China under Grant Nos. 61272181, 61173029, and 61173030; the National Basic Research Program of China under Grant No. 2011CB302200-G; the 863 Program under Grant No. 2012AA011004; and the Fundamental Research Funds for the Central Universities under Grant No. N120404006.

Author information

Corresponding author

Correspondence to Xin Bi.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Bi, X., Zhao, X., Ma, W., Zhang, Z., Zhan, H. (2016). Record Linkage for Event Identification in XML Feeds Stream Using ELM. In: Cao, J., Mao, K., Wu, J., Lendasse, A. (eds) Proceedings of ELM-2015 Volume 1. Proceedings in Adaptation, Learning and Optimization, vol 6. Springer, Cham. https://doi.org/10.1007/978-3-319-28397-5_36

  • DOI: https://doi.org/10.1007/978-3-319-28397-5_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28396-8

  • Online ISBN: 978-3-319-28397-5

  • eBook Packages: Engineering, Engineering (R0)
