Abstract
Network data extraction for public information security applies network data collection technology to the real-time monitoring of online content, which demands both high speed and high accuracy of collection. Online forums, blogs, and news webpages are the main venues where data relevant to public information security appears. This paper designs a data collector and proposes a data extraction method for forums, blogs, and news webpages.
This project is supported by the Science and Technological Program for Dongguan's Higher Education, Science and Research, and Health Care Institutions.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, Z., Guo, J. (2012). A Network Data Extraction Method Based on Public Information Security. In: Lei, J., Wang, F.L., Deng, H., Miao, D. (eds) Emerging Research in Artificial Intelligence and Computational Intelligence. AICI 2012. Communications in Computer and Information Science, vol 315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34240-0_24
DOI: https://doi.org/10.1007/978-3-642-34240-0_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34239-4
Online ISBN: 978-3-642-34240-0
eBook Packages: Computer Science (R0)