Recommended Reading
Crescenzi V, Mecca G, Merialdo P. RoadRunner: towards automatic data extraction from large web sites. In: Proceedings of the 27th International Conference on Very Large Data Bases; 2001. p. 109–18.
Debnath S, Mitra P, Giles CL. Automatic extraction of informative blocks from webpages. In: Proceedings of the 2005 ACM Symposium on Applied Computing; 2005. p. 1722–6.
Glance N, Hurst M, Nigam K, Siegler M, Stockton R, Tomokiyo T. Deriving marketing intelligence from online discussion. In: Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2005. p. 419–28.
Hofmann K, Weerkamp W. Web corpus cleaning using content and structure. In: Fairon C, Naets H, Kilgarriff A, de Schryver G, editors. Building and exploring web corpora. vol. 4, UCL; 2007. p. 145–54.
Kovacevic M, Diligenti M, Gori M, Milutinovic V. Recognition of common areas in a web page using a visualization approach. In: Proceedings of the 10th International Conference on Artificial Intelligence: Methodology, Systems, and Applications; 2002. p. 203–12.
Kushmerick N, Weld D, Doorenbos R. Wrapper induction for information extraction. In: Proceedings of the 15th International Joint Conference on AI; 1997. p. 119–28.
Lin SH, Ho JM. Discovering informative content blocks from web documents. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2002. p. 588–93.
Liu B, Grossman R, Zhai Y. Mining data records in web pages. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2003. p. 601–6.
Muslea I, Minton S, Knoblock C. Hierarchical wrapper induction for semistructured information sources. Auton Agent Multi-Agent Syst. 2001;4(1–2):93–114.
Simon K, Lausen G. ViPER: augmenting automatic information extraction with visual perceptions. In: Proceedings of the 14th ACM International Conference on Information and Knowledge Management; 2005. p. 381–8.
Ziegler CN, Skubacz M. Towards automated reputation and brand monitoring on the web. In: Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence; 2006. p. 1066–70.
Ziegler CN, Skubacz M. Content extraction from news pages using particle swarm optimization on linguistic and structural features. In: Proceedings of the 2007 IEEE/WIC/ACM International Conference on Web Intelligence; 2007. p. 242–9.
Zhao H, Meng W, Wu Z, Raghavan V, Yu C. Fully automatic wrapper generation for search engines. In: Proceedings of the 14th International World Wide Web Conference; 2005. p. 66–75.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Ziegler, CN. (2018). Fully Automatic Web Data Extraction. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1159
DOI: https://doi.org/10.1007/978-1-4614-8265-9_1159
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering