Wrapper Induction

  • Living reference work entry
Encyclopedia of Database Systems

Recommended Reading

  1. Adelberg B. NoDoSE: a tool for semi-automatically extracting structured and semistructured data from text documents. In Proceedings of ACM SIGMOD International Conference on Management of Data; 1998. p. 283–94.

  2. Baumgartner R, Flesca S, Gottlob G. Visual web information extraction with Lixto. In Proceedings of 27th International Conference on Very Large Data Bases; 2001. p. 119–28.

  3. Carme J, Ceresna M, Goebel M. Web wrapper specification using compound filter learning. In Proceedings of IADIS International Conference on WWW/Internet 2006; 2006.

  4. Chang CH, Kuo SC. OLERA: semisupervised web-data extraction with visual support. IEEE Intell Syst. 2004;19(6):56–64.

  5. Finn A, Kushmerick N. Active learning selection strategies for information extraction. In Proceedings of Workshop on Adaptive Text Extraction and Mining; 2003.

  6. Freitag D, Kushmerick N. Boosted wrapper induction. In Proceedings of 17th National Conference on AI; 2000. p. 577–83.

  7. Hsu CN, Dung MT. Generating finite-state transducers for semi-structured data extraction from the web. Inf Syst. 1998;23(8):521–38.

  8. Irmak U, Suel T. Interactive wrapper generation with minimal user effort. In Proceedings of 15th International World Wide Web Conference; 2006. p. 553–63.

  9. Knoblock CA, Lerman K, Minton S, Muslea I. Accurately and reliably extracting data from the web: a machine learning approach. Q Bull, IEEE TC Data Eng. 2000;23(4):33–41.

  10. Kushmerick N. Wrapper induction for information extraction. PhD thesis, University of Washington; 1997.

  11. Kushmerick N. Wrapper induction: efficiency and expressiveness. Artif Intell. 2000;118(1–2):15–68.

  12. Laender AHF, Ribeiro-Neto B, da Silva AS. DEByE – data extraction by example. Data Knowl Eng. 2002;40(2):121–54.

  13. Liu L, Pu C, Han W. XWRAP: an XML-enabled wrapper construction system for web information sources. In Proceedings of 16th International Conference on Data Engineering; 2000. p. 611–21.

  14. Muslea I, Minton S, Knoblock C. STALKER: learning extraction rules for semistructured, web-based information sources. 1998. URL http://citeseer.ist.psu.edu/muslea98stalker.html

  15. Muslea I, Minton S, Knoblock CA. Selective sampling with redundant views. In Proceedings of 17th National Conference on AI; 2000. p. 621–26.

  16. Sahuguet A, Azavant F. WysiWyg web wrapper factory (W4F). 2001. URL http://citeseer.ist.psu.edu/553711.html; http://www.ai.mit.edu/people/jimmylin/papers/Sahuguet99.ps

  17. Seymore K, McCallum A, Rosenfeld R. Learning hidden Markov model structure for information extraction. In Proceedings of AAAI 99 Workshop on Machine Learning for Information Extraction; 1999.

Author information

Correspondence to Max Goebel.

Copyright information

© 2016 Springer Science+Business Media New York

About this entry

Cite this entry

Goebel, M., Ceresna, M. (2016). Wrapper Induction. In: Liu, L., Özsu, M. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7993-3_1160-2

  • DOI: https://doi.org/10.1007/978-1-4899-7993-3_1160-2

  • Publisher Name: Springer, New York, NY

  • Online ISBN: 978-1-4899-7993-3

  • eBook Packages: Springer Reference Computer Sciences, Reference Module Computer Science and Engineering
