SEEDEEP: A System for Exploring and Querying Scientific Deep Web Data Sources

  • Conference paper
Scientific and Statistical Database Management (SSDBM 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5566)

Abstract

A recent and emerging trend in scientific data dissemination involves online databases that are hidden behind query forms, thus forming what is referred to as the deep web. In this paper, we propose SEEDEEP, a System for Exploring and quErying scientific DEEP web data sources. SEEDEEP is able to automatically mine deep web data source schemas, integrate heterogeneous data sources, and answer cross-source keyword queries, and it incorporates features such as caching and fault tolerance. Currently, SEEDEEP integrates 16 deep web data sources in the biological domain. We demonstrate how an integrated model for correlated deep web data sources is constructed, how a complex cross-source keyword query is answered efficiently and correctly, and how important performance issues are addressed.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, F., Agrawal, G. (2009). SEEDEEP: A System for Exploring and Querying Scientific Deep Web Data Sources. In: Winslett, M. (eds) Scientific and Statistical Database Management. SSDBM 2009. Lecture Notes in Computer Science, vol 5566. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02279-1_6

  • DOI: https://doi.org/10.1007/978-3-642-02279-1_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02278-4

  • Online ISBN: 978-3-642-02279-1

  • eBook Packages: Computer Science, Computer Science (R0)
