WIND: A warehouse for internet data

  • Object Orientation and The Internet
  • Conference paper
Advances in Databases (BNCOD 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1271)

Abstract

The increasing amount of information available on the web demands sophisticated querying methods and knowledge discovery techniques. In this study, we introduce WIND, our architectural framework for a data warehouse over a domain-specific thematic section of the Internet. The aim of WIND is to provide a partially materialized, structured view of the underlying information sources, on which database querying can be applied and mining techniques can be developed. WIND loads web documents into several complementary local repositories, such as OODBMSs and text retrieval systems. This allows attribute-oriented and content-oriented query processing to be combined. Special attention is paid to domain-specific document formats. To support conversion between (semi-)structured documents and database objects, we consider a technique for generating format converters based on the notion of object-grammars.
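
As an illustration only, the combination of attribute-oriented and content-oriented querying over complementary repositories might be sketched as follows. This is not the WIND implementation: `AttributeStore`, `TextIndex`, `combined_query`, the attribute names, and the example URLs are all hypothetical stand-ins, with an in-memory dictionary in place of an OODBMS and a toy inverted index in place of a text retrieval system.

```python
from collections import defaultdict


class AttributeStore:
    """Hypothetical stand-in for the object database: URL -> attribute dict."""

    def __init__(self):
        self._objects = {}

    def load(self, url, attributes):
        self._objects[url] = dict(attributes)

    def query(self, **conditions):
        # Attribute-oriented query: exact match on every given attribute.
        return {
            url
            for url, attrs in self._objects.items()
            if all(attrs.get(key) == value for key, value in conditions.items())
        }


class TextIndex:
    """Hypothetical stand-in for the text retrieval system: inverted index."""

    def __init__(self):
        self._postings = defaultdict(set)

    def load(self, url, text):
        for word in text.lower().split():
            self._postings[word].add(url)

    def query(self, *keywords):
        # Content-oriented query: documents containing all keywords.
        sets = [self._postings.get(word.lower(), set()) for word in keywords]
        return set.intersection(*sets) if sets else set()


def combined_query(store, index, keywords, **conditions):
    """Intersect the results of both repositories (an assumed strategy)."""
    return store.query(**conditions) & index.query(*keywords)


# Made-up documents: (URL, attributes, text content).
documents = [
    ("http://example.org/a", {"format": "html", "year": 1997},
     "data warehouse for internet documents"),
    ("http://example.org/b", {"format": "ps", "year": 1997},
     "object grammars for format converters"),
    ("http://example.org/c", {"format": "html", "year": 1996},
     "warehouse loading and data mining"),
]

store, index = AttributeStore(), TextIndex()
for url, attributes, text in documents:
    store.load(url, attributes)  # the same document is loaded into
    index.load(url, text)        # both complementary repositories

hits = combined_query(store, index, ["warehouse"], format="html", year=1997)
```

Here each document is loaded into both repositories under its URL, so a query can restrict by attributes in one and by content in the other and then intersect the two result sets. A real system would more likely push part of the query into each repository and rank the results, which this sketch does not attempt.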

Editor information

Carol Small, Paul Douglas, Roger Johnson, Peter King, Nigel Martin

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Faulstich, L.C., Spiliopoulou, M., Linnemann, V. (1997). WIND: A warehouse for internet data. In: Small, C., Douglas, P., Johnson, R., King, P., Martin, N. (eds) Advances in Databases. BNCOD 1997. Lecture Notes in Computer Science, vol 1271. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63263-8_20

  • DOI: https://doi.org/10.1007/3-540-63263-8_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63263-4

  • Online ISBN: 978-3-540-69254-6

  • eBook Packages: Springer Book Archive
