Learning the Grammar of Distant Change in the World-Wide Web

  • Conference paper
AI 2004: Advances in Artificial Intelligence (AI 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3339)

Abstract

A problem many Web users face is keeping track of changes to remote Web sources. Web servers frequently do not provide push services that inform clients about data changes. It is therefore necessary to apply intelligent pull strategies that optimize reload requests by observing the data sources. This article presents an adaptive pull strategy that optimizes reload requests with respect to the ‘age’ of data and lost data. The method is applicable if the remote change pattern can be approximately described by piecewise deterministic behavior, which is frequently the case when data sources are updated automatically. Emphasis is placed on autonomous estimation, in which only a minimal number of parameters has to be provided manually.
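
To make the idea of a pull strategy concrete, here is a minimal sketch in Python. It is not the algorithm of the paper, which is not reproduced on this page. It assumes the simplest deterministic case, a source that changes at a roughly constant interval. The helper names `estimate_period` and `next_reload_time` and the `slack` parameter are hypothetical, introduced only for illustration.

```python
from statistics import median


def estimate_period(change_times):
    """Estimate the update period as the median gap between observed changes.

    Assumption: `change_times` is a strictly increasing list of timestamps
    (in seconds) at which a change of the remote source was detected.
    """
    if len(change_times) < 2:
        raise ValueError("need at least two observed changes")
    gaps = [b - a for a, b in zip(change_times, change_times[1:])]
    period = median(gaps)
    if period <= 0:
        raise ValueError("change times must be strictly increasing")
    return period


def next_reload_time(change_times, now, slack=1.0):
    """Return the first predicted change time after `now`, plus `slack`.

    Reloading shortly after a predicted change keeps the 'age' of the
    local copy small; the slack absorbs small timing jitter.
    """
    period = estimate_period(change_times)
    last = change_times[-1]
    # Skip as many whole periods as needed for the prediction to lie ahead.
    steps = max(1, int((now - last) // period) + 1)
    return last + steps * period + slack


observed = [0, 60, 120, 181]      # illustrative data, not from the paper
print(next_reload_time(observed, now=200))
```

A single median period only covers one deterministic segment. A pattern that is piecewise deterministic in the sense of the abstract would need one such estimate per segment, along with a way to detect segment boundaries.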

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kukulenz, D. (2004). Learning the Grammar of Distant Change in the World-Wide Web. In: Webb, G.I., Yu, X. (eds) AI 2004: Advances in Artificial Intelligence. AI 2004. Lecture Notes in Computer Science, vol 3339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30549-1_41

  • DOI: https://doi.org/10.1007/978-3-540-30549-1_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24059-4

  • Online ISBN: 978-3-540-30549-1

  • eBook Packages: Computer Science, Computer Science (R0)
