RoSeS: A Continuous Content-Based Query Engine for RSS Feeds

  • Conference paper
Database and Expert Systems Applications (DEXA 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6861)

Abstract

In this paper we present RoSeS (Really Open Simple and Efficient Syndication), a generic framework for content-based RSS feed querying and aggregation. RoSeS is based on a data-centric approach, using a combination of standard database concepts like declarative query languages, views and multi-query optimization. Users create personalized feeds by defining and composing content-based filtering and aggregation queries on collections of RSS feeds. Publishing these queries corresponds to defining views which can then be used for building new queries / feeds. This naturally reflects the publish-subscribe nature of RSS applications. The contributions presented in this paper are a declarative RSS feed aggregation language, an extensible stream algebra for building efficient continuous multi-query execution plans for RSS aggregation views, a multi-query optimization strategy for these plans and a running prototype based on a multi-threaded asynchronous execution engine.
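The idea of composing filtering queries, publishing them as views, and sharing work across queries can be illustrated with a minimal sketch. This is not the RoSeS language, algebra, or engine: the names `Item`, `Query`, `Engine`, `publish` and `push` are hypothetical, keyword containment stands in for content-based filtering, and a per-item predicate cache stands in for multi-query optimization. View names are assumed to differ from feed names.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Item:
    """One RSS item (hypothetical, reduced to the fields used here)."""
    feed: str
    title: str
    description: str = ""


@dataclass(frozen=True)
class Query:
    """Union of source streams, kept only if all keywords occur in the item."""
    sources: FrozenSet[str]   # names of feeds or of published views
    keywords: FrozenSet[str]  # lower-cased keywords


class Engine:
    def __init__(self) -> None:
        self.views: Dict[str, Query] = {}
        self.evaluations = 0  # number of keyword tests actually executed

    def publish(self, name: str, sources: Iterable[str],
                keywords: Iterable[str]) -> None:
        """Register a query as a named view that other queries can use."""
        self.views[name] = Query(frozenset(sources),
                                 frozenset(k.lower() for k in keywords))

    def push(self, item: Item) -> List[str]:
        """Return the sorted names of all views that receive this item."""
        text = f"{item.title} {item.description}".lower()
        cache: Dict[str, bool] = {}

        def contains(keyword: str) -> bool:
            # Each distinct keyword is tested at most once per item,
            # however many views share it.
            if keyword not in cache:
                self.evaluations += 1
                cache[keyword] = keyword in text
            return cache[keyword]

        reached = {item.feed}  # streams known to contain the item
        changed = True
        while changed:         # iterate until views over views are resolved
            changed = False
            for name, query in self.views.items():
                if name in reached:
                    continue
                if query.sources & reached and all(
                        contains(k) for k in query.keywords):
                    reached.add(name)
                    changed = True
        reached.discard(item.feed)
        return sorted(reached)


engine = Engine()
engine.publish("db_news", ["feedA", "feedB"], ["database"])
engine.publish("db_streams", ["db_news"], ["stream"])   # view over a view
engine.publish("stream_news", ["feedA"], ["stream"])
print(engine.push(Item("feedA", "Stream processing in database systems")))
# -> ['db_news', 'db_streams', 'stream_news']
```

In this sketch the keyword "stream" is shared by two views but tested only once for the item, which is the kind of redundancy a multi-query plan aims to remove.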

The authors acknowledge the support of the French Agence Nationale de la Recherche (ANR), under grant ROSES (ANR-07-MDCO-011) “Really Open, Simple and Efficient Syndication”.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Creus Tomàs, J., Amann, B., Travers, N., Vodislav, D. (2011). RoSeS: A Continuous Content-Based Query Engine for RSS Feeds. In: Hameurlain, A., Liddle, S.W., Schewe, KD., Zhou, X. (eds) Database and Expert Systems Applications. DEXA 2011. Lecture Notes in Computer Science, vol 6861. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23091-2_19

  • DOI: https://doi.org/10.1007/978-3-642-23091-2_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23090-5

  • Online ISBN: 978-3-642-23091-2

  • eBook Packages: Computer Science, Computer Science (R0)
