Expressiveness and Performance of Full-Text Search Languages

  • Conference paper
Advances in Database Technology - EDBT 2006 (EDBT 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3896)

Abstract

We study the expressiveness and performance of full-text search languages. Our motivation is to provide a formal basis for comparing full-text search languages and to develop a model for full-text search that can be tightly integrated with structured search. We design a model based on the positions of tokens (words) in the input text, and develop a full-text calculus (FTC) and a full-text algebra (FTA) with equivalent expressive power; this suggests a notion of completeness for full-text search languages. We show that existing full-text languages are incomplete and identify a practical subset of the FTC and FTA that is more powerful than existing languages, but which can still be evaluated efficiently.
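To make the idea of a position-based model concrete, here is a minimal Python sketch. It is not the paper's FTC or FTA; it only illustrates, under simplifying assumptions, how recording the position of each token lets a query state a condition over positions (here, a distance bound between two words). The names `tokenize`, `build_position_index`, and `within_distance`, the whitespace tokenizer, and the sample documents are all hypothetical.

```python
from collections import defaultdict
from itertools import product


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


def build_position_index(docs):
    """Map each token to {doc_id: [positions]}, counting tokens from 0."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, token in enumerate(tokenize(text)):
            index[token].setdefault(doc_id, []).append(pos)
    return index


def within_distance(index, first, second, max_gap):
    """Return sorted ids of documents in which some occurrence of `first`
    and some occurrence of `second` are at most `max_gap` positions apart."""
    hits = set()
    first_postings = index.get(first, {})
    second_postings = index.get(second, {})
    # Only documents containing both tokens can satisfy the predicate.
    for doc_id in first_postings.keys() & second_postings.keys():
        pairs = product(first_postings[doc_id], second_postings[doc_id])
        if any(abs(p - q) <= max_gap for p, q in pairs):
            hits.add(doc_id)
    return sorted(hits)


# Hypothetical sample documents, used only for illustration.
docs = {
    "d1": "full text search over structured data",
    "d2": "search engines index the full body of each text",
}
index = build_position_index(docs)

print(within_distance(index, "full", "text", 1))  # adjacent occurrences only
print(within_distance(index, "full", "text", 4))  # looser distance bound
```

With the sample documents, the first call matches only `d1`, where the two words are adjacent, while the second also matches `d2`, where they are four positions apart. The sketch checks every pair of positions for clarity; a real engine would merge sorted position lists instead.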

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Botev, C., Amer-Yahia, S., Shanmugasundaram, J. (2006). Expressiveness and Performance of Full-Text Search Languages. In: Ioannidis, Y., et al. Advances in Database Technology - EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 3896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11687238_23

  • DOI: https://doi.org/10.1007/11687238_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32960-2

  • Online ISBN: 978-3-540-32961-9

  • eBook Packages: Computer Science (R0)
