Semijoin

  • Living reference work entry
Encyclopedia of Database Systems

Synonyms

Bit vector join; Bloom filter join; Bloom join; Hash filter join; Semijoin filter

Definition

Semijoin is a technique for processing a join between two tables that are stored at different sites. The basic idea is to reduce the transfer cost by first sending only the projected join column(s) of the first relation to the other site, where they are joined with the second relation. Then, all matching tuples from the second relation are sent back to the first site, where the final join result is computed.
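
The three steps of this definition can be illustrated with a minimal sketch. It assumes both relations fit in memory as lists of dictionaries, and it simulates the two sites with ordinary function calls rather than network transfers. The relation contents, the join column name cust_id, and the helper names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

Row = Dict[str, object]


def project_join_column(r: List[Row], key: str) -> Set[object]:
    """Step 1 (first site): project R onto the join column, without duplicates."""
    return {row[key] for row in r}


def reduce_by_semijoin(s: List[Row], keys: Set[object], key: str) -> List[Row]:
    """Step 2 (second site): keep only the tuples of S that match a shipped key."""
    return [row for row in s if row[key] in keys]


def final_join(r: List[Row], s_reduced: List[Row], key: str) -> List[Row]:
    """Step 3 (first site): join R with the reduced S that was sent back."""
    index: Dict[object, List[Row]] = {}
    for row in s_reduced:
        index.setdefault(row[key], []).append(row)
    return [{**r_row, **s_row} for r_row in r for s_row in index.get(r_row[key], [])]


# Hypothetical example data: R is stored at the first site, S at the second.
R = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 2, "name": "Bob"},
    {"cust_id": 3, "name": "Cy"},
]
S = [
    {"cust_id": 1, "item": "pen"},
    {"cust_id": 1, "item": "ink"},
    {"cust_id": 4, "item": "cup"},
    {"cust_id": 3, "item": "pad"},
]

keys = project_join_column(R, "cust_id")            # shipped from site 1 to site 2
s_reduced = reduce_by_semijoin(S, keys, "cust_id")  # shipped from site 2 to site 1
result = final_join(R, s_reduced, "cust_id")        # computed at site 1
```

In this sketch only the distinct join values travel to the second site, and only the matching tuples of S travel back. The tuple with cust_id 4 never leaves the second site, which is where the reduction in transfer cost comes from.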

Historical Background

The semijoin technique was originally developed by Bernstein et al. [3] as part of the SDD-1 project as a reduction operator for distributed query processing. The idea of applying hash filtering was proposed by Babb [1] as well as by Valduriez [9], particularly for specialized hardware (content-addressed file stores and distributed database machines, respectively). The theory of semijoin-based distributed query processing was presented in [2]. In [10], semijoins are also exploited for query...

Recommended Reading

  1. Babb E. Implementing a relational database by means of specialized hardware. ACM Trans Database Syst. 1979;4(1):1–29.

  2. Bernstein PA, Chiu D-MW. Using semi-joins to solve relational queries. J ACM. 1981;28(1):25–50.

  3. Bernstein PA, Goodman N, Wong E, Reeve CL, Rothnie JB Jr. Query processing in a system for distributed databases (SDD-1). ACM Trans Database Syst. 1981;6(4):602–25.

  4. Bloom BH. Space/time trade-offs in hash coding with allowable errors. Commun ACM. 1970;13(7):422–6.

  5. Hevner AR, Yao SB. Query processing in distributed database systems. IEEE Trans Softw Eng. 1979;5(3):177–82.

  6. Lu H, Carey M. Some experimental results on distributed join algorithms in a local network. Proceedings of 11th International Conference on Very Large Data Bases; 1985. p. 292–304.

  7. Mackert LF, Lohman GM. R* optimizer validation and performance evaluation for local queries. Proceedings of ACM SIGMOD International Conference on Management of Data; 1986. p. 84–95.

  8. Özsu MT, Valduriez P. Principles of distributed database systems. 2nd ed. Prentice-Hall; 1999.

  9. Valduriez P. Semi-join algorithms for distributed database machines. In: Schneider J-J, editor. Distributed data bases. Amsterdam: North-Holland; 1982. p. 23–37.

  10. Valduriez P, Gardarin G. Join and semi join algorithms for a multiprocessor database machine. ACM Trans Database Syst. 1984;9(1):133–61.

Author information

Correspondence to Kai-Uwe Sattler.

Copyright information

© 2017 Springer Science+Business Media LLC

Cite this entry

Sattler, KU. (2017). Semijoin. In: Liu, L., Özsu, M. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7993-3_706-2

  • DOI: https://doi.org/10.1007/978-1-4899-7993-3_706-2

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4899-7993-3

  • Online ISBN: 978-1-4899-7993-3

  • eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
