We provide several new results related to the concept of min-wise independence. Our main result is that any randomized sampling scheme for the relative intersection of sets based on testing equality of samples yields an equivalent min-wise independent family. Thus, in a certain sense, min-wise independent families are “complete” for this type of estimation.
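
As a small illustration of estimating relative intersection by testing equality of samples, the sketch below takes the sample of a set to be the minimum image of the set under a permutation. It uses the family of all permutations of a tiny universe, which is min-wise independent, and checks that the fraction of permutations on which the two samples agree equals |A ∩ B| / |A ∪ B|. The function names, the sets `A` and `B`, and the universe size are assumptions chosen for this example, not taken from the paper.

```python
from fractions import Fraction
from itertools import permutations


def resemblance(a, b):
    """Relative intersection |a & b| / |a | b| as an exact fraction."""
    return Fraction(len(a & b), len(a | b))


def min_match_fraction(a, b, n):
    """Fraction of all permutations of range(n) under which the
    minimum image of a equals the minimum image of b."""
    matches = 0
    total = 0
    for perm in permutations(range(n)):
        # perm[x] is the image of element x under this permutation.
        if min(perm[x] for x in a) == min(perm[x] for x in b):
            matches += 1
        total += 1
    return Fraction(matches, total)


# Hypothetical example sets over the universe {0, ..., 5}.
A = {0, 1, 2}
B = {1, 2, 3}

print(resemblance(A, B), min_match_fraction(A, B, 6))  # prints: 1/2 1/2
```

Enumerating every permutation is only feasible for toy universes; it is used here so that the equality can be checked exactly rather than approximately.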

We also discuss the notion of robustness, a concept extending min-wise independence to allow more efficient use of it in practice. A surprising result arising from our consideration of robustness is that under a random permutation from a min-wise independent family, any element of a fixed set has an equal chance to get any rank in the image of the set, not only the minimum as required by definition.
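
The equal-rank statement can be checked by brute force in the simplest case. The sketch below enumerates all permutations of a small universe and tabulates the rank that a fixed element of a fixed set receives within the image of that set. For the family of all permutations the uniform outcome already follows from symmetry, so this only illustrates what the statement says; it does not exercise the general claim for smaller min-wise independent families. The function name, the set `X`, and the universe size are assumptions made for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def rank_distribution(x, subset, n):
    """Distribution of the rank (0 = minimum) of the image of x among
    the images of subset, over all permutations of range(n)."""
    counts = Counter()
    total = 0
    for perm in permutations(range(n)):
        images = sorted(perm[y] for y in subset)
        counts[images.index(perm[x])] += 1
        total += 1
    return {rank: Fraction(c, total) for rank, c in sorted(counts.items())}


# Hypothetical example: element 3 of the set {1, 3, 4} in the universe {0, ..., 4}.
X = {1, 3, 4}
dist = rank_distribution(3, X, 5)
print(dist)  # each of the ranks 0, 1, 2 has probability 1/3
```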


Keywords: Nonempty Subset; Transitive Closure; Equal Chance; Robustness Property; Basic Feasible Solution
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Andrei Z. Broder, Compaq Systems Research Center, Palo Alto, USA
  • Michael Mitzenmacher, Harvard University, Cambridge, USA
