Bloom Filter
m = number of bits in the bit array, n = |S| (number of elements in the set S), k = number of hash functions.
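A minimal sketch of the data structure these parameters describe. The class name `BloomFilter` and the scheme of salting SHA-256 with the hash index to get k positions are illustrative assumptions, not taken from the linked Java code.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: an m-bit array probed by k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # m bits, all initially 0

    def _positions(self, item: str):
        # Assumption: derive the k positions by salting SHA-256 with the index i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        # Set the k bits selected by the hash functions.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


bf = BloomFilter(m=1024, k=7)
for word in ("alpha", "beta", "gamma"):
    bf.add(word)
print("alpha" in bf)  # inserted items are always reported as present
```

An item that was never added can still be reported as present when all k of its bits happen to be set by other items; that is the false positive event.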
False positive rate:
![\left(1-\left[1-\frac{1}{m}\right]^{kn}\right)^k \approx \left( 1-e^{-kn/m} \right)^k.](https://upload.wikimedia.org/math/2/1/8/2180ac79da81e5b3721963b4d80cf5a6.png)
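To see how close the approximation is, both sides of the formula can be evaluated numerically. A small sketch; the function names and the sample values of m, n, k are chosen here for illustration only.

```python
import math


def false_positive_exact(m: int, n: int, k: int) -> float:
    # (1 - (1 - 1/m)^(k*n))^k
    return (1.0 - (1.0 - 1.0 / m) ** (k * n)) ** k


def false_positive_approx(m: int, n: int, k: float) -> float:
    # (1 - e^(-k*n/m))^k
    return (1.0 - math.exp(-k * n / m)) ** k


m, n, k = 10_000, 1_000, 7
print(false_positive_exact(m, n, k))
print(false_positive_approx(m, n, k))
```

For these values both expressions come out a little above 0.008, and they differ only in the sixth decimal place.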
Given m and n, choosing
![k = \frac{m}{n} \ln 2,](https://upload.wikimedia.org/math/b/e/f/befd3e221f8db3145948a28cb0901a13.png)
gives the minimum false positive rate. At this k we have e^(-kn/m) = 1/2, so the rate is eps = 2^(-k), i.e. k = -log2(eps).
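A quick numerical check of the optimal k, using the approximate false positive formula. The helper names and the sample ratio m/n = 10 are assumptions for illustration.

```python
import math


def false_positive(m: int, n: int, k: float) -> float:
    # Approximate rate (1 - e^(-k*n/m))^k
    return (1.0 - math.exp(-k * n / m)) ** k


def optimal_k(m: int, n: int) -> float:
    # k = (m/n) * ln 2
    return (m / n) * math.log(2)


m, n = 10_000, 1_000
k_star = optimal_k(m, n)
# Best integer k found by brute force over a small range.
best_int = min(range(1, 21), key=lambda k: false_positive(m, n, k))

print(k_star)                                         # real-valued optimum
print(best_int)                                       # best integer k
print(false_positive(m, n, k_star), 2.0 ** -k_star)   # rate at optimum vs 2^-k
```

Since k must be an integer in practice, the real-valued optimum is rounded; for m/n = 10 the brute-force search picks k = 7, next to the real optimum of about 6.93.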
y=D(expression((1-exp(c*x))^x),'x')
> y
(1 - exp(c * x))^x * log((1 - exp(c * x))) - (1 - exp(c * x))^(x - 1) * (x * (exp(c * x) * c))
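Setting y = 0 and solving for x gives the optimal k. Here x stands for k, and the constant c in the R expression is -n/m.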
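The step from y = 0 to the optimal k can be written out by hand. A sketch of the derivation, taking the constant in the R expression to be c = -n/m and introducing p = e^{cx} as a shorthand (p is not in the original notes):

```latex
\begin{align*}
f(x) &= \left(1 - e^{cx}\right)^{x}, \qquad c = -\frac{n}{m} \\
\frac{d}{dx}\ln f(x) &= \ln\left(1 - e^{cx}\right) - \frac{c\,x\,e^{cx}}{1 - e^{cx}} = 0 \\
\text{with } p = e^{cx},\ cx = \ln p:\quad
(1-p)\ln(1-p) &= p \ln p \\
\Rightarrow\ p &= \tfrac{1}{2} \quad \text{(the solution in } 0 < p < 1\text{)} \\
\Rightarrow\ x &= \frac{\ln(1/2)}{c} = \frac{m}{n}\ln 2
\end{align*}
```

Since f(x) > 0, the zero of the log-derivative is also the zero of y, which matches the two terms in the R output above.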
Now let c = m/n, the number of bits per element of S (not the same as the c in the R expression above, which is -n/m).
In code: given eps and n, we get k = -log2(eps), c = k/ln 2, m = c*n.
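The sizing recipe can be sketched as a small function. Because c is the number of bits per element, the array size is m = c*n. The function name `bloom_parameters` and the rounding choices (ceiling for m, nearest integer for k) are assumptions made here, not taken from the linked Java code.

```python
import math


def bloom_parameters(eps: float, n: int) -> tuple[int, int]:
    """Return (m, k) for a target false positive rate eps and n elements."""
    k_real = -math.log2(eps)      # k = -log2(eps)
    c = k_real / math.log(2)      # bits per element, c = k / ln 2
    m = math.ceil(c * n)          # total bits, m = c * n (rounded up)
    k = max(1, round(k_real))     # hash count must be a positive integer
    return m, k


m, k = bloom_parameters(eps=0.01, n=1_000_000)
print(m, k)
```

For eps = 0.01 this gives roughly 9.6 bits per element and k = 7.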
Java code:
https://github.com/MagnusS/Java-BloomFilter/blob/master/src/com/skjegstad/utils/BloomfilterBenchmark.java