BloomFilter Class Reference

Platforms: Unix, Windows

class pybloomfilter.BloomFilter(capacity : int, error_rate : float, filename : string)

Create a new BloomFilter object with a given capacity and error_rate. Note that capacity is not enforced: nothing stops you from adding more items than that. This is important, because I want to be able to support logical OR and AND (see below). The capacity and error_rate together serve as a contract: if you add fewer than capacity items, the Bloom filter will have a false positive rate below error_rate.
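How capacity and error_rate translate into a filter size is not spelled out here. As a rough sketch, the textbook Bloom filter sizing formulas look like this; the helper name optimal_size is hypothetical, and the library's own sizing may well differ:

```python
import math

def optimal_size(capacity, error_rate):
    """Textbook Bloom filter sizing (an assumption, not necessarily pybloomfilter's)."""
    # Bits needed: m = -n * ln(p) / (ln 2)^2
    num_bits = math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
    # Hash functions: k = (m / n) * ln 2, and never fewer than one
    num_hashes = max(1, round(num_bits / capacity * math.log(2)))
    return num_bits, num_hashes

num_bits, num_hashes = optimal_size(100, 0.1)
```

Under these formulas a tighter error_rate costs more bits and more hash functions for the same capacity.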

NEW: If you specify None for the filename, then the bloom filter will be backed by malloc’d memory, rather than by a file.

Static Methods


static BloomFilter.open(filename)

Open an already existing BloomFilter file.

static BloomFilter.from_base64(filename, input[, perm = 0755])

Create a new BloomFilter object on filename from the input base64 string. Example:

>>> bf = BloomFilter.from_base64("/tmp/",
>>> "MIKE" in bf

Instance Attributes


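BloomFilter.capacity

The number of elements this filter was sized for.

BloomFilter.error_rate

The acceptable probability of false positives.

BloomFilter.hash_seeds

The integer seeds used for the random hashing.

BloomFilter.name

The file name (compatible with file objects).

BloomFilter.num_bits

The number of bits used in the filter as buckets.

BloomFilter.num_hashes

The number of hash functions applied to each item.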

Instance Methods

BloomFilter.add(item) → Boolean

Add the item to the bloom filter.

Parameters:

  • item – Hashable object

Return type: Boolean (True if the item was already in the filter)
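To make the return value concrete, here is a minimal in-memory sketch of a Bloom filter whose add() behaves as described. ToyBloom is a hypothetical stand-in written for illustration; it uses salted SHA-256 for hashing, which is an assumption and not how pybloomfilter hashes items:

```python
import hashlib

class ToyBloom:
    """Minimal in-memory Bloom filter; a stand-in for illustration only."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive one bit position per hash function by salting SHA-256 with a seed index.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        """Set the item's bits; return True if every bit was already set."""
        already_present = True
        for pos in self._positions(item):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.bits[byte] & mask:
                already_present = False
                self.bits[byte] |= mask
        return already_present

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

bf = ToyBloom(num_bits=480, num_hashes=3)
first = bf.add("apple")   # no bits set yet, so this returns False
second = bf.add("apple")  # the same bits are already set, so this returns True
```

Note that a True result only means all of the item's bits were already set, so it is subject to the same false positive rate as a membership test.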


BloomFilter.clear_all()

Remove all elements from the bloom filter at once.

BloomFilter.copy(filename) → BloomFilter

Copies the current BloomFilter object to another object with a new filename.

Parameters:

  • filename – string filename

Return type: new BloomFilter object

BloomFilter.copy_template(filename[, perm=0755]) → BloomFilter

Creates a new BloomFilter object with the same parameters: same hash seeds, same size, everything. Once this is done, the two filters are comparable, so you can apply logical operators to them. Example:

>>> apple = BloomFilter(100, 0.1, '/tmp/apple')
>>> apple.add('apple')
>>> pear = apple.copy_template('/tmp/pear')
>>> pear.add('pear')
>>> pear |= apple
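Two filters are comparable because identical seeds and size map any given item to the same bit positions in both. A small sketch of that idea, using plain Python integers as bit sets; NUM_BITS, SEEDS, bit_mask and contains are all hypothetical names, and the hashing scheme is an assumption:

```python
import hashlib

NUM_BITS = 480
SEEDS = (11, 23, 37)  # arbitrary illustrative seeds shared by both filters

def bit_mask(item, seeds=SEEDS, num_bits=NUM_BITS):
    """Return an int with one bit set per seed for this item."""
    mask = 0
    for seed in seeds:
        digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
        mask |= 1 << (int.from_bytes(digest[:8], "big") % num_bits)
    return mask

def contains(bits, item):
    """True if every bit the item maps to is set in bits."""
    mask = bit_mask(item)
    return bits & mask == mask

apple = bit_mask("apple")  # a filter holding only 'apple'
pear = bit_mask("pear")    # same seeds and size, holding only 'pear'
pear |= apple              # set OR, in place
```

If the two filters used different seeds or sizes, the OR would mix bits that mean different things and membership tests on the result would be meaningless.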

BloomFilter.sync()

Forces a sync() call on the underlying mmap file object. Use this if you are about to copy the file and you want to be Sure (TM) you got everything correctly.

BloomFilter.to_base64() → string

Creates a compressed, base64-encoded version of the Bloom filter. Since the bloom filter is already stored efficiently in binary on the file system, this may not be too useful. I find it useful for debugging, so I can copy filters from one terminal to another in their entirety.

Return type: Base64-encoded string representing the filter
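The exact encoding is not documented here. As a sketch of the general idea only, compressing a bit array and wrapping it in base64 gives a string that survives copy and paste between terminals. The use of zlib and the helper names encode_bits and decode_bits are assumptions:

```python
import base64
import zlib

def encode_bits(raw):
    """Compress raw bytes, then base64-encode them into an ASCII string."""
    return base64.b64encode(zlib.compress(raw)).decode("ascii")

def decode_bits(text):
    """Reverse encode_bits: base64-decode, then decompress."""
    return zlib.decompress(base64.b64decode(text))

raw = bytes(60)  # stand-in for a filter's bit array (480 bits, all unset)
encoded = encode_bits(raw)
restored = decode_bits(encoded)
```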

BloomFilter.update(iterable)

Calls add() on all items in the iterable.

BloomFilter.union(filter) → BloomFilter

Perform a set OR with another comparable filter. You can (only) construct comparable filters with copy_template above. See the example in copy_template. In that example, pear will have both “apple” and “pear”.

The operation happens in place. That is, calling:

>>> bf.union(bf2)

is a way to add all the elements of bf2 to bf.

N.B.: Calling this function will render future calls to len() invalid.

BloomFilter.intersection(filter) → BloomFilter

The same as union() above except it uses a set AND instead of a set OR.

N.B.: Calling this function will render future calls to len() invalid.
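Why len() stops working: once two bit arrays are merged, the bits no longer say how many distinct items went in. A toy sketch of that bookkeeping follows. CountedBits and the local IndeterminateCountError are hypothetical stand-ins, and how the library actually tracks its count is an assumption:

```python
class IndeterminateCountError(Exception):
    """Local stand-in for pybloomfilter.IndeterminateCountError."""

class CountedBits:
    """Toy filter state: a bit set plus a count of bits newly set by adds."""

    def __init__(self):
        self.bits = 0
        self.count = 0  # becomes None once the count can no longer be trusted

    def add_bit(self, position):
        if not self.bits & (1 << position):
            self.bits |= 1 << position
            if self.count is not None:
                self.count += 1

    def union(self, other):
        self.bits |= other.bits
        self.count = None  # merged bits do not reveal how many items overlap
        return self

    def intersection(self, other):
        self.bits &= other.bits
        self.count = None
        return self

    def __len__(self):
        if self.count is None:
            raise IndeterminateCountError("length unavailable after union or intersection")
        return self.count

a, b = CountedBits(), CountedBits()
a.add_bit(3)
b.add_bit(3)
b.add_bit(5)
before = len(a)  # still valid: only add_bit() has been called on a
a.union(b)       # from here on, len(a) raises IndeterminateCountError
```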

Magic Methods

BloomFilter.__len__() → Integer

Returns the number of distinct elements that have been added to the BloomFilter object, subject to the error given in error_rate.


>>> bf = BloomFilter(100, 0.1, '/tmp/fruit.bloom')
>>> bf.add("Apple")
>>> bf.add('Apple')
>>> bf.add('orange')
>>> len(bf)
>>> bf2 = bf.copy_template('/tmp/new.bloom')
>>> bf2 |= bf
>>> len(bf2)
Traceback (most recent call last):
pybloomfilter.IndeterminateCountError: Length of BloomFilter object is unavailable after intersection or union called.

BloomFilter.__contains__(item) → Boolean

Check to see if item is contained in the filter, with an acceptable false positive rate of error_rate (see above).
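A membership test can return True for an item that was never added, but never False for one that was. The sketch below measures this on a toy filter sized with the textbook formulas for 1000 items at a 1% error rate. All names are hypothetical and the salted SHA-256 hashing is an assumption, so this illustrates the behavior and does not reproduce the library:

```python
import hashlib

NUM_BITS, NUM_HASHES = 9586, 7  # textbook sizing for 1000 items at 1% error

def positions(item):
    """Yield one bit position per hash function for this item."""
    for seed in range(NUM_HASHES):
        digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
        yield int.from_bytes(digest[:8], "big") % NUM_BITS

bits = bytearray(NUM_BITS)  # one byte per bit keeps the sketch short

def add(item):
    for pos in positions(item):
        bits[pos] = 1

def might_contain(item):
    return all(bits[pos] for pos in positions(item))

for i in range(1000):
    add(f"member-{i}")

# Every added item must be found; items never added are only occasionally "found".
false_negatives = sum(not might_contain(f"member-{i}") for i in range(1000))
false_positives = sum(might_contain(f"other-{i}") for i in range(10000))
observed_rate = false_positives / 10000
```

With the filter filled to its capacity, observed_rate should land in the neighborhood of the target error rate, while false_negatives stays at zero.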

BloomFilter.__ior__(filter) → BloomFilter

See union(filter)

BloomFilter.__iand__(filter) → BloomFilter

See intersection(filter)


class pybloomfilter.IndeterminateCountError(message)

The exception that is raised if len() is called on a BloomFilter object after |=, &=, intersection(), or union() is used.
