Additional Features

The SqliteExtDatabase accepts an initialization option to register support for a simple bloom filter. Once initialized, the bloom filter can be used for efficient membership queries on large sets of data.

Here’s an example:

    from peewee import Table, fn
    from playhouse.sqlite_ext import CSqliteExtDatabase

    db = CSqliteExtDatabase(':memory:', bloomfilter=True)

    # Create and define a table to store some data.
    db.execute_sql('CREATE TABLE "register" ("data" TEXT)')
    Register = Table('register', ('data',)).bind(db)

    # Populate the database with a bunch of text.
    with db.atomic():
        for i in 'abcdefghijklmnopqrstuvwxyz':
            keys = [i * j for j in range(1, 10)]  # a, aa, aaa, ... aaaaaaaaa
            Register.insert([{'data': key} for key in keys]).execute()

    # Collect data into a 16KB bloomfilter.
    query = Register.select(fn.bloomfilter(Register.data, 16 * 1024).alias('buf'))
    row = query.get()
    buf = row['buf']

    # Use bloomfilter buf to test whether other keys are members.
    test_keys = (
        ('aaaa', True),
        ('abc', False),
        ('zzzzzzz', True),
        ('zyxwvut', False))
    for key, is_present in test_keys:
        query = Register.select(fn.bloomfilter_contains(key, buf).alias('is_member'))
        answer = query.get()['is_member']
        assert answer == is_present
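
To make the membership semantics concrete, here is a minimal pure-Python sketch of a bloom filter. It is an illustration only, not the implementation behind bloomfilter(): the class name SimpleBloomFilter, the use of SHA-256, and the choice of three probes per key are all assumptions made for this example.

```python
import hashlib


class SimpleBloomFilter:
    """Toy bloom filter (hypothetical; not peewee's implementation)."""

    def __init__(self, size_bytes=16 * 1024, num_hashes=3):
        self.num_bits = size_bytes * 8
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bytes)

    def _positions(self, key):
        # Derive num_hashes bit positions by prefixing the key with the probe index.
        data = key.encode('utf-8')
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], 'big') % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


bf = SimpleBloomFilter()
for letter in 'abcdefghijklmnopqrstuvwxyz':
    for n in range(1, 10):
        bf.add(letter * n)

print('aaaa' in bf)     # True: a key that was added is always reported as present
print('zyxwvut' in bf)  # very likely False, but false positives are possible
```

A bloom filter never reports an added key as absent, but it can occasionally report a key that was never added as present. A larger buffer lowers that false-positive rate at the cost of more storage.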

The SqliteExtDatabase can also register other useful functions:

  • rank_functions (enabled by default): registers functions for ranking search results, such as bm25 and lucene.
  • hash_functions: registers md5, sha1, sha256, adler32, crc32 and murmurhash functions.
  • regexp_function: registers a regexp function.

Examples:

    def create_new_user(username, password):
        # DO NOT DO THIS IN REAL LIFE. PLEASE.
        query = User.insert({'username': username, 'password': fn.sha1(password)})
        new_user_id = query.execute()
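
The warning in that example exists because a single unsalted SHA-1 pass is fast to brute-force. As a hedged sketch of one common alternative, the snippet below derives a salted key with PBKDF2 from Python's standard hashlib module. The function names hash_password and verify_password are hypothetical, and the iteration count is an illustrative assumption rather than a recommendation.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=600_000):
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password
    key = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, iterations)
    return salt, key


def verify_password(password, salt, expected_key, iterations=600_000):
    _, key = hash_password(password, salt, iterations)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(key, expected_key)


salt, key = hash_password('correct horse')
print(verify_password('correct horse', salt, key))  # True
```

Both the salt and the derived key would need to be stored alongside the username so that a later login attempt can be checked.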

You can use the murmurhash function to hash bytes to an integer for compact storage:

    >>> db = SqliteExtDatabase(':memory:', hash_functions=True)
    >>> db.execute_sql('SELECT murmurhash(?)', ('abcdefg',)).fetchone()
    (4188131059,)
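
murmurhash is not part of Python's standard library, so the sketch below uses zlib.crc32 as a stand-in (crc32 is also among the hash_functions listed above) to show the same idea: storing a 32-bit integer in place of the full text. The table name, column name, and the helper _crc32 are hypothetical, and the function is registered by hand with the standard sqlite3 module rather than through SqliteExtDatabase.

```python
import sqlite3
import zlib


def _crc32(value):
    if isinstance(value, str):
        value = value.encode('utf-8')
    return zlib.crc32(value)  # unsigned 32-bit integer in Python 3


conn = sqlite3.connect(':memory:')
conn.create_function('crc32', 1, _crc32)
conn.execute('CREATE TABLE "seen" ("key_hash" INTEGER)')

# Store only the integer hash of the key, not the key itself.
conn.execute('INSERT INTO "seen" ("key_hash") VALUES (crc32(?))', ('abcdefg',))

stored, = conn.execute('SELECT "key_hash" FROM "seen"').fetchone()
found = conn.execute(
    'SELECT COUNT(*) FROM "seen" WHERE "key_hash" = crc32(?)',
    ('abcdefg',)).fetchone()[0]
print(0 <= stored < 2 ** 32, found)  # True 1
```

Distinct keys can collide on the same 32-bit value, so a matching hash is a hint rather than proof that the original key was stored.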