Database

The following covers available content storage configuration options.

content

  content: boolean|sqlite|duckdb|client|url|custom

Enables content storage. When set to true, the default storage engine, sqlite, is used to save metadata alongside embeddings vectors.

Client-server connections are supported with either client or a full connection URL. When set to client, the CLIENT_URL environment variable must be set to the full connection URL. See the SQLAlchemy documentation for more information on how to construct connection strings for client-server databases.
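As a rough illustration, the options described so far might be written as Python configuration dictionaries like the ones below. The PostgreSQL connection URL is a made-up placeholder in SQLAlchemy form, and the step of handing the dictionary to txtai is left out so the sketch runs on its own.

```python
import os

# Default content storage: true selects the sqlite engine
local_config = {"content": True}

# Client-server storage, option 1: pass the full connection URL directly
# (hypothetical PostgreSQL URL in SQLAlchemy format)
url_config = {"content": "postgresql+psycopg2://user:password@localhost:5432/txtai"}

# Option 2: set content to "client" and supply the URL through CLIENT_URL
os.environ["CLIENT_URL"] = "postgresql+psycopg2://user:password@localhost:5432/txtai"
client_config = {"content": "client"}
```

Keeping the URL in CLIENT_URL lets credentials stay out of the configuration itself.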

To add a custom storage engine, set this parameter to the engine's fully resolvable class string.
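A fully resolvable class string is a dotted path of the form package.module.ClassName. The sketch below shows one common way such a string can be turned into a class with importlib; the resolve helper and the mypackage.storage.CustomDatabase name are hypothetical, and txtai's own resolution logic may differ.

```python
import importlib

def resolve(path):
    """Resolve a 'package.module.ClassName' string to the class it names."""
    module, _, name = path.rpartition(".")
    return getattr(importlib.import_module(module), name)

# Standard library class used as a stand-in for a custom storage engine
cls = resolve("collections.OrderedDict")
print(cls)

# Hypothetical custom engine referenced from configuration
config = {"content": "mypackage.storage.CustomDatabase"}
```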

Settings specific to a content storage engine are set with a configuration object that has the same name as the engine (e.g. duckdb or sqlite). None of these settings are required; any that are omitted fall back to their defaults.

sqlite

  sqlite:
      wal: enable write-ahead logging - allows concurrent read/write operations,
           defaults to false
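A minimal sketch of this setting as a Python dictionary follows, together with what write-ahead logging means at the SQLite level. The second half uses only the standard library sqlite3 module on a temporary file; it illustrates the underlying SQLite feature and is not necessarily how txtai applies the option internally.

```python
import os
import sqlite3
import tempfile

# Configuration: sqlite content storage with write-ahead logging enabled
config = {"content": "sqlite", "sqlite": {"wal": True}}

# WAL at the SQLite level: applies to file-based databases, not in-memory ones
with tempfile.TemporaryDirectory() as directory:
    connection = sqlite3.connect(os.path.join(directory, "content.db"))
    # The pragma returns the journal mode that is now in effect
    mode = connection.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    connection.close()

print(mode)
```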

objects

  objects: boolean|image|pickle

Enables object storage. Supports storing binary content alongside embeddings vectors and metadata. Requires content storage to also be enabled.

Object encoding options are:

  • standard: Default encoder, used when objects is set to true. Encodes and decodes objects as byte arrays.
  • image: Image encoder. Encodes and decodes objects as image objects.
  • pickle: Pickle encoder. Encodes and decodes objects with the pickle module. Supports arbitrary objects.
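To make the difference between the encoders concrete, the sketch below pairs an example configuration with a plain pickle round trip from the standard library. The record contents are invented for illustration, and the round trip only shows the kind of encoding the pickle option relies on, not txtai's storage code.

```python
import pickle

# Object storage requires content storage to be enabled as well
config = {"content": True, "objects": "pickle"}

# The standard encoder works with raw bytes; pickle accepts arbitrary Python objects
record = {"title": "report", "pages": [1, 2, 3]}
encoded = pickle.dumps(record)   # object -> bytes for storage
decoded = pickle.loads(encoded)  # bytes -> equivalent object on retrieval
```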

functions

  functions: list

List of user-defined SQL functions. Only used when content is enabled. Each list element must be one of the following:

  • function
  • callable object
  • dict with fields for name, argcount and function
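The three element forms can be sketched as follows. The function names, the Shout class, and the normalize helper are all hypothetical, and registration is shown with the standard library sqlite3 module only to demonstrate how each form could map onto a SQL function; txtai's own naming and registration rules may differ.

```python
import sqlite3

def clean(text):
    """Plain function: strips surrounding whitespace."""
    return text.strip()

class Shout:
    """Callable object: upper-cases its input."""
    def __call__(self, text):
        return text.upper()

# One element of each accepted form
functions = [
    clean,
    Shout(),
    {"name": "textlength", "argcount": 1, "function": len},
]

config = {"content": True, "functions": functions}

def normalize(element):
    """Hypothetical helper: reduce each form to (name, argcount, function)."""
    if isinstance(element, dict):
        return element["name"], element["argcount"], element["function"]
    # Use the function name, or the lower-cased class name for callable objects
    name = getattr(element, "__name__", type(element).__name__.lower())
    return name, -1, element  # -1 lets SQLite accept any number of arguments

connection = sqlite3.connect(":memory:")
for element in functions:
    connection.create_function(*normalize(element))

row = connection.execute(
    "SELECT clean('  hi  '), shout('abc'), textlength('four')"
).fetchone()
print(row)
```

The dict form is the most explicit of the three, since it states the SQL name and argument count instead of leaving them to be inferred.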

An example can be found here.

query

  query:
      path: sets the path for the query model - this can be any model on the
            Hugging Face Model Hub or a local file path.
      prefix: text prefix to prepend to all inputs
      maxlength: maximum generated sequence length

Query translation model. Translates natural language queries into txtai-compatible SQL statements.
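As a sketch, a query translation configuration might look like the dictionary below. The model path, prefix, and maxlength value are assumptions chosen for illustration; substitute whichever Hugging Face Model Hub model or local path applies.

```python
config = {
    "content": True,
    "query": {
        # Assumed model path and prefix for a text-to-SQL model
        "path": "NeuML/t5-small-txtsql",
        "prefix": "translate English to SQL: ",
        # Illustrative limit on the generated SQL sequence length
        "maxlength": 512,
    },
}
```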