External Dictionaries

You can add your own dictionaries from various data sources. The data source for a dictionary can be a local text or executable file, an HTTP(S) resource, or another DBMS. For more information, see “Sources for external dictionaries”.

ClickHouse:

  • Fully or partially stores dictionaries in RAM.
  • Periodically updates dictionaries and dynamically loads missing values.
  • Allows creating external dictionaries with XML files or DDL queries.

The configuration of external dictionaries can be stored in one or more XML files. The path to the configuration is specified in the dictionaries_config parameter of the server configuration.
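
As an illustration, the fragment below sketches how dictionaries_config might appear in the server configuration file. The wildcard pattern is only an example value; the pattern and the locations of the matching files are assumptions to adapt to your own installation.

```xml
<yandex>
    <!-- Path to the dictionary configuration files.
         It can be absolute or relative to the server configuration file,
         and it may contain the wildcards * and ?.
         The pattern below is an example value, not a requirement. -->
    <dictionaries_config>*_dictionary.xml</dictionaries_config>
</yandex>
```

With a pattern like this, every file in the configuration directory whose name ends in _dictionary.xml would be read as a dictionary configuration.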

Dictionaries can be loaded at server startup or at first use, depending on the dictionaries_lazy_load setting.
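
A minimal sketch of this setting in the server configuration file, assuming you want each dictionary to be loaded on first use rather than at startup:

```xml
<yandex>
    <!-- true:  a dictionary is loaded the first time it is used.
         false: all dictionaries are loaded when the server starts. -->
    <dictionaries_lazy_load>true</dictionaries_lazy_load>
</yandex>
```

Loading at startup surfaces configuration errors early, while lazy loading shortens startup when many dictionaries are configured; which trade-off is preferable depends on the deployment.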

The dictionaries system table contains information about the dictionaries configured on the server. For each dictionary, you can find:

  • Status of the dictionary.
  • Configuration parameters.
  • Metrics such as the amount of RAM allocated for the dictionary or the number of queries since the dictionary was successfully loaded.

The dictionary configuration file has the following format:

    <yandex>
        <comment>An optional element with any content. Ignored by the ClickHouse server.</comment>

        <!-- Optional element. File name with substitutions. -->
        <include_from>/etc/metrika.xml</include_from>

        <dictionary>
            <!-- Dictionary configuration. -->
            <!-- There can be any number of <dictionary> sections in the configuration file. -->
        </dictionary>
    </yandex>

You can configure any number of dictionaries in the same file.
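
To make the format more concrete, here is a hedged sketch of what a single filled-in <dictionary> section could look like. The dictionary name, file path, key, and attribute are hypothetical and chosen only for illustration; the source, layout, structure, and lifetime elements are described in their own sections of the documentation.

```xml
<yandex>
    <dictionary>
        <!-- Hypothetical dictionary name. -->
        <name>example_regions</name>

        <!-- Data source: a local tab-separated file (hypothetical path). -->
        <source>
            <file>
                <path>/opt/dictionaries/regions.tsv</path>
                <format>TabSeparated</format>
            </file>
        </source>

        <!-- Store the whole dictionary in RAM as a flat array. -->
        <layout>
            <flat/>
        </layout>

        <!-- Numeric key plus one string attribute (hypothetical names). -->
        <structure>
            <id>
                <name>region_id</name>
            </id>
            <attribute>
                <name>region_name</name>
                <type>String</type>
                <null_value></null_value>
            </attribute>
        </structure>

        <!-- Update interval in seconds. -->
        <lifetime>300</lifetime>
    </dictionary>
</yandex>
```

Additional dictionaries would be added as further <dictionary> sections next to this one in the same file.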

DDL queries for dictionaries do not require any additional records in the server configuration. They let you work with dictionaries as first-class entities, like tables or views.

Attention

You can convert values for a small dictionary by describing it in a SELECT query (see the transform function). This functionality is not related to external dictionaries.
