Section “Common” in configuration

lemmatizer_base

Lemmatizer dictionaries base path. Optional, default is /usr/share/manticore.

Our lemmatizer implementation (see Morphology for a discussion of what lemmatizers are) is dictionary-driven. The lemmatizer_base directive sets the base dictionary path. File names are hardcoded and specific to a given lemmatizer; for example, the Russian lemmatizer uses the ru.pak dictionary file. The dictionaries can be obtained from the Manticore website (https://manticoresearch.com/install/#other-downloads).

Example:

  lemmatizer_base = /usr/share/manticore/

progressive_merge

Merge real-time table disk chunks from the smallest to the largest. This makes merging faster and lowers read/write amplification. Enabled by default. If disabled, chunks are merged in creation order, from first to last.
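
The effect of merge order on write amplification can be illustrated with a toy cost model. This is only a sketch under simplifying assumptions, not Manticore's actual merge algorithm: chunks are folded one after another into a single growing chunk, and the cost of each step is taken to be the size of the chunk it writes. The function name merge_cost and the chunk sizes are hypothetical.

```python
def merge_cost(chunk_sizes):
    """Total data written when chunks are merged one by one in the given order.

    Toy model: each merge rewrites the whole accumulated chunk plus the
    chunk being added, so large chunks merged early get rewritten many times.
    """
    sizes = list(chunk_sizes)
    if len(sizes) < 2:
        return 0  # nothing to merge
    merged = sizes[0]
    written = 0
    for size in sizes[1:]:
        merged += size      # size of the chunk produced by this merge
        written += merged   # the merge writes the whole new chunk
    return written


# Hypothetical chunk sizes (arbitrary units), listed in creation order.
creation_order = [100, 5, 50, 10]

sequential = merge_cost(creation_order)           # first to last created
progressive = merge_cost(sorted(creation_order))  # smallest to largest

print(sequential, progressive)  # 425 245
```

In this model the large first chunk is rewritten on every step when merging in creation order, while merging from the smallest up postpones touching it until the final step.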

json_autoconv_keynames

Whether and how to auto-convert key names within JSON attributes. Known value is ‘lowercase’. Optional, default value is unspecified (do not convert anything).

When this directive is set to ‘lowercase’, key names within JSON attributes will be automatically brought to lower case when indexing. This conversion applies to any data source, that is, JSON attributes originating from either SQL or XMLpipe2 sources will all be affected.

Example:

  json_autoconv_keynames = lowercase
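
To make the effect of ‘lowercase’ concrete, here is a minimal Python sketch of the transformation applied to a parsed JSON document. It only illustrates the idea and is not Manticore's code; the helper name lowercase_keys is hypothetical, and treating keys of nested objects the same way as top-level keys is an assumption.

```python
def lowercase_keys(value):
    """Return a copy of a parsed JSON value with all object keys lowercased.

    Assumption: nested objects (including objects inside arrays) are
    converted as well. Values themselves are left untouched.
    """
    if isinstance(value, dict):
        return {key.lower(): lowercase_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [lowercase_keys(item) for item in value]
    return value


doc = {"Name": "Bob", "Address": {"City": "NYC"}}
print(lowercase_keys(doc))
```

Only key names change; string values such as "Bob" keep their original case.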

json_autoconv_numbers

Automatically detect JSON strings that represent numbers and convert them into numeric attributes. Optional, default value is 0 (do not convert strings into numbers).

When this option is 1, values such as “1234” will be indexed as numbers instead of strings; if the option is 0, such values will be indexed as strings. This conversion applies to any data source, that is, JSON attributes originating from either SQL or XMLpipe2 sources will all be affected.

Example:

  json_autoconv_numbers = 1
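
The difference between the two settings can be sketched in Python as follows. This is an illustration only: exactly which strings Manticore recognizes as numbers is not specified here, so the sketch assumes plain decimal integers and simple decimal fractions. The helper name autoconv_numbers is hypothetical.

```python
import json
import re

# Assumption: only plain decimal integers and simple decimals count as numbers.
_INT = re.compile(r"-?[0-9]+")
_FLOAT = re.compile(r"-?[0-9]+\.[0-9]+")


def autoconv_numbers(value):
    """Return a copy of a parsed JSON value with numeric-looking strings converted."""
    if isinstance(value, dict):
        return {key: autoconv_numbers(item) for key, item in value.items()}
    if isinstance(value, list):
        return [autoconv_numbers(item) for item in value]
    if isinstance(value, str):
        if _INT.fullmatch(value):
            return int(value)
        if _FLOAT.fullmatch(value):
            return float(value)
    return value  # non-numeric strings and other types stay as they are


raw = json.loads('{"price": "1234", "name": "abc", "tags": ["7", "x"]}')
print(autoconv_numbers(raw))
```

With the option set to 0, the document would be kept as parsed, so "1234" stays a string.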

on_json_attr_error

What to do if JSON format errors are found. Optional, default value is ignore_attr (ignore errors). Applies only to sql_attr_json attributes.

By default, JSON format errors are ignored (ignore_attr) and the indexer tool only shows a warning. Setting this option to fail_index instead makes indexing fail at the first JSON format error.

Example:

  on_json_attr_error = ignore_attr

plugin_dir

Trusted location for the dynamic libraries (UDFs). Optional, default is /usr/local/lib/manticore/.

Specifies the trusted directory from which the UDF libraries can be loaded.

Example:

  plugin_dir = /usr/local/lib/manticore/

Special suffixes

Manticore Search supports special suffixes that make it easier to use numeric values with a special meaning. The common form is an integer followed by a literal, such as 10k or 100d, but not 40.3s (since 40.3 is not an integer) and not 2d 4h (since that is two values, not one). Literals are case-insensitive, so 10W is the same as 10w. Two types of suffixes are currently supported:

  • Size suffixes: can be used in settings that define the size of something: a memory buffer, a disk file size, a RAM limit, etc. If you don’t specify a suffix, the value is interpreted as bytes. The suffixes can be:
    • k for kilobytes (1k=1024)
    • m for megabytes (1m=1024k)
    • g for gigabytes (1g=1024m)
    • t for terabytes (1t=1024g)
  • Time suffixes: can be used in settings that define a time interval: delays, timeouts, etc. “Naked” values for those parameters usually have a documented scale; for example, for some settings 100 means 100 seconds, for others 100 milliseconds. Instead of guessing, you can use an explicit suffix. The suffixes can be:
    • us for microseconds
    • ms for milliseconds
    • s for seconds
    • m for minutes
    • h for hours
    • d for days
    • w for weeks.
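
The rules above can be sketched as a small Python parser. This is an illustration, not Manticore's own parsing code; the names parse_size and parse_time_us are hypothetical. Because the scale of a "naked" time value depends on the individual setting, this sketch deliberately rejects time values without a suffix. Note that m means megabytes for sizes and minutes for times, so the two kinds are parsed separately.

```python
import re

# Size multipliers in bytes; an empty suffix means plain bytes.
SIZE_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}

# Time multipliers in microseconds. No empty suffix: the scale of a naked
# value depends on the setting, so this sketch requires an explicit one.
TIME_UNITS = {
    "us": 1,
    "ms": 1000,
    "s": 10**6,
    "m": 60 * 10**6,
    "h": 3600 * 10**6,
    "d": 86400 * 10**6,
    "w": 7 * 86400 * 10**6,
}

# One integer followed by an optional literal, nothing else.
_VALUE = re.compile(r"([0-9]+)([a-z]*)")


def _parse(text, units):
    match = _VALUE.fullmatch(text.strip().lower())  # literals are case-insensitive
    if match is None or match.group(2) not in units:
        raise ValueError(f"not an integer with a known suffix: {text!r}")
    return int(match.group(1)) * units[match.group(2)]


def parse_size(text):
    """Return the value in bytes."""
    return _parse(text, SIZE_UNITS)


def parse_time_us(text):
    """Return the value in microseconds."""
    return _parse(text, TIME_UNITS)


print(parse_size("10k"))    # 10240
print(parse_time_us("5m"))  # 300000000
```

Values such as 40.3s or 2d 4h raise ValueError here, matching the restriction to a single integer with a single literal.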

Scripted configuration

Manticore configuration supports shebang syntax, meaning that the configuration can be written in a programming language and interpreted at loading, allowing dynamic settings.

For example, tables can be generated by querying a database table, various settings can be modified depending on external factors, or external files (which contain tables and/or sources) can be included.

The configuration file is parsed by the declared interpreter and the output is used as the actual configuration. This happens each time the configuration is read (not only at searchd startup).

This facility is not available on the Windows platform.

In the following example, we use PHP to create multiple tables with different names, and we also scan a specific folder for files containing extra table declarations.

  #!/usr/bin/php
  ...
  <?php
  // Generate six real-time tables, test_1 through test_6. PHP drops a newline
  // that directly follows a closing tag, so the path line emits its own "\n".
  for ($i = 1; $i <= 6; $i++) { ?>
  table test_<?=$i?> {
      type = rt
      path = /var/lib/manticore/data/test_<?= $i . "\n" ?>
      rt_field = subject
      ...
  }
  <?php } ?>
  ...
  <?php
  // Include every *.conf file found in the extra configuration folder.
  $confd_folder = '/etc/manticore.conf.d/';
  $files = scandir($confd_folder);
  foreach ($files as $file) {
      // Skip the current and parent directory entries.
      if ($file == '.' || $file == '..') {
          continue;
      }
      $fp = new SplFileInfo($confd_folder . $file);
      if ('conf' == $fp->getExtension()) {
          include($confd_folder . $file);
      }
  }
  ?>
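
Since the interpreter is taken from the shebang line, the same idea can be sketched in other languages. The following is a minimal Python illustration of the table-generating part of the PHP example, assuming a Python 3 interpreter at /usr/bin/python3. The helper name render_tables is hypothetical, and the settings elided with "..." in the PHP example are not filled in.

```python
#!/usr/bin/python3
# Sketch: write Manticore table declarations to standard output, which the
# configuration loader then uses as the actual configuration.
# Assumptions: the interpreter path above and the helper name render_tables.


def render_tables(count, data_dir="/var/lib/manticore/data"):
    """Return the text of `count` real-time table declarations."""
    blocks = []
    for i in range(1, count + 1):
        blocks.append(
            f"table test_{i} {{\n"
            f"    type = rt\n"
            f"    path = {data_dir}/test_{i}\n"
            f"    rt_field = subject\n"
            f"}}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(render_tables(6), end="")
```

Building the text in a function and printing it once keeps the generated configuration easy to inspect: running the script by hand shows exactly what would be read as the configuration.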