Configuring TimescaleDB

TimescaleDB works with the default PostgreSQL server configuration settings. However, we find that these settings are typically too conservative and can be limiting when using larger servers with more resources (CPU, memory, disk, etc.). Adjusting these settings, either automatically with our tool timescaledb-tune or by manually editing your machine's postgresql.conf, can improve performance.

TIP: You can determine the location of postgresql.conf by running SHOW config_file; from your PostgreSQL client (e.g., psql).
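
For example, in psql the output looks something like this (the path shown is simply the one used in the timescaledb-tune example below; yours will likely differ):

  postgres=# SHOW config_file;
                config_file
  -----------------------------------------
   /usr/local/var/postgres/postgresql.conf
  (1 row)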

Recommended: timescaledb-tune

To streamline the configuration process, we've created a tool called timescaledb-tune that handles setting the most common parameters to good values based on your system, accounting for memory, CPU, and PostgreSQL version. timescaledb-tune is packaged along with our binary releases as a dependency, so if you installed one of our binary releases (including Docker), you should have access to the tool. Alternatively, if you have a standard Go environment, you can also install it from the source repository with go get.
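
For example, assuming the tool still lives at github.com/timescale/timescaledb-tune (check that repository's README for the current install instructions), a Go-based install looks roughly like:

  $ go get github.com/timescale/timescaledb-tune/cmd/timescaledb-tune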

timescaledb-tune reads your system’s postgresql.conf file and offers interactive suggestions for updating your settings:

  Using postgresql.conf at this path:
  /usr/local/var/postgres/postgresql.conf
  Is this correct? [(y)es/(n)o]: y
  Writing backup to:
  /var/folders/cr/zpgdkv194vz1g5smxl_5tggm0000gn/T/timescaledb_tune.backup201901071520
  shared_preload_libraries needs to be updated
  Current:
  #shared_preload_libraries = 'timescaledb'
  Recommended:
  shared_preload_libraries = 'timescaledb'
  Is this okay? [(y)es/(n)o]: y
  success: shared_preload_libraries will be updated
  Tune memory/parallelism/WAL and other settings? [(y)es/(n)o]: y
  Recommendations based on 8.00 GB of available memory and 4 CPUs for PostgreSQL 11
  Memory settings recommendations
  Current:
  shared_buffers = 128MB
  #effective_cache_size = 4GB
  #maintenance_work_mem = 64MB
  #work_mem = 4MB
  Recommended:
  shared_buffers = 2GB
  effective_cache_size = 6GB
  maintenance_work_mem = 1GB
  work_mem = 26214kB
  Is this okay? [(y)es/(s)kip/(q)uit]:

These changes are then written to your postgresql.conf and will take effect on the next (re)start. If you are starting on a fresh instance and don't feel the need to approve each group of changes, you can also automatically accept and append the suggestions to the end of your postgresql.conf like so:

  $ timescaledb-tune --quiet --yes --dry-run >> /path/to/postgresql.conf

Postgres configuration and tuning

If you prefer to tune the settings yourself, or are curious about the suggestions that timescaledb-tune comes up with, we elaborate on them here. Additionally, timescaledb-tune does not cover every setting you may need to adjust; the remaining ones are covered below as well.

Memory settings

TIP: All of these settings are handled by timescaledb-tune.

The settings shared_buffers, effective_cache_size, work_mem, and maintenance_work_mem need to be adjusted to match the machine's available memory. We suggest getting the configuration values from the PgTune website (suggested DB Type: Data warehouse). You should also adjust the max_connections setting to match the one given by PgTune, since max_connections and the memory settings are interdependent. Other settings from PgTune may also be helpful.
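
As a rough illustration only, the recommendations shown in the timescaledb-tune example above for an 8 GB, 4 CPU machine correspond to postgresql.conf entries like these (take your own values from PgTune or timescaledb-tune for your hardware):

  shared_buffers = 2GB
  effective_cache_size = 6GB
  maintenance_work_mem = 1GB
  work_mem = 26214kB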

Worker settings

TIP: All of these settings are handled by timescaledb-tune.

PostgreSQL utilizes worker pools to provide the workers needed to support both live queries and background jobs. If you do not configure these settings, you may observe performance degradation on both queries and background jobs.

TimescaleDB background workers are configured using the timescaledb.max_background_workers setting. You should configure this setting to the sum of your total number of databases and the total number of concurrent background workers you want running at any given point in time. You need a background worker allocated to each database to run a lightweight scheduler that schedules jobs. On top of that, any additional workers you allocate here will run background jobs when needed.

For larger queries, PostgreSQL automatically uses parallel workers if they are available. To configure this, use the max_parallel_workers setting. Increasing this setting improves query performance for larger queries (smaller queries may not trigger parallel workers). By default, this setting corresponds to the number of CPUs available. To change it, use the --cpus flag or the TS_TUNE_NUM_CPUS Docker environment variable.

Finally, you must configure max_worker_processes to be at least the sum of timescaledb.max_background_workers and max_parallel_workers. max_worker_processes is the total pool of workers available to both background and parallel workers (as well as a handful of built-in PostgreSQL workers).

By default, timescaledb-tune sets timescaledb.max_background_workers to 8. To change this setting, use the --max-bg-workers flag or the TS_TUNE_MAX_BG_WORKERS Docker environment variable. The max_worker_processes setting is adjusted accordingly.
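
Putting these rules together, a 4-CPU machine using the default of 8 TimescaleDB background workers might end up with settings along the following lines (a sketch only; the exact max_worker_processes value timescaledb-tune picks may include a few additional slots for PostgreSQL's built-in workers):

  timescaledb.max_background_workers = 8
  max_parallel_workers = 4             # defaults to the number of CPUs
  max_worker_processes = 15            # at least 8 + 4, with headroom for built-in workers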

Disk-write settings

In order to increase write throughput, there are multiple settings to adjust the behavior that PostgreSQL uses to write data to disk. We find the performance to be good with the default (safest) settings. If you want a bit of additional performance, you can set synchronous_commit = 'off' (PostgreSQL docs). Please note that when disabling synchronous_commit in this way, an operating system or database crash might result in some recent allegedly-committed transactions being lost. We actively discourage changing the fsync setting.
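
If you do decide to make that trade-off, the change is a single line in postgresql.conf (shown only to illustrate the setting discussed above):

  synchronous_commit = off    # faster commits; a crash may lose recent transactions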

Lock settings

TimescaleDB relies heavily on table partitioning for scaling time-series workloads, which has implications for lock management. A hypertable needs to acquire locks on many chunks (sub-tables) during queries, which can exhaust the default limits for the number of allowed locks held. This might result in a warning like the following:

  psql: FATAL: out of shared memory
  HINT: You might need to increase max_locks_per_transaction.

To avoid this issue, it is necessary to increase the max_locks_per_transaction setting from the default value (which is typically 64). Since changing this parameter requires a database restart, it is advisable to estimate a good setting that also allows some growth. For most use cases we recommend the following setting:

  max_locks_per_transaction = 2 * num_chunks

where num_chunks is the maximum number of chunks you expect to have in a hypertable. This setting takes into account that the number of locks taken by a hypertable query is roughly equal to the number of chunks in the hypertable, or double that number if the query also uses an index. You can see how many chunks you currently have using the chunk_relation_size_pretty command. Also note that max_locks_per_transaction is not an exact setting; it only controls the average number of object locks allocated for each transaction. For more information, please review the official PostgreSQL documentation on lock management.
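
As a worked example, suppose you expect a hypertable (named conditions here purely for illustration) to grow to around 150 chunks. You can count its current chunks with the chunk_relation_size_pretty command and then size the lock setting with room to grow:

  -- chunk_relation_size_pretty returns one row per chunk
  SELECT count(*) FROM chunk_relation_size_pretty('conditions');

  -- then, in postgresql.conf (2 * 150 expected chunks):
  -- max_locks_per_transaction = 300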

TimescaleDB configuration and tuning

Just as you can tune settings in PostgreSQL, TimescaleDB provides a number of configuration settings that may be useful to your specific installation and performance needs. These can also be set within the postgresql.conf file or as command-line parameters when starting PostgreSQL.
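
For example, a setting such as timescaledb.max_background_workers can be changed either way (the value 16 below is purely illustrative):

  # in postgresql.conf
  timescaledb.max_background_workers = 16

  # or as a command-line parameter when starting PostgreSQL
  postgres -c timescaledb.max_background_workers=16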

Policies

timescaledb.max_background_workers (int)

Max background worker processes allocated to TimescaleDB. Set to at least 1 + the number of databases in the Postgres instance to use background workers. Default value is 8.

Distributed Hypertables

timescaledb.enable_2pc (bool)

Enables two-phase commit for distributed hypertables. If disabled, it will use a one-phase commit instead, which is faster but can result in inconsistent data. It is by default enabled.

timescaledb.enable_per_data_node_queries (bool)

If enabled, TimescaleDB will combine different chunks belonging to the same hypertable into a single query per data node. It is by default enabled.

timescaledb.max_insert_batch_size (int)

When acting as an access node, TimescaleDB splits batches of inserted tuples across multiple data nodes. It batches up to max_insert_batch_size tuples per data node before flushing. Setting this to 0 disables batching, reverting to tuple-by-tuple inserts. The default value is 1000.

timescaledb.enable_connection_binary_data (bool)

Enables binary format for data exchanged between nodes in the cluster. It is by default enabled.

timescaledb.enable_client_ddl_on_data_nodes (bool)

Enables DDL operations on data nodes by a client, rather than restricting DDL operations to being executed only through the access node. It is by default disabled.

timescaledb.enable_async_append (bool)

Enables optimization that runs remote queries asynchronously across data nodes. It is by default enabled.

timescaledb.enable_remote_explain (bool)

Enables getting and showing EXPLAIN output from remote nodes. This requires sending the query to the data node, so it can be affected by the network connection and the availability of data nodes. It is by default disabled.

timescaledb.remote_data_fetcher (enum)

Selects the data fetcher type based on the type of queries you plan to run; it can be either rowbyrow or cursor. The default is rowbyrow.

timescaledb.ssl_dir (string)

Specifies the path used to search for user certificates and keys when connecting to data nodes using certificate authentication. Defaults to timescaledb/certs under the PostgreSQL data directory.

timescaledb.passfile (string)

Specifies the name of the file where passwords are stored, used when connecting to data nodes with password authentication.
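
As a sketch only, an access node configured for larger insert batches and cursor-based fetching, with remote EXPLAIN output enabled, might carry entries like the following in postgresql.conf (illustrative values, not recommendations):

  timescaledb.max_insert_batch_size = 4000
  timescaledb.remote_data_fetcher = 'cursor'
  timescaledb.enable_remote_explain = on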

Administration

timescaledb.restoring (bool)

Sets TimescaleDB in restoring mode. It is by default disabled.

timescaledb.license (string)

TimescaleDB license type. Determines which features are enabled. The variable can be set to timescale or apache. Defaults to timescale.

timescaledb.telemetry_level (enum)

Telemetry level used to determine which telemetry to send. Can be set to off or basic. Defaults to basic.
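
For instance, an installation that wants to stick to the Apache-licensed feature set and disable telemetry could set (values taken from the options listed above):

  timescaledb.license = 'apache'
  timescaledb.telemetry_level = off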

timescaledb.last_tuned (string)

Records last time timescaledb-tune ran.

timescaledb.last_tuned_version (string)

The version of timescaledb-tune used the last time it ran.

Changing configuration with Docker

When running TimescaleDB via a Docker container, there are two approaches to modifying your PostgreSQL configuration. In the following example, we modify the size of the database instance's write-ahead log (WAL) from 1GB to 2GB in a Docker container named timescaledb.

Modifying postgresql.conf inside Docker

  1. Get into a shell in Docker in order to change the configuration on a running container.

     docker start timescaledb
     docker exec -i -t timescaledb /bin/bash

  2. Edit and then save the config file, modifying the setting for the desired configuration parameter (e.g., max_wal_size).

     vi /var/lib/postgresql/data/postgresql.conf

  3. Restart the container so the config gets reloaded.

     docker restart timescaledb

  4. Test to see if the change worked.

     docker exec -it timescaledb psql -U postgres
     postgres=# show max_wal_size;
      max_wal_size
     --------------
      2GB

Specify config parameters as boot options

Alternatively, one or more parameters can be passed to the docker run command via a -c option, as in the following.

  docker run -i -t timescale/timescaledb:latest-pg10 postgres -cmax_wal_size=2GB

Additional examples of passing in arguments at boot can be found in our discussion about using WAL-E for incremental backup.