Import data into TimescaleDB from .csv

If you have data stored in an external .csv file, you can import it into TimescaleDB.

Prerequisites

Before beginning, make sure you have installed and set up TimescaleDB within your PostgreSQL instance.

Import data

Import data from a .csv file.


note

Timescale provides an open source parallel importer program, timescaledb-parallel-copy, to speed up data copying. The program parallelizes the import by using several workers to run multiple COPY operations concurrently, and it provides options for tuning the copy process. If you prefer not to download timescaledb-parallel-copy, you can use regular PostgreSQL COPY instead.

  1. Connect to your database and create a new empty table. Use a schema that matches the data in your .csv file. In this example, the .csv file contains the columns time, location, and temperature; the table below stores the time values in a column named ts.

    CREATE TABLE <TABLE_NAME> (
        ts TIMESTAMPTZ NOT NULL,
        location TEXT NOT NULL,
        temperature DOUBLE PRECISION NULL
    );
  2. Convert the empty table to a hypertable using the create_hypertable function. Replace ts with the name of the column storing time values in your table.

    SELECT create_hypertable('<TABLE_NAME>', 'ts');
  3. At the command line, insert data into the hypertable from your .csv file. Use timescaledb-parallel-copy to speed up the import, adjusting the number of workers as needed. If you prefer not to use the tool, skip to the next step.

    timescaledb-parallel-copy --db-name <DATABASE_NAME> --table <TABLE_NAME> \
        --file <FILENAME>.csv --workers 4 --copy-options "CSV"
  4. Optional: If you don’t want to use `timescaledb-parallel-copy`, insert data into the hypertable by using PostgreSQL’s native `COPY` command. At the command line, run:

    psql -d <DATABASE_NAME> -c "\COPY <TABLE_NAME> FROM '<FILENAME>.csv' CSV"

note
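If you want to try the steps above without an existing dataset, you can generate a small sample file at the command line. The file name sample.csv and the values below are only illustrative; the column layout matches the time, location, and temperature schema used in this example.

```shell
# Write a header row plus three illustrative data rows matching the example schema.
printf 'time,location,temperature\n' > sample.csv
for i in 1 2 3; do
    printf '2024-01-01 00:0%d:00+00,office,21.%d\n' "$i" "$i" >> sample.csv
done
```

You can then point either import method at sample.csv in place of <FILENAME>.csv.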

Don’t set the number of workers for timescaledb-parallel-copy higher than the number of available CPU cores. Beyond that point, workers compete with each other for resources and performance degrades.
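One way to respect this limit is to derive the worker count from the machine itself. nproc is a standard Linux utility that prints the number of available processing units; the snippet below only prints the value it would pass to --workers, with the actual timescaledb-parallel-copy invocation shown commented out using the same placeholder names as above.

```shell
# Use the number of available CPU cores as the worker count.
WORKERS=$(nproc)
echo "Using $WORKERS workers"

# timescaledb-parallel-copy --db-name <DATABASE_NAME> --table <TABLE_NAME> \
#     --file <FILENAME>.csv --workers "$WORKERS" --copy-options "CSV"
```

On macOS, `sysctl -n hw.ncpu` plays the same role as `nproc`.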