PostgreSQL Table Partitioning

This tutorial explains how to use PostgreSQL Table PartitioningTable Partitioning - 图1open in new window with Bun.

Why partition a table?

Table partitioning allows to split one large table into smaller ones bringing the following benefits:

  • Smaller tables are faster both for reading and writing.
  • You can very efficiently drop the whole partition instead of deleting data row by row.
  • Because PostgreSQL knows how to prune unused partitions, you can use partitions as a crude index. For example, by paritioning a table by date, you may not need an index on the date field any more and use a sequential scan instead.
  • Rarely used partitions can be moved to a cheaper storage.

Partitioning methods

Let’s suppose we have a table:

  1. CREATE TABLE measurements (
  2. id int8 NOT NULL,
  3. value float8 NOT NULL,
  4. date timestamptz NOT NULL
  5. );

You can partition that table by providing columns to use as the partition key:

  1. CREATE TABLE measurements (
  2. id int8 NOT NULL,
  3. value float8 NOT NULL,
  4. date timestamptz NOT NULL
  5. ) PARTITION BY RANGE (date);

PostgreSQL supports several partitioning methods which only differ in the way they specify row values for the partition key.

Partition by range

Partitioning by range allows to specify a range of values for the partition, for example, we can store data for each month in a separate partition:

  1. CREATE TABLE measurements_y2021m01 PARTITION OF measurements
  2. FOR VALUES FROM ('2021-01-01') TO ('2021-02-01');

Partition by list

List partitioning allows to specify a list of values for the partition, for example, we can store small fraction of the frequently accessed data in the hot partition and move the rest to the cold partition:

  1. CREATE TABLE measurements (
  2. id int8 PRIMARY KEY,
  3. value float8 NOT NULL,
  4. date timestamptz NOT NULL,
  5. hot boolean
  6. ) PARTITION BY LIST (hot);
  7. CREATE TABLE measurements_hot PARTITION OF measurements
  8. FOR VALUES IN (TRUE);
  9. CREATE TABLE measurements_cold PARTITION OF measurements
  10. FOR VALUES IN (NULL);

You can then move rows between partitions by updating the hot column:

  1. -- Move rows to measurements_hot
  2. UPDATE measurements SET hot = TRUE;
  3. -- Move rows to measurements_cold
  4. UPDATE measurements SET hot = NULL;

Partition by hash

Partitioning by hash allows to uniformly distribute rows into a set of tables, for example, we can create 3 partitions for our table and pick a partition for the row using a hash and a remainder of division:

  1. CREATE TABLE measurements (
  2. id int8 PRIMARY KEY,
  3. value float8 NOT NULL,
  4. date timestamptz NOT NULL
  5. ) PARTITION BY HASH (id);
  6. CREATE TABLE measurements_1 PARTITION OF measurements
  7. FOR VALUES WITH (MODULUS 3, REMAINDER 0);
  8. CREATE TABLE measurements_2 PARTITION OF measurements
  9. FOR VALUES WITH (MODULUS 3, REMAINDER 1);
  10. CREATE TABLE measurements_3 PARTITION OF measurements
  11. FOR VALUES WITH (MODULUS 3, REMAINDER 2);

Thanks to using hashes, the partitions will receive approximately the same amount of rows.

Managing partitions

PostgreSQL allows to detach and attach partitions:

  1. ALTER TABLE measurements DETACH PARTITION measurements_y2021m01;
  2. ALTER TABLE measurements ATTACH PARTITION measurements_y2021m01
  3. FOR VALUES FROM ('2021-01-01') TO ('2021-02-01');

You can use those commands to partition an existing table without moving any data:

  1. -- Use the existing table as a partition for the existing data.
  2. ALTER TABLE measurements RENAME TO measurements_y2021m01;
  3. -- Create the partitioned table.
  4. CREATE TABLE measurements (LIKE measurements_y2021m01 INCLUDING DEFAULTS INCLUDING CONSTRAINTS)
  5. PARTITION BY RANGE (date);
  6. -- Attach the existing partition with open left constraint.
  7. ALTER TABLE measurements ATTACH PARTITION measurements_y2021m01
  8. FOR VALUES FROM ('0001-01-01') TO ('2021-02-01');
  9. -- Use proper constraints for new partitions.
  10. CREATE TABLE measurements_y2021m02 PARTITION OF measurements
  11. FOR VALUES FROM ('2021-02-01') TO ('2021-03-01');

Using partitioned tables with Bun

Bun allows to create partitioned tables:

  1. type Measure struct {
  2. ID int64
  3. Value float64
  4. Date time.Time
  5. Hot bool `bun:",nullzero"`
  6. }
  7. _, err := db.NewCreateTable().
  8. Model((*Measure)(nil)).
  9. PartitionBy("LIST (hot)").
  10. Exec(ctx)

And query partitions directly using ModelTableExpr:

  1. var measures []*Measure
  2. num, err := db.NewSelect(&measures).
  3. ModelTableExpr("measurements_hot").
  4. Count(ctx)

You can even create separate models for partitions:

  1. type MeasureHot struct {
  2. bun.BaseModel `bun:"measures_hot"`
  3. Measure
  4. }
  5. type MeasureCold struct {
  6. bun.BaseModel `bun:"measures_cold"`
  7. Measure
  8. }