SERIAL

The SERIAL pseudo data type is a keyword that canbe used in lieu of a real data type when defining table columns. Itis approximately equivalent to using an integer type witha DEFAULT expression that generates differentvalues every time it is evaluated. This default expression in turnensures that inserts that do not specify this column will receive anautomatically generated value instead of NULL.

Note:

SERIAL is provided only for compatibility with PostgreSQL. New applications should use real data types and a suitable DEFAULT expression.

In most cases, we recommend using the UUID data type with the gen_random_uuid() function as the default value, which generates 128-bit values (larger than SERIAL's maximum of 64 bits) and more uniformly scatters them across all of a table's underlying key-value ranges. UUIDs ensure more effectively that multiple nodes share the insert load when a UUID column is used in an index or primary key.

See this FAQ entry for more details.

Modes of operation

The keyword SERIAL is recognized in CREATE TABLE and isautomatically translated to a real data type and a DEFAULTexpression during table creation.The result of this translation is then used internally by CockroachDB,and can be observed using SHOW CREATE.

The chosen DEFAULT expression ensures that different values areautomatically generated for the column during row insertion. Theseare not guaranteed to increase monotonically, see this sectionbelow for details.

There are three possible translation modes for SERIAL:

ModeDescription
rowid (default)SERIAL implies DEFAULT unique_rowid(). The real data type is always INT.
virtual_sequence (experimental)SERIAL creates a virtual sequence and implies DEFAULT nextval(<seqname>). The real data type is always INT.
sql_sequence (experimental)SERIAL creates a regular SQL sequence and implies DEFAULT nextval(<seqname>). The real data type depends on SERIAL variant.

These modes can be configured with the experimental (unsupported) session variable experimental_serial_normalization.

Note:

The particular choice of DEFAULT expression when clients use theSERIAL keyword is subject to change in future versions ofCockroachDB. Applications that wish to use unique_rowid()specifically must use the full explicit syntax INT DEFAULT
unique_rowid()
and avoid SERIAL altogether.

Moreover, the existence of multiple translation modes for SERIAL isan experimental feature in CockroachDB 2.1 aimed at studyingcompatibility with existing PostgreSQL applications and may be removedin subsequent releases.

Generated values for modes rowid and virtual_sequence

In both modes rowid and virtual_sequence, a value is automaticallygenerated using the unique_rowid() function.This produces a 64-bit integer from the current timestamp and ID ofthe node executing the INSERT or UPSERT operation.This behavior is statistically likely to be globally unique except inextreme cases (see this FAQentryfor more details).

Also, because value generation using unique_rowid() does not requireinter-node coordination, it is much faster than the other modesql_sequence discussed below when multiple SQL clients are writing tothe table from different nodes.

Note:

The difference between rowid and virtual_sequence is that thelatter setting also creates a virtual (pseudo) sequence in thedatabase. However in both cases the unique_rowid() function isultimately used to generate new values.

This behavior of virtual_sequence is experimental and may be removedin a later version of CockroachDB.

Generated values for mode sql_sequence.

In this mode, a regular SQL sequence isautomatically created alongside the table where SERIAL is specified.

The actual data type is determined as follows:

SERIAL variantReal data type
SERIAL2, SMALLSERIALINT2
SERIAL4INT4
SERIALINT
SERIAL8, BIGSERIALINT8

Every insert or upsert into the table will then use nextval() toincrement the sequence and produce increasing values.

Because SQL sequences persist the current sequence value in thedatabase, inter-node coordination is required when multiple clientsuse the sequence concurrently via different nodes. This can causecontention and impactperformance negatively.

Therefore, applications should consider using unique_rowid() orgen_random_uuid() as discussed in this FAQentryinstead of sequences when possible.

Note:

This mode sql_sequence is an experimental feature provided for testing compatibility with existing PostgreSQL clients.

It is subject to change without notice and may be removed in later versions of CockroachDB.

Examples

Use SERIAL to auto-generate primary keys

In this example, we create a table with the SERIAL column as the primary key so we can auto-generate unique IDs on insert.

  1. > CREATE TABLE serial (a SERIAL PRIMARY KEY, b STRING, c BOOL);

The SHOW COLUMNS statement shows that the SERIAL type is just an alias for INT with unique_rowid() as the default.

  1. > SHOW COLUMNS FROM serial;
  1. +-------------+-----------+-------------+----------------+-----------------------+-------------+
  2. | column_name | data_type | is_nullable | column_default | generation_expression | indices |
  3. +-------------+-----------+-------------+----------------+-----------------------+-------------+
  4. | a | INT | false | unique_rowid() | | {"primary"} |
  5. | b | STRING | true | NULL | | {} |
  6. | c | BOOL | true | NULL | | {} |
  7. +-------------+-----------+-------------+----------------+-----------------------+-------------+
  8. (3 rows)

When we insert rows without values in column a and display the new rows, we see that each row has defaulted to a unique value in column a.

  1. > INSERT INTO serial (b,c) VALUES ('red', true), ('yellow', false), ('pink', true);
  1. > INSERT INTO serial (a,b,c) VALUES (123, 'white', false);
  1. > SELECT * FROM serial;
  1. +--------------------+--------+-------+
  2. | a | b | c |
  3. +--------------------+--------+-------+
  4. | 148656994422095873 | red | true |
  5. | 148656994422161409 | yellow | false |
  6. | 148656994422194177 | pink | true |
  7. | 123 | white | false |
  8. +--------------------+--------+-------+

Auto-incrementing is not always sequential

It's a common misconception that the auto-incrementing types in PostgreSQL and MySQL generate strictly sequential values. However, there can be gaps and the order is not completely guaranteed:

  • Each insert increases the sequence by one, even when the insert is not committed. This means that auto-incrementing types may leave gaps in a sequence.
  • Two concurrent transactions can commit in a different order than their use of sequences, and thus "observe" the values to decrease relative to each other. This effect is amplified by automatic transaction retries.
    These are fundamental properties of a transactional system with non-transactional sequences. PostgreSQL, MySQL, and CockroachDB do not increase sequences transactionally with other SQL statements, so these effects can happen in any case.

To experience this for yourself, run through the following example in PostgreSQL:

  • Create a table with a SERIAL column:
  1. > CREATE TABLE increment (a SERIAL PRIMARY KEY);
  • Run four transactions for inserting rows:
  1. > BEGIN; INSERT INTO increment DEFAULT VALUES; ROLLBACK;
  1. > BEGIN; INSERT INTO increment DEFAULT VALUES; COMMIT;
  1. > BEGIN; INSERT INTO increment DEFAULT VALUES; ROLLBACK;
  1. > BEGIN; INSERT INTO increment DEFAULT VALUES; COMMIT;
  • View the rows created:
  1. > SELECT * from increment;
  1. +---+
  2. | a |
  3. +---+
  4. | 2 |
  5. | 4 |
  6. +---+

Since each insert increased the sequence in column a by one, the first committed insert got the value 2, and the second committed insert got the value 4. As you can see, the values aren't strictly sequential, and the last value doesn't give an accurate count of rows in the table.

In summary, the SERIAL type in PostgreSQL and CockroachDB, and the AUTO_INCREMENT type in MySQL, all behave the same in that they do not create strict sequences. CockroachDB will likely create more gaps than these other databases, but will generate these values much faster. An alternative feature, introduced in v2.0, is the SEQUENCE.

Additional examples

If two transactions occur concurrently, CockroachDB cannot guarantee monotonically increasing IDs (i.e., first commit is smaller than second commit). Here are three more scenarios that demonstrate this:

Scenario 1:

  • At time 1, transaction T1 BEGINs.
  • At time 2, transaction T2 BEGINs on the same node (from a different client).
  • At time 3, transaction T1 creates a SERIAL value, x.
  • At time 3 + 2 microseconds, transaction T2 creates a SERIAL value, y.
  • At time 4, transaction T1 COMMITs.
  • At time 5, transaction T2 COMMITs.
    If this happens, CockroachDB cannot guarantee whether x < y or x > y, despite the fact T1 and T2 began and were committed in different times. In this particular example, it's even likely that x = y because there is less than a 10-microsecond difference and the SERIAL values are constructed from the number of microseconds in the current time.

Scenario 2:

  • At time 1, transaction T1 BEGINs.
  • At time 1, transaction T2 BEGINs somewhere else, on a different node.
  • At time 2, transaction T1 creates a SERIAL value, x.
  • At time 3, transaction T2 creates a SERIAL value, y.
  • At time 4, transaction T1 COMMITs.
  • At time 4, transaction T2 COMMITs.
    If this happens, CockroachDB cannot guarantee whether x < y or x > y. Both can happen, even though the transactions began and committed at the same time. However it's sure that x != y because the values were generated on different nodes.

Scenario 3:

  • At time 1, transaction T1 BEGINs.
  • At time 2, transaction T1 creates a SERIAL value, x.
  • At time 3, transaction T1 COMMITs.
  • At time 4, transaction T2 BEGINs somewhere else, on a different node.
  • At time 5, transaction T2 creates a SERIAL value, y.
  • At time 6, transaction T2 COMMITs.
    There is less than a 250-microsecond difference between the system clocks of the two nodes.

If this happens, CockroachDB cannot guarantee whether x < y or x > y. Even though the transactions "clearly" occurred one "after" the other, perhaps there was a clock skew between the two nodes and the system time of the second node is set earlier than the first node.

See also

Was this page helpful?
YesNo