Glossary

This glossary defines key terms used in the CrateDB reference manual.

Terms

B

Binary operator

See operator.

C

CLUSTERED BY column

See routing column.

E

Evaluation

See expression.

Expression

Any valid SQL that produces a value (e.g., column references, comparison operators, and functions) through a process known as evaluation.

Contrary to a statement.
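As a minimal sketch of the distinction, the single statement below contains three expressions, each of which is evaluated to a value. The column aliases are illustrative only.

```sql
-- One SELECT statement containing three expressions.
SELECT
  1 + 2 AS arithmetic,               -- arithmetic expression, evaluates to 3
  upper('crate') AS function_call,   -- function call expression
  3 > 2 AS comparison;               -- comparison expression, evaluates to a boolean
```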

See also

SQL: Value expressions

Built-ins: Subquery expressions

Data definition: Generation expressions

Scalar functions: Conditional functions and expressions

Aggregation: Aggregation expressions

F

Function

A token (e.g., replace) that takes zero or more arguments (e.g., three strings), performs a specific task, and may return one or more values (e.g., a modified string). Functions that return more than one value are called multi-valued functions.

Functions may be called in an SQL statement, like so:

  cr> SELECT replace('Hello world!', 'world', 'friend') as result;
  +---------------+
  | result        |
  +---------------+
  | Hello friend! |
  +---------------+
  SELECT 1 row in set (... sec)

See also

Scalar functions

Aggregate functions

Table functions

Window functions

User-defined functions

M

Metadata gateway

The component that persists cluster metadata to disk every time the metadata changes. The persisted metadata survives full cluster restarts and is recovered when the nodes are started again.

See also

Cluster configuration: Metadata gateway

Multi-valued function

A function that returns two or more values.
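For illustration, a table function such as unnest can be used in the FROM clause, where it returns one row per array element rather than a single value. This is a sketch of one common usage, not an exhaustive description of multi-valued functions.

```sql
-- unnest is a table function: it returns three values (one per row)
-- instead of a single value.
SELECT * FROM unnest([1, 2, 3]);
```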

See also

Table functions

Window functions

N

Nonscalar

A data type that can have more than one value (e.g., arrays and objects).

Contrary to a scalar.
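A short sketch of two nonscalar values written as literals. The aliases and the object keys are made up for this example.

```sql
SELECT
  [1, 2, 3] AS an_array,                          -- array: several values of one type
  {"name" = 'Arthur', "age" = 42} AS an_object;   -- object: several named values
```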

See also

Geographic types

Container types

O

Operand

See operator.

Operation

See operator.

Operator

A reserved keyword (e.g., IN) or sequence of symbols (e.g., >=) that can be used in an SQL statement to manipulate one or more expressions and return a result (e.g., true or false). This process is known as an operation and the expressions can be called operands or arguments.

An operator that takes one operand is known as a unary operator and an operator that takes two is known as a binary operator.
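The following sketch shows unary and binary operators, in both symbol and keyword form, within one statement. The aliases are illustrative only.

```sql
SELECT
  -(2 + 3) AS unary_result,       -- unary operator: one operand
  2 + 3 AS binary_result,         -- binary operator: two operands
  5 >= 3 AS comparison_result,    -- symbol operator that returns a boolean
  2 IN (1, 2, 3) AS in_result;    -- keyword operator that returns a boolean
```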

See also

Arithmetic operators

Comparison operators

Array comparisons

P

Partition column

A column used to partition a table. Specified by the PARTITIONED BY clause.

Also known as a PARTITIONED BY column or partitioned column.

A table may be partitioned by one or more columns:

  • If a table is partitioned by one column, a new partition is created for every unique value in that partition column.

  • If a table is partitioned by multiple columns, a new partition is created for every unique combination of row values in those partition columns.
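The two cases above can be sketched with the PARTITIONED BY clause. The table and column names are hypothetical and chosen only for this example.

```sql
-- Hypothetical table partitioned by one column:
-- one partition per unique reading_day value.
CREATE TABLE readings_by_day (
  sensor_id TEXT,
  reading DOUBLE PRECISION,
  reading_day TIMESTAMP WITH TIME ZONE
) PARTITIONED BY (reading_day);

-- Hypothetical table partitioned by two columns:
-- one partition per unique (region, reading_day) combination.
CREATE TABLE readings_by_region_and_day (
  region TEXT,
  reading_day TIMESTAMP WITH TIME ZONE,
  reading DOUBLE PRECISION
) PARTITIONED BY (region, reading_day);
```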

See also

Data definition: Partitioned tables

Generated columns: Partitioning

CREATE TABLE: PARTITIONED BY clause

ALTER TABLE: PARTITION clause

REFRESH: PARTITION clause

OPTIMIZE: PARTITION clause

COPY TO: PARTITION clause

COPY FROM: PARTITION clause

CREATE SNAPSHOT: PARTITION clause

RESTORE SNAPSHOT: PARTITION clause

PARTITIONED BY column

See partition column.

Partitioned column

See partition column.

R

Regular expression

An expression used to search for patterns in a string.
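As a brief sketch, a regular expression can be used with the ~ matching operator or passed to a scalar function such as regexp_replace. The patterns and aliases here are illustrative only.

```sql
SELECT
  'foobar' ~ '.*bar' AS matches,                                     -- does the string match the pattern?
  regexp_replace('Hello world!', 'w[a-z]+', 'friend') AS replaced;   -- replace the first match
```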

See also

Wikipedia: Regular expression

Data definition: Fulltext analyzers

Querying: Regular expressions

Scalar functions: Regular expressions

Table functions: regexp_matches

Routing column

Values in this column are used to compute a hash which is then used to route the corresponding row to a specific shard.

Also known as the CLUSTERED BY column.

All rows that have the same routing column row value are stored in the same shard.

Note

The routing of rows to a specific shard is not the same as the routing of shards to a specific node (also known as shard allocation).
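A minimal sketch of declaring a routing column with the CLUSTERED BY clause. The table name, column names, and shard count are hypothetical.

```sql
-- Hypothetical table: rows with the same sensor_id value
-- are stored in the same shard.
CREATE TABLE sensor_readings (
  sensor_id TEXT,
  reading DOUBLE PRECISION
) CLUSTERED BY (sensor_id) INTO 4 SHARDS;
```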

See also

Storage and consistency: Addressing documents

Sharding: Routing

CREATE TABLE: CLUSTERED clause

S

Scalar

A data type with a single value (e.g., numbers and strings).

Contrary to a nonscalar.

See also

Primitive types

Shard allocation

The process by which CrateDB allocates shards to specific nodes.

Note

Shard allocation is sometimes referred to as shard routing, which is not to be confused with row routing.
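One way to inspect the result of shard allocation is to query the sys.shards system table. The sketch below assumes the id, table_name, primary, and node columns are available in your CrateDB version.

```sql
-- List each shard together with the node it is currently allocated to.
-- "primary" is quoted because it is a reserved keyword.
SELECT table_name, id, "primary", node['name'] AS node_name
FROM sys.shards
ORDER BY table_name, id;
```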

See also

Shard allocation filtering

Cluster configuration: Routing allocation

Sharding: Number of shards

Altering tables: Changing the number of shards

Altering tables: Reroute shards

Shard recovery

The process by which CrateDB synchronizes a replica shard from a primary shard.

Shard recovery can happen during node startup, after node failure, when replicating a primary shard, when moving a shard to another node (i.e., when rebalancing the cluster), or during snapshot restoration.

A shard that is being recovered cannot be queried until the recovery process is complete.
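Recovery progress can typically be observed through the sys.shards system table. The sketch below assumes that table exposes a recovery object column with a stage field that reads DONE once recovery has finished. Check the system information reference for the exact schema in your version.

```sql
-- Show shards whose recovery has not finished yet
-- (assumes recovery['stage'] exists and reports 'DONE' on completion).
SELECT table_name, id, recovery['stage'] AS stage
FROM sys.shards
WHERE recovery['stage'] != 'DONE';
```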

See also

Cluster settings: Recovery

System information: Checked node settings

Shard routing

See shard allocation.

Statement

Any valid SQL that serves as a database instruction (e.g., CREATE TABLE, INSERT, and SELECT) instead of producing a value.

Contrary to an expression.

See also

Data definition

Data manipulation

Querying

SQL Statements

Subquery

A SELECT statement used as a relation in the FROM clause of a parent SELECT statement.

Also known as a subselect.
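A minimal, self-contained sketch of a subquery used as a relation. The alias names sub and total are illustrative only.

```sql
-- The inner SELECT acts as a relation named sub
-- in the FROM clause of the outer SELECT.
SELECT total
FROM (SELECT 1 + 2 AS total) AS sub;
```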

Subselect

See subquery.

U

Unary operator

See operator.

Uncorrelated subquery

A scalar subquery that does not reference any relations (e.g., tables) in the parent SELECT statement.
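As a sketch, the inner SELECT below returns a single value and refers only to its own table, never to the table of the outer SELECT. Both tables and all column names are hypothetical.

```sql
-- Hypothetical tables: sensor_readings and reference_readings.
-- The subquery does not use any column of sensor_readings,
-- so it is uncorrelated.
SELECT sensor_id, reading
FROM sensor_readings
WHERE reading > (SELECT avg(reading) FROM reference_readings);
```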

See also

Built-ins: Subquery expressions

V

Value expression

See expression.