Lexical structure

An SQL input consists of a sequence of commands each of which is a sequence of tokens, terminated by a semicolon (;).

The syntax of a command defines its set of valid tokens. A token can be a key word, an identifier, a quoted identifier, a literal (or constant), or a special character symbol.

Table of contents

String literal

String literals are defined as an arbitrary sequence of characters that are delimited with single quotes ' as defined in ANSI SQL, for example 'This is a string'.

Escape strings

The escape character in CrateDB is the single-quote '. A character gets escaped when adding a single-quote before it. For example a single quote character within a string literal can be included by writing two adjacent single quotes, e.g., 'Jack''s car'.

Note

Two adjacent single quotes are not equivalent to the double-quote character ".

String literals with C-Style escapes

In addition to the escaped character ', CrateDB supports C-Style escaped string sequences. Such a sequence is constructed by prefixing the string literal with the letter E or e, for example, e'hello\nWorld'. The following escaped sequences are supported:

Escape Sequence

Interpretation

\b

backspace

\f

form feed

\n

newline

\r

carriage return

\t

tab

\o, \oo, \ooo (o = [0-7])

octal byte value

\xh, xhh (h = [0-9,A-F,a-f])

hexadecimal byte value

\uxxxx, \Uxxxxxxxx (x = [0-9,A-F,a-f])

16 or 32-bit hexadecimal Unicode character value

For instance, the escape string literal e'\u0061\x61\141' is equivalent to the 'aaa' string literal.

  1. cr> select e'\u0061\x61\141' as col1;
  2. +------+
  3. | col1 |
  4. +------+
  5. | aaa |
  6. +------+
  7. SELECT 1 row in set (... sec)

Any other character following a backslash is taken literally. Thus, to include a backslash character \, two adjacent backslashes need to be used (i.e. \\).

  1. cr> select e'aa\\nbb' as col1;
  2. +--------+
  3. | col1 |
  4. +--------+
  5. | aa\nbb |
  6. +--------+
  7. SELECT 1 row in set (... sec)

Finally, a single quote can be included in an escape string literal by also using the escape backslash character: \', in addition to the single-quote described in the string literals section.

  1. cr> select e'aa\'bb' as col1;
  2. +-------+
  3. | col1 |
  4. +-------+
  5. | aa'bb |
  6. +-------+
  7. SELECT 1 row in set (... sec)

Key words and identifiers

The table below lists all reserved key words in CrateDB. These need to be quoted if used as identifiers:

  1. cr> SELECT word FROM pg_catalog.pg_get_keywords() WHERE catcode = 'R' ORDER BY 1;
  2. +-------------------+
  3. | word |
  4. +-------------------+
  5. | add |
  6. | all |
  7. | alter |
  8. | and |
  9. | any |
  10. | array |
  11. | as |
  12. | asc |
  13. | between |
  14. | by |
  15. | called |
  16. | case |
  17. | cast |
  18. | column |
  19. | constraint |
  20. | create |
  21. | cross |
  22. | current_date |
  23. | current_schema |
  24. | current_time |
  25. | current_timestamp |
  26. | current_user |
  27. | default |
  28. | delete |
  29. | deny |
  30. | desc |
  31. | describe |
  32. | directory |
  33. | distinct |
  34. | drop |
  35. | else |
  36. | end |
  37. | escape |
  38. | except |
  39. | exists |
  40. | extract |
  41. | false |
  42. | first |
  43. | for |
  44. | from |
  45. | full |
  46. | function |
  47. | grant |
  48. | group |
  49. | having |
  50. | if |
  51. | in |
  52. | index |
  53. | inner |
  54. | input |
  55. | insert |
  56. | intersect |
  57. | into |
  58. | is |
  59. | join |
  60. | last |
  61. | left |
  62. | like |
  63. | limit |
  64. | match |
  65. | natural |
  66. | not |
  67. | null |
  68. | nulls |
  69. | object |
  70. | offset |
  71. | on |
  72. | or |
  73. | order |
  74. | outer |
  75. | persistent |
  76. | recursive |
  77. | reset |
  78. | returns |
  79. | revoke |
  80. | right |
  81. | select |
  82. | session_user |
  83. | set |
  84. | some |
  85. | stratify |
  86. | substring |
  87. | table |
  88. | then |
  89. | transient |
  90. | true |
  91. | try_cast |
  92. | unbounded |
  93. | union |
  94. | update |
  95. | user |
  96. | using |
  97. | when |
  98. | where |
  99. | with |
  100. +-------------------+
  101. SELECT 95 rows in set (... sec)

Tokens such as my_table, id, name, or data in the example below are identifiers, which identify names of tables, columns, and other database objects.

Example:

  1. CREATE TABLE my_table (
  2. id INTEGER,
  3. name STRING,
  4. data OBJECT
  5. ) WITH (number_of_replicas = 0);

Note

Key words and unquoted identifiers are case insensitive.

This means that:

  1. select foo from t;

is equivalent to:

  1. select Foo from t;

or:

  1. select FOO from t;

A widely used convention is to write key words in uppercase and identifiers in lowercase, such as

  1. ALTER TABLE foo ADD COLUMN new_column INTEGER;
  1. INSERT INTO foo (id, name) VALUES (1, 'bar');

Quoted identifiers can contain an arbitrary sequence of characters enclosed by double quotes ("). Quoted identifiers are never keywords, so you can use "update" as a table or column name.

Special characters

Some non-alphanumeric characters do have a special meaning. For their usage please refer to the sections where the respective syntax elements are described.

Semicolon

The semicolon (;) terminates an SQL statement. It cannot appear anywhere else within the command, except within a string or quoted identifier.

Comma

The comma (,) is used in various syntactical elements to separate elements of a list.

Brackets

Square brackets ([]) are used to select elements of arrays and objects, e.g. arr[1] or obj['key'].

Asterisk

The asterisk (*) is used in some contexts to denote all columns of a table. As an argument in global aggregate functions it has the meaning of any field, e.g. COUNT(*).

Period

The period (.) is used for numeric values and to separate schema and table names, e.g. blob.my_blob_table.

Comments

An SQL statement can contain comments. Single line comments start with a double dash (--) and end at the end of that line. Multi line comments start with /* and end with */.

Example:

  1. /*
  2. * Retrieve information about all tables in the 'doc' schema.
  3. */
  4. SELECT *
  5. FROM information_schema.tables
  6. WHERE table_schema = 'doc'; -- query information schema for doc tables