JSON Format

Format: Serialization Schema
Format: Deserialization Schema

The JSON format allows reading and writing JSON data based on a JSON schema. Currently, the JSON schema is derived from the table schema.

Dependencies

To use the JSON format, the following dependencies are required, both for projects that use a build automation tool (such as Maven or SBT) and for the SQL Client with SQL JAR bundles.

Maven dependency    SQL Client JAR
----------------    --------------
flink-json          Built-in

How to create a table with JSON format

The following example creates a table using the Kafka connector and the JSON format.

  CREATE TABLE user_behavior (
    user_id BIGINT,
    item_id BIGINT,
    category_id BIGINT,
    behavior STRING,
    ts TIMESTAMP(3)
  ) WITH (
    'connector' = 'kafka',
    'topic' = 'user_behavior',
    'properties.bootstrap.servers' = 'localhost:9092',
    'properties.group.id' = 'testGroup',
    'format' = 'json',
    -- do not fail when a record lacks one of the columns
    'json.fail-on-missing-field' = 'false',
    -- skip records that cannot be parsed instead of failing
    'json.ignore-parse-errors' = 'true'
  )
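To make the two JSON options in this statement more concrete, the following Python sketch mimics how a single Kafka message could be turned into a row for the user_behavior table. It is an illustration under stated assumptions, not Flink code: the function name decode_row and its parameters are hypothetical, the column list is copied from the statement above, and the ts value is left as a string rather than converted to a timestamp.

```python
import json

# Column names copied from the CREATE TABLE statement above.
COLUMNS = ["user_id", "item_id", "category_id", "behavior", "ts"]


def decode_row(message, fail_on_missing_field=False, ignore_parse_errors=True):
    """Hypothetical stand-in for the format's deserializer (not Flink code)."""
    try:
        record = json.loads(message)
        if not isinstance(record, dict):
            raise ValueError("expected a JSON object")
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        if ignore_parse_errors:
            return None  # the malformed record is skipped
        raise

    missing = [name for name in COLUMNS if name not in record]
    if missing and fail_on_missing_field:
        raise KeyError(f"missing fields: {missing}")

    # Columns absent from the record become None (SQL NULL).
    return tuple(record.get(name) for name in COLUMNS)


complete = ('{"user_id": 1, "item_id": 2, "category_id": 3, '
            '"behavior": "pv", "ts": "2020-12-30 12:13:14.123"}')
print(decode_row(complete))
print(decode_row('{"user_id": 1}'))   # missing columns are filled with None
print(decode_row("not json"))         # malformed input is skipped
```

With the settings used in the statement ('json.fail-on-missing-field' = 'false' and 'json.ignore-parse-errors' = 'true'), the sketch never raises: incomplete records yield rows with empty columns and unparseable records are dropped.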

Format Options

Each entry below lists the option name, followed by whether it is required, its default value, its type, and its description.

format
  required, default: (none), type: String
  Specifies which format to use; here it should be ‘json’.

json.fail-on-missing-field
  optional, default: false, type: Boolean
  Whether to fail if a field is missing.

json.ignore-parse-errors
  optional, default: false, type: Boolean
  Skip fields and rows with parse errors instead of failing. Fields are set to null in case of errors.

json.timestamp-format.standard
  optional, default: ‘SQL’, type: String
  Specifies the input and output timestamp format for the TIMESTAMP and TIMESTAMP WITH LOCAL TIME ZONE types. Currently supported values are ‘SQL’ and ‘ISO-8601’:
  • Option ‘SQL’ parses input TIMESTAMP values in “yyyy-MM-dd HH:mm:ss.s{precision}” format, e.g. “2020-12-30 12:13:14.123”, parses input TIMESTAMP WITH LOCAL TIME ZONE values in “yyyy-MM-dd HH:mm:ss.s{precision}'Z'” format, e.g. “2020-12-30 12:13:14.123Z”, and outputs timestamps in the same format.
  • Option ‘ISO-8601’ parses input TIMESTAMP values in “yyyy-MM-ddTHH:mm:ss.s{precision}” format, e.g. “2020-12-30T12:13:14.123”, parses input TIMESTAMP WITH LOCAL TIME ZONE values in “yyyy-MM-ddTHH:mm:ss.s{precision}'Z'” format, e.g. “2020-12-30T12:13:14.123Z”, and outputs timestamps in the same format.

json.map-null-key.mode
  optional, default: ‘FAIL’, type: String
  Specifies the handling mode when serializing null keys for map data. Currently supported values are ‘FAIL’, ‘DROP’ and ‘LITERAL’:
  • Option ‘FAIL’ throws an exception when encountering a map with a null key.
  • Option ‘DROP’ drops null key entries from map data.
  • Option ‘LITERAL’ replaces the null key with a string literal. The string literal is defined by the json.map-null-key.literal option.

json.map-null-key.literal
  optional, default: ‘null’, type: String
  Specifies the string literal that replaces a null key when ‘json.map-null-key.mode’ is LITERAL.
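The three json.map-null-key.mode values can be illustrated with a short Python sketch. It only models the behavior described above and is not how Flink implements it; the function name serialize_map and its parameters are hypothetical, and Python's None stands in for a null map key.

```python
import json


def serialize_map(data, mode="FAIL", literal="null"):
    """Hypothetical model of json.map-null-key.mode (not Flink code)."""
    if mode not in ("FAIL", "DROP", "LITERAL"):
        raise ValueError(f"unsupported mode: {mode}")

    result = {}
    for key, value in data.items():
        if key is None:
            if mode == "FAIL":
                raise ValueError("map contains a null key")
            if mode == "DROP":
                continue          # silently drop the entry
            key = literal         # LITERAL: substitute the configured string
        result[key] = value
    return json.dumps(result)


sample = {"a": 1, None: 2}
print(serialize_map(sample, mode="DROP"))
print(serialize_map(sample, mode="LITERAL"))
print(serialize_map(sample, mode="LITERAL", literal="n/a"))
```

The null key needs special handling because the key of a JSON object member is always a string, so a null key has no direct JSON representation.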

Data Type Mapping

Currently, the JSON schema is always derived from the table schema. Explicitly defining a JSON schema is not supported yet.

The Flink JSON format uses the Jackson databind API to parse and generate JSON strings.

The following table lists the mapping from Flink types to JSON types.

Flink SQL type                    JSON type
------------------------------    ---------------------------------------------------
CHAR / VARCHAR / STRING           string
BOOLEAN                           boolean
BINARY / VARBINARY                string with encoding: base64
DECIMAL                           number
TINYINT                           number
SMALLINT                          number
INT                               number
BIGINT                            number
FLOAT                             number
DOUBLE                            number
DATE                              string with format: date
TIME                              string with format: time
TIMESTAMP                         string with format: date-time
TIMESTAMP_WITH_LOCAL_TIME_ZONE    string with format: date-time (with UTC time zone)
INTERVAL                          number
ARRAY                             array
MAP / MULTISET                    object
ROW                               object
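As a rough illustration of a few rows of this mapping, the Python sketch below converts Python values that stand in for Flink values into their JSON counterparts. It is not Flink's serializer: the helper names format_timestamp and to_json_value are hypothetical, a dict stands in for both MAP and ROW, timestamps are assumed to have millisecond precision as in TIMESTAMP(3), and the exact DATE and TIME string layouts shown are an assumption.

```python
import base64
import json
from datetime import date, datetime, time


def format_timestamp(value, standard="SQL", local_zone=False):
    """Format a datetime per json.timestamp-format.standard (millisecond precision)."""
    separator = {"SQL": " ", "ISO-8601": "T"}[standard]
    text = value.strftime(f"%Y-%m-%d{separator}%H:%M:%S.%f")[:-3]
    # For TIMESTAMP WITH LOCAL TIME ZONE the value is assumed to be in UTC already.
    return text + "Z" if local_zone else text


def to_json_value(value, standard="SQL"):
    """Hypothetical mapping of a Python stand-in value to its JSON type."""
    if isinstance(value, (bytes, bytearray)):   # BINARY / VARBINARY
        return base64.b64encode(value).decode("ascii")
    if isinstance(value, datetime):             # TIMESTAMP (checked before date)
        return format_timestamp(value, standard)
    if isinstance(value, (date, time)):         # DATE, TIME
        return value.isoformat()
    if isinstance(value, (list, tuple)):        # ARRAY
        return [to_json_value(item, standard) for item in value]
    if isinstance(value, dict):                 # MAP / ROW
        return {key: to_json_value(item, standard) for key, item in value.items()}
    return value                                # strings, booleans, numbers


row = {
    "payload": b"hi",
    "day": date(2020, 12, 30),
    "at": time(12, 13, 14),
    "ts": datetime(2020, 12, 30, 12, 13, 14, 123000),
    "tags": ["a", "b"],
}
print(json.dumps(to_json_value(row)))
print(format_timestamp(row["ts"], standard="ISO-8601", local_zone=True))
```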