New Record State Extraction

Configuration

The SMT is configured as part of a source or sink connector, using a set of properties:

  transforms=unwrap,...
  transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
  transforms.unwrap.drop.tombstones=false
  transforms.unwrap.delete.handling.mode=rewrite
  transforms.unwrap.add.source.fields=table,lsn

Record filtering for delete records

The SMT provides special handling for events that signal a delete operation. When a DELETE is executed on a data source, Debezium generates two events:

  • a record with the d operation that contains only the old row data

  • (optionally) a record with null value and the same key (a “tombstone” message). This record serves as a marker for Apache Kafka that all messages with this key can be removed from the topic during log compaction.

Upon processing these two records, the SMT can pass on the d record as is, convert it into another tombstone record or drop it. The original tombstone message can be passed on as is or also be dropped.

By default, the SMT filters out both of these records, because widely used sink connectors do not yet support tombstone messages.
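
As an illustration, and assuming the transform is registered under the alias unwrap as in the Configuration section, the following sketch overrides both defaults: delete events are rewritten instead of dropped, and tombstones are passed on to the sink. The option names and values are the ones listed under Configuration options.

  transforms=unwrap,...
  transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
  transforms.unwrap.delete.handling.mode=rewrite
  transforms.unwrap.drop.tombstones=false

Passing tombstones on is only useful if the sink connector in use can handle them.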

Adding metadata fields to the message

The SMT can optionally add metadata fields from the original change event to the final flattened record. This functionality can be used to add things like the operation or the table from the change event, or connector-specific fields like the Postgres LSN field. For more information on what’s available see the documentation for each connector.

In case of duplicate field names (e.g. “ts_ms” exists twice), the struct should be specified to get the correct field (e.g. “source.ts_ms”). The fields will be prefixed with “__” or “__<struct>_”, depending on whether the struct is specified. Please use a comma-separated list without spaces.

For example, the configuration

  transforms=unwrap,...
  transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
  transforms.unwrap.add.fields=op,table,lsn,source.ts_ms

will add

  { "__op" : "c", "__table": "MY_TABLE", "__lsn": "123456789", "__source_ts_ms" : "123456789", ...}

to the final flattened record.

For DELETE events, this option is only supported when the delete.handling.mode option is set to “rewrite”.

Adding metadata fields to the header

The SMT can optionally add metadata fields from the original change event to the header of the final flattened record. This functionality can be used to add things like the operation or the table from the change event, or connector-specific fields like the Postgres LSN field. For more information on what’s available see the documentation for each connector.

In case of duplicate field names (e.g. “ts_ms” exists twice), the struct should be specified to get the correct field (e.g. “source.ts_ms”). The fields will be prefixed with “__” or “__<struct>_”, depending on whether the struct is specified. Please use a comma-separated list without spaces.

For example, the configuration

  transforms=unwrap,...
  transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
  transforms.unwrap.add.headers=op,table,lsn,source.ts_ms

will add headers __op, __table, __lsn and __source_ts_ms to the outgoing record.

Determine original operation [DEPRECATED]

The operation.header option is deprecated and scheduled for removal. Please use add.headers instead. If both add.headers and operation.header are specified, the latter will be ignored.

When a message is flattened the final result won’t show whether it was an insert, update or first read (deletions can be detected via tombstones or rewrites, see Configuration options).

To solve this problem Debezium offers an option to propagate the original operation via a header added to the message. To enable this feature the option operation.header must be set to true.

  transforms=unwrap,...
  transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
  transforms.unwrap.operation.header=true

The possible values are the ones from the op field of the original change event.
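
As a sketch of the migration suggested above, the deprecated setting can be replaced with add.headers and the op field. This assumes the alias unwrap used throughout this page. The resulting header name follows the add.headers naming rule and may differ from the one produced by operation.header, so consumers that read the header may need to be adjusted.

  transforms=unwrap,...
  transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
  transforms.unwrap.add.headers=op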

Adding source metadata fields [DEPRECATED]

The add.source.fields option is deprecated and scheduled for removal. Please use add.fields instead. If both add.fields and add.source.fields are specified, the latter will be ignored.

The SMT can optionally add metadata fields from the original change event’s source structure to the final flattened record (prefixed with “__”). This functionality can be used to add things like the table from the change event, or connector-specific fields like the Postgres LSN field. For more information on what’s available in the source structure, see the documentation for each connector.

For example, the configuration

  transforms=unwrap,...
  transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
  transforms.unwrap.add.source.fields=table,lsn

will add

  { "__table": "MY_TABLE", "__lsn": "123456789", ...}

to the final flattened record.

For DELETE events, this option is only supported when the delete.handling.mode option is set to “rewrite”.
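
A possible migration to the replacement option is sketched below. It assumes that table and lsn resolve to the same source fields as in the add.fields example earlier on this page, and that the transform alias is unwrap.

  transforms=unwrap,...
  transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
  transforms.unwrap.add.fields=table,lsn

Following the naming rule for add.fields, this should again yield the fields __table and __lsn.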

Configuration options

Property: drop.tombstones
Default: true
Description: The SMT removes the tombstone generated by Debezium from the stream.

Property: delete.handling.mode
Default: drop
Description: The SMT can drop (the default), rewrite or pass delete events (none). The rewrite mode will add a __deleted column with true/false values based on the record operation.

Property: add.fields
Default: (not set)
Description: Specify a list of metadata fields to add to the flattened message. In case of duplicate field names (e.g. “ts_ms” exists twice), the struct should be specified to get the correct field (e.g. “source.ts_ms”). The fields will be prefixed with “__” or “__<struct>_”, depending on whether the struct is specified. Please use a comma-separated list without spaces.

Property: add.headers
Default: (not set)
Description: Specify a list of metadata fields to add to the header of the flattened message. In case of duplicate field names (e.g. “ts_ms” exists twice), the struct should be specified to get the correct field (e.g. “source.ts_ms”). The fields will be prefixed with “__” or “__<struct>_”, depending on whether the struct is specified. Please use a comma-separated list without spaces.

Property: operation.header [DEPRECATED]
Default: false
Description: This option is deprecated and scheduled for removal. Please use add.headers instead. If both add.headers and operation.header are specified, the latter will be ignored. The SMT adds the event operation (as obtained from the op field of the original record) as a message header.

Property: add.source.fields [DEPRECATED]
Default: (not set)
Description: This option is deprecated and scheduled for removal. Please use add.fields instead. If both add.fields and add.source.fields are specified, the latter will be ignored. Fields from the change event’s source structure to add as metadata (prefixed with “__”) to the flattened record.