Pulsar connector overview

当一起使用消息系统和外部系统(例如,数据库或其它消息系统)时,消息系统的功能十分强大。

Pulsar IO connectors enable you to easily create, deploy, and manage connectors that interact with external systems, such as Apache Cassandra, Aerospike, and many others.

概念

Pulsar IO connectors come in two types: source and sink.

下图说明了 source、Pulasr 和 sink 之间的关系:

Pulsar IO diagram

Source

Sources feed data from external systems into Pulsar.

通用的 source 包括其它消息系统和流水式数据管道 API。

想要了解 Pulsar 内置 source 连接器的完整列表,参阅 source connector

Sink

Sinks feed data from Pulsar into external systems.

通用的 Sinks 包含常见的消息系统,以及关系型数据库和NoSQL数据库。

了解 Pulsar 内置 sink 连接器的完整列表,参阅 sink connector

Processing guarantee

Processing guarantees 用于处理向 Pulsar 主题写入消息时发生的错误。

Pulsar connectors and Functions use the same processing guarantees as below.

传递语义说明
at-most-onceEach message sent to a connector is to be processed once or not to be processed.
at-least-onceEach message sent to a connector is to be processed once or more than once.
effectively-onceEach message sent to a connector has one output associated with it.

Processing guarantees for connectors not just rely on Pulsar guarantee but also relate to external systems, that is, the implementation of source and sink.

  • Source: Pulsar ensures that writing messages to Pulsar topics respects to the processing guarantees. It is within Pulsar’s control.

  • Sink: the processing guarantees rely on the sink implementation. If the sink implementation does not handle retries in an idempotent way, the sink does not respect to the processing guarantees.

Set

创建连接器时,可以使用以下语义设置 processing guarantee:

  • ATLEAST_ONCE

  • ATMOST_ONCE

  • EFFECTIVELY_ONCE

如果在创建连接器时,没有指定 --processing-guarantees,则默认语义为 ATLEAST_ONCE

Here takes Admin CLI as an example. For more information about REST API or JAVA Admin API, see here.

Source

Sink

  1. $ bin/pulsar-admin sources create \ --processing-guarantees ATMOST_ONCE \ # Other source configs

了解更多关于 pulsar-admin 源代码创建的信息,参阅这里

  1. $ bin/pulsar-admin sinks create \ --processing-guarantees EFFECTIVELY_ONCE \ # Other sink configs

了解更多关于 pulsar-admin sinks 创建的信息,参阅这里

更新

创建连接器后,可以使用以下语义更新 processing guarantee:

  • ATLEAST_ONCE

  • ATMOST_ONCE

  • EFFECTIVELY_ONCE

Here takes Admin CLI as an example. For more information about REST API or JAVA Admin API, see here.

Source

Sink

  1. $ bin/pulsar-admin sources update \ --processing-guarantees EFFECTIVELY_ONCE \ # Other source configs

了解更多关于 pulsar-admin 源代码更新的信息,参阅这里

  1. $ bin/pulsar-admin sinks update \ --processing-guarantees ATMOST_ONCE \ # Other sink configs

了解更多关于 pulsar-admin sinks 更新的信息,参阅这里

使用连接器

You can manage Pulsar connectors (for example, create, update, start, stop, restart, reload, delete and perform other operations on connectors) via the Connector Admin CLI with sources and sinks subcommands.

连接器(sources 和 sinks)和 Functions 是实例的组成部分,都在 Functions workers 上运行。 通过 Connector Admin CLIFunctions Admin CLI 管理 source 时,实例在 worker 上启动。 了解更多信息,参阅 Functions worker