Pulsar connector overview

当你可以轻松地将消息系统与数据库和其他消息系统等外部系统结合使用时,消息系统才能展示出最强大的力量。

Pulsar IO connectors enable you to easily create, deploy, and manage connectors that interact with external systems, such as Apache Cassandra, Aerospike, and many others.

概念

Pulsar IO connectors come in two types: source and sink.

下图说明了 source、Pulasr 和 sink 之间的关系:

Pulsar IO diagram

Source

Sources feed data from external systems into Pulsar.

通用的 source 包括其它消息系统和流式数据管道 API。

想要了解 Pulsar 内置 source 连接器的完整列表,参阅 source connector

Sink

Sinks feed data from Pulsar into external systems.

通用的 Sinks 包含常见的消息系统,以及关系型数据库和NoSQL数据库。

了解 Pulsar 内置 sink 连接器的完整列表,参阅 sink connector

处理保证(Processing guarantee)

Processing guarantees 用于处理向 Pulsar 主题写入消息时发生的错误。

Pulsar connectors and Functions use the same processing guarantees as below.

传递语义说明
at-most-onceEach message sent to a connector is to be processed once or not to be processed.
at-least-onceEach message sent to a connector is to be processed once or more than once.
effectively-onceEach message sent to a connector has one output associated with it.

Processing guarantees for connectors not just rely on Pulsar guarantee but also relate to external systems, that is, the implementation of source and sink.

  • Source: Pulsar ensures that writing messages to Pulsar topics respects to the processing guarantees. It is within Pulsar’s control.

  • Sink: the processing guarantees rely on the sink implementation. If the sink implementation does not handle retries in an idempotent way, the sink does not respect to the processing guarantees.

Set

创建连接器时,可以使用以下语义设置 processing guarantee:

  • ATLEAST_ONCE

  • ATMOST_ONCE

  • EFFECTIVELY_ONCE

如果在创建连接器时,没有指定 --processing-guarantees,则默认语义为 ATLEAST_ONCE

Here takes Admin CLI as an example. For more information about REST API or JAVA Admin API, see here.

Source

Sink

  1. $ bin/pulsar-admin sources create \ --processing-guarantees ATMOST_ONCE \ # Other source configs

了解更多关于 pulsar-admin sources create的信息,参阅这里

  1. $ bin/pulsar-admin sinks create \ --processing-guarantees EFFECTIVELY_ONCE \ # Other sink configs

了解更多关于 pulsar-admin sinks create的信息,参阅这里

更新

创建连接器后,可以使用以下语义更新 processing guarantee:

  • ATLEAST_ONCE

  • ATMOST_ONCE

  • EFFECTIVELY_ONCE

Here takes Admin CLI as an example. For more information about REST API or JAVA Admin API, see here.

Source

Sink

  1. $ bin/pulsar-admin sources update \ --processing-guarantees EFFECTIVELY_ONCE \ # Other source configs

了解更多关于 pulsar-admin sources update 的信息,参阅这里

  1. $ bin/pulsar-admin sinks update \ --processing-guarantees ATMOST_ONCE \ # Other sink configs

了解更多关于 pulsar-admin sinks update的信息,参阅这里

使用连接器

可以通过 Connector Admin CLI并结合 sourcessinks 子命令来管理 Pulsar 连接器(例如,创建、更新、启动、停止、重启、重载、删除以及其他操作)。

连接器(sources 和 sinks)和 Functions 是实例的组成部分,都在 Functions workers 上运行。 通过 Connector Admin CLIFunctions Admin CLI 管理 source、sink 或者 function 时,在 worker 上就启动了一个实例。 了解更多信息,参阅 Functions worker