Pulsar Terminology

Here is a glossary of terms related to Apache Pulsar:

Concepts

Pulsar

Pulsar is a distributed messaging system originally created by Yahoo and now under the stewardship of the Apache Software Foundation.

Message

Messages are the basic unit of Pulsar. They’re what producers publish to topics and what consumers then consume from topics.

Topic

A topic is a named channel that carries messages published by producers to the consumers that process them.

Partitioned Topic

A partitioned topic is served by multiple Pulsar brokers, which allows for higher throughput.

Namespace

A namespace is a grouping mechanism for related topics.

Namespace Bundle

A namespace bundle is a virtual group of topics that belong to the same namespace. A bundle is defined as a range of 32-bit hash values; the full hash space runs from 0x00000000 to 0xffffffff.
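The idea of a bundle as a slice of the 32-bit hash space can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the helper names `bundle_boundaries` and `bundle_for_topic` are hypothetical, the even split is just one possible layout, and `zlib.crc32` is only a stand-in 32-bit hash, not necessarily the hash Pulsar uses.

```python
import zlib

# One past 0xffffffff, so that ranges can be half-open: [low, high).
HASH_SPACE_END = 0x1_0000_0000


def bundle_boundaries(num_bundles):
    """Split the 32-bit hash space into num_bundles contiguous ranges."""
    step = HASH_SPACE_END // num_bundles
    bounds = [i * step for i in range(num_bundles)]
    bounds.append(HASH_SPACE_END)
    return bounds


def bundle_for_topic(topic, bounds):
    """Return the (low, high) half-open hash range that contains the topic."""
    # Stand-in hash for this sketch only (assumption, see lead-in).
    h = zlib.crc32(topic.encode("utf-8")) & 0xFFFFFFFF
    for low, high in zip(bounds, bounds[1:]):
        if low <= h < high:
            return (low, high)
    raise AssertionError("unreachable: the ranges cover the whole hash space")


if __name__ == "__main__":
    bounds = bundle_boundaries(4)
    low, high = bundle_for_topic("persistent://my-tenant/my-ns/my-topic", bounds)
    print(f"bundle range: 0x{low:08x} to 0x{high - 1:08x}")
```

Because every topic name hashes into exactly one range, all topics in a namespace fall into exactly one bundle, which is what makes a bundle usable as a unit of grouping.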

Tenant

A tenant is an administrative unit for allocating capacity and enforcing an authentication/authorization scheme.

Subscription

A lease on a topic established by a group of consumers. Pulsar has three subscription modes (exclusive, shared, and failover).
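To make the difference between the three modes concrete, here is a toy Python sketch of which consumer would receive each message. It is not Pulsar client code: the `dispatch` function is hypothetical, the first consumer in the list stands in for the active consumer in failover mode, and shared mode is modeled as simple round-robin.

```python
from itertools import cycle


def dispatch(mode, consumers, messages):
    """Return (consumer, message) pairs for a toy subscription."""
    if mode == "exclusive":
        # Only a single consumer may be attached to the subscription.
        if len(consumers) != 1:
            raise ValueError("exclusive mode allows a single consumer")
        return [(consumers[0], m) for m in messages]
    if mode == "failover":
        # One active consumer gets everything; the rest are standbys.
        active = consumers[0]
        return [(active, m) for m in messages]
    if mode == "shared":
        # Messages are spread across all attached consumers.
        turn = cycle(consumers)
        return [(next(turn), m) for m in messages]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for mode in ("failover", "shared"):
        print(mode, dispatch(mode, ["c1", "c2"], ["m1", "m2", "m3"]))
```

The sketch ignores what happens when the active consumer disconnects; in failover mode a standby would then take over.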

Pub-Sub

A messaging pattern in which producer processes publish messages to topics, and consumer processes then consume and process those messages.

Producer

A process that publishes messages to a Pulsar topic.

Consumer

A process that subscribes to a Pulsar topic and processes the messages that producers publish to that topic.

Reader

Pulsar readers are message processors that are very similar to Pulsar consumers, but with two important differences:

  • you can specify where on a topic readers begin processing messages (consumers always begin with the latest available unacked message);
  • readers do not retain data or acknowledge messages.

Cursor

The subscription position for a consumer.

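Acknowledgment (ack)

A message sent to a Pulsar broker by a consumer to indicate that a message has been successfully processed. An acknowledgment (ack) is how Pulsar knows that the message can be deleted from the system. If no acknowledgment is received, the message is retained until it is processed.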

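Negative Acknowledgment (nack)

When an application fails to process a particular message, it can send a negative acknowledgment to Pulsar to signal that the message should become available for consumption again after a delay. (By default, failed messages become available for redelivery after one minute.)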
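The interplay of ack and nack can be modeled with a small in-memory Python sketch. This is an illustration under stated assumptions, not the broker's implementation: `ToySubscription` and its methods are hypothetical names, time is passed in explicitly as seconds, and the 60-second default mirrors the one-minute delay mentioned above.

```python
class ToySubscription:
    """Tracks which messages still need processing for one subscription."""

    def __init__(self, messages, redelivery_delay_s=60.0):
        self.redelivery_delay_s = redelivery_delay_s
        # Maps each unacknowledged message to the earliest delivery time.
        self.pending = {m: 0.0 for m in messages}

    def deliverable(self, now):
        """Messages that may be handed to a consumer at time `now`."""
        return [m for m, t in self.pending.items() if t <= now]

    def ack(self, message):
        # An acknowledged message no longer needs to be kept.
        self.pending.pop(message, None)

    def nack(self, message, now):
        # A nacked message becomes eligible again only after the delay.
        if message in self.pending:
            self.pending[message] = now + self.redelivery_delay_s


if __name__ == "__main__":
    sub = ToySubscription(["m1", "m2"])
    sub.ack("m1")
    sub.nack("m2", now=10.0)
    print(sub.deliverable(now=10.0))
    print(sub.deliverable(now=70.0))
```

In this model a message that is neither acked nor nacked simply stays pending, which matches the rule that unacknowledged messages are retained.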

Unacknowledged

A message that has been delivered to a consumer for processing but not yet confirmed as processed by the consumer.

Retention Policy

Size and/or time limits that you can set on a namespace to configure retention of messages that have already been acknowledged.
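A retention policy with a time limit and a size limit can be sketched as a filter over already-acknowledged messages. The sketch is hypothetical throughout: `ToyRetentionPolicy` and `keep_acknowledged` are invented names, `None` is used here to mean "no limit", and the rule that a message is kept only while it satisfies every configured limit is an assumption of this example rather than a statement about how the broker combines the two limits.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ToyRetentionPolicy:
    max_age_s: Optional[float] = None  # None means no time limit (assumption)
    max_bytes: Optional[int] = None    # None means no size limit (assumption)


def keep_acknowledged(
    messages: List[Tuple[float, int]],
    policy: ToyRetentionPolicy,
    now: float,
) -> List[Tuple[float, int]]:
    """Filter (publish_time_s, size_bytes) tuples, given oldest first."""
    kept = []
    total = 0
    # Walk newest to oldest so the newest messages use the size budget first.
    for publish_time, size in reversed(messages):
        if policy.max_age_s is not None and now - publish_time > policy.max_age_s:
            break  # this message and all older ones are past the time limit
        if policy.max_bytes is not None and total + size > policy.max_bytes:
            break  # keeping this message would exceed the size limit
        kept.append((publish_time, size))
        total += size
    kept.reverse()
    return kept


if __name__ == "__main__":
    acked = [(0.0, 100), (50.0, 100), (90.0, 100)]
    policy = ToyRetentionPolicy(max_age_s=60.0, max_bytes=150)
    print(keep_acknowledged(acked, policy, now=100.0))
```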

Multi-Tenancy

The ability to isolate namespaces, specify quotas, and configure authentication and authorization on a per-tenant basis.

Architecture

Standalone

A lightweight Pulsar broker in which all components run in a single Java Virtual Machine (JVM) process. Standalone clusters can be run on a single machine and are useful for development purposes.

Cluster

A set of Pulsar brokers and BookKeeper servers (aka bookies). Clusters can reside in different geographical regions and replicate messages to one another in a process called geo-replication.

Instance

A group of Pulsar clusters that act together as a single unit.

Geo-Replication

Replication of messages across Pulsar clusters, potentially in different datacenters or geographical regions.

Configuration Store

Pulsar’s configuration store (previously known as the global ZooKeeper) is a ZooKeeper quorum that is used for configuration-specific tasks. A multi-cluster Pulsar installation requires just one configuration store across all clusters.

Topic Lookup

A service provided by Pulsar brokers that enables connecting clients to automatically determine which Pulsar cluster is responsible for a topic (and thus where message traffic for the topic needs to be routed).

Service Discovery

A mechanism provided by Pulsar that enables connecting clients to use just a single URL to interact with all the brokers in a cluster.

Broker

A stateless component of Pulsar clusters that runs two other components: an HTTP server exposing a REST interface for administration and topic lookup and a dispatcher that handles all message transfers. Pulsar clusters typically consist of multiple brokers.

Dispatcher

An asynchronous TCP server used for all data transfers into and out of a Pulsar broker. The Pulsar dispatcher uses a custom binary protocol for all communications.

Storage

BookKeeper

Apache BookKeeper is a scalable, low-latency persistent log storage service that Pulsar uses to store data.

Bookie

Bookie is the name of an individual BookKeeper server. It is effectively the storage server of Pulsar.

Ledger

An append-only data structure in BookKeeper that is used to persistently store messages in Pulsar topics.

Functions

Pulsar Functions are lightweight functions that can consume messages from Pulsar topics, apply custom processing logic, and, if desired, publish results to topics.
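The consume, process, publish cycle described above can be sketched in plain Python. This example deliberately avoids the Pulsar runtime: `process` is an ordinary function holding the custom logic, and `run_locally` is a hypothetical helper that uses in-memory lists in place of real input and output topics.

```python
def process(message):
    """Custom processing logic: here, append an exclamation mark."""
    return message + "!"


def run_locally(fn, input_messages):
    """Feed each input message to fn and collect the published results."""
    output_topic = []
    for message in input_messages:
        result = fn(message)
        # Publishing is optional: a function may return nothing.
        if result is not None:
            output_topic.append(result)
    return output_topic


if __name__ == "__main__":
    print(run_locally(process, ["hello", "world"]))
```

Keeping the processing logic in a plain function like this makes it easy to test without a running cluster; connecting it to actual topics is outside the scope of the sketch.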