The Pulsar Python client

The Pulsar Python client library is a wrapper over the existing C++ client library and exposes all of the same features. You can find the code in the python subdirectory of the C++ client code.

Installation

You can install the pulsar-client library either via PyPi, using pip, or by building the library from source.

Installation using pip

To install the pulsar-client library as a pre-built package using the pip package manager:

  1. $ pip install pulsar-client==2.4.0

Installation via PyPi is available for the following Python versions:

PlatformSupported Python versions
MacOS 10.11 (El Capitan) — 10.12 (Sierra) — 10.13 (High Sierra) — 10.14 (Mojave)2.7, 3.7
Linux2.7, 3.4, 3.5, 3.6, 3.7

Installing from source

To install the pulsar-client library by building from source, follow these instructions and compile the Pulsar C++ client library. That will also build the Python binding for the library.

To install the built Python bindings:

  1. $ git clone https://github.com/apache/pulsar
  2. $ cd pulsar/pulsar-client-cpp/python
  3. $ sudo python setup.py install

API Reference

The complete Python API reference is available at api/python.

Examples

Below you'll find a variety of Python code examples for the pulsar-client library.

Producer example

This creates a Python producer for the my-topic topic and send 10 messages on that topic:

  1. import pulsar
  2. client = pulsar.Client('pulsar://localhost:6650')
  3. producer = client.create_producer('my-topic')
  4. for i in range(10):
  5. producer.send(('Hello-%d' % i).encode('utf-8'))
  6. client.close()

Consumer example

This creates a consumer with the my-subscription subscription on the my-topic topic, listen for incoming messages, print the content and ID of messages that arrive, and acknowledge each message to the Pulsar broker:

  1. consumer = client.subscribe('my-topic', 'my-subscription')
  2. while True:
  3. msg = consumer.receive()
  4. try:
  5. print("Received message '{}' id='{}'".format(msg.data(), msg.message_id()))
  6. # Acknowledge successful processing of the message
  7. consumer.acknowledge(msg)
  8. except:
  9. # Message failed to be processed
  10. consumer.negative_acknowledge(msg)
  11. client.close()

Reader interface example

You can use the Pulsar Python API to use the Pulsar reader interface. Here's an example:

  1. # MessageId taken from a previously fetched message
  2. msg_id = msg.message_id()
  3. reader = client.create_reader('my-topic', msg_id)
  4. while True:
  5. msg = reader.read_next()
  6. print("Received message '{}' id='{}'".format(msg.data(), msg.message_id()))
  7. # No acknowledgment

Schema

Declaring and validating schema

A schema can be declared by passing a class that inheritsfrom pulsar.schema.Record and defines the fields asclass variables. For example:

  1. from pulsar.schema import *
  2. class Example(Record):
  3. a = String()
  4. b = Integer()
  5. c = Boolean()

With this simple schema definition we can then create producers,consumers and readers instances that will be referring to that.

  1. producer = client.create_producer(
  2. topic='my-topic',
  3. schema=AvroSchema(Example) )
  4. producer.send(Example(a='Hello', b=1))

When the producer is created, the Pulsar broker will validate thatthe existing topic schema is indeed of "Avro" type and that theformat is compatible with the schema definition of the Exampleclass.

If there is a mismatch, the producer creation will raise anexception.

Once a producer is created with a certain schema definition,it will only accept objects that are instances of the declaredschema class.

Similarly, for a consumer/reader, the consumer will return anobject, instance of the schema record class, rather than the rawbytes:

  1. consumer = client.subscribe(
  2. topic='my-topic',
  3. subscription_name='my-subscription',
  4. schema=AvroSchema(Example) )
  5. while True:
  6. msg = consumer.receive()
  7. ex = msg.value()
  8. try:
  9. print("Received message a={} b={} c={}".format(ex.a, ex.b, ex.c))
  10. # Acknowledge successful processing of the message
  11. consumer.acknowledge(msg)
  12. except:
  13. # Message failed to be processed
  14. consumer.negative_acknowledge(msg)

Supported schema types

There are different builtin schema types that can be used in Pulsar.All the definitions are in the pulsar.schema package.

SchemaNotes
BytesSchemaGet the raw payload as a bytes object. No serialization/deserialization are performed. This is the default schema mode
StringSchemaEncode/decode payload as a UTF-8 string. Uses str objects
JsonSchemaRequire record definition. Serializes the record into standard JSON payload
AvroSchemaRequire record definition. Serializes in AVRO format

Schema definition reference

The schema definition is done through a class that inherits frompulsar.schema.Record.

This class can have a number of fields which can be of eitherpulsar.schema.Field type or even another nested Record. All thefields are also specified in the pulsar.schema package. The fieldsare matching the AVRO fields types.

Field TypePython TypeNotes
Booleanbool
Integerint
Longint
Floatfloat
Doublefloat
Bytesbytes
Stringstr
ArraylistNeed to specify record type for items
MapdictKey is always String. Need to specify value type

Additionally, any Python Enum type can be used as a valid fieldtype

Fields parameters

When adding a field these parameters can be used in the constructor:

ArgumentDefaultNotes
defaultNoneSet a default value for the field. Eg: a = Integer(default=5)
requiredFalseMark the field as "required". This will set it in the schema accordingly.

Schema definition examples

Simple definition
  1. class Example(Record):
  2. a = String()
  3. b = Integer()
  4. c = Array(String())
  5. i = Map(String())
Using enums
  1. from enum import Enum
  2. class Color(Enum):
  3. red = 1
  4. green = 2
  5. blue = 3
  6. class Example(Record):
  7. name = String()
  8. color = Color
Complex types
  1. class MySubRecord(Record):
  2. x = Integer()
  3. y = Long()
  4. z = String()
  5. class Example(Record):
  6. a = String()
  7. sub = MySubRecord()