Deploying Debezium on OpenShift

Debezium Deployment

To set up Apache Kafka and Kafka Connect on OpenShift, use the set of images that are provided by the Strimzi project. These images offer “Kafka as a Service” by providing enterprise grade configuration files and images that bring Kafka to Kubernetes and OpenShift, as well as Kubernetes operators for running Kafka there.

Prerequisites

  • The OpenShift command line interface (oc) is installed.

  • Docker is installed.

Procedure

  1. In your OpenShift project, enter the following commands to install the operators and templates for the Kafka broker and Kafka Connect:

    1. export STRIMZI_VERSION=0.18.0
    2. git clone -b $STRIMZI_VERSION https://github.com/strimzi/strimzi-kafka-operator
    3. cd strimzi-kafka-operator
    4. # Switch to an admin user to create security objects as part of installation:
    5. oc login -u system:admin
    6. oc create -f install/cluster-operator && oc create -f examples/templates/cluster-operator

    To learn more about setting up Apache Kafka with Strimzi on Kubernetes and OpenShift, see Strimzi deployment of Kafka.

  2. Deploy a Kafka broker cluster:

    1. # Deploy an ephemeral single instance Kafka broker:
    2. oc process strimzi-ephemeral -p CLUSTER_NAME=broker -p ZOOKEEPER_NODE_COUNT=1 -p KAFKA_NODE_COUNT=1 -p KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 -p KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR=1 | oc apply -f -
  3. Create a Kafka Connect image with the Debezium connectors installed:

    1. Download and extract the archive for each Debezium connector you want to run. For example:

      1. curl https://repo1.maven.org/maven2/io/debezium/debezium-connector-mysql/1.4.2.Final/debezium-connector-mysql-1.4.2.Final-plugin.tar.gz tar xvz`
    2. Create a Dockerfile that uses a Strimzi Kafka image as the base image. The following example creates a plugins/debezium directory, which would contain a directory for each Debezium connector that you want to run. To run more than one Debezium connector, insert a COPY line for each connector.

      1. FROM strimzi/kafka:0.18.0-kafka-2.5.0
      2. USER root:root
      3. RUN mkdir -p /opt/kafka/plugins/debezium
      4. COPY ./debezium-connector-mysql/ /opt/kafka/plugins/debezium/
      5. USER 1001

      Before Kafka Connect starts running the connector, Kafka Connect loads any third-party plug-ins that are in the /opt/kafka/plugins directory.

    3. Build a Debezium image from your Dockerfile and push it to your preferred container registry, for example, quay.io or Docker Hub, by executing the following commands. Replace debezium-community with the name of your Docker Hub organization.

      1. export DOCKER_ORG=debezium-community
      2. docker build . -t ${DOCKER_ORG}/connect-debezium
      3. docker push ${DOCKER_ORG}/connect-debezium

      After a while all parts should be up and running:

      1. oc get pods
      2. NAME READY STATUS RESTARTS AGE
      3. broker-entity-operator-5fb7bc8b9b-r86nz 3/3 Running 1 4m
      4. broker-kafka-0 2/2 Running 0 4m
      5. broker-zookeeper-0 2/2 Running 0 5m
      6. debezium-connect-3-4sdjr 1/1 Running 0 1m
      7. strimzi-cluster-operator-d77476b8f-rblqf 1/1 Running 0 5m

      Alternatively, go to the “Pods” view of your OpenShift Web Console (https://myhost:8443/console/project/myproject/browse/pods) to confirm that all pods are up and running:

      openshift pods

Verifying the Deployment

Verify whether the deployment is correct by emulating the Debezium Tutorial in the OpenShift environment.

  1. Start a MySQL server instance that contains some example tables:

    1. # Deploy pre-populated MySQL instance
    2. oc new-app --name=mysql debezium/example-mysql:1.4
    3. # Configure credentials for the database
    4. oc set env dc/mysql MYSQL_ROOT_PASSWORD=debezium MYSQL_USER=mysqluser MYSQL_PASSWORD=mysqlpw

    A new pod with MySQL server should be up and running:

    1. oc get pods
    2. NAME READY STATUS RESTARTS AGE
    3. ...
    4. mysql-1-4503l 1/1 Running 0 2s
    5. mysql-1-deploy 1/1 Running 0 4s
    6. ...
  2. Register the Debezium MySQL connector to run against the deployed MySQL instance:

    1. oc exec -i -c kafka broker-kafka-0 -- curl -X POST \
    2. -H "Accept:application/json" \
    3. -H "Content-Type:application/json" \
    4. http://debezium-connect-api:8083/connectors -d @- <<'EOF'
    5. {
    6. "name": "inventory-connector",
    7. "config": {
    8. "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    9. "tasks.max": "1",
    10. "database.hostname": "mysql",
    11. "database.port": "3306",
    12. "database.user": "debezium",
    13. "database.password": "dbz",
    14. "database.server.id": "184054",
    15. "database.server.name": "dbserver1",
    16. "database.include.list": "inventory",
    17. "database.history.kafka.bootstrap.servers": "broker-kafka-bootstrap:9092",
    18. "database.history.kafka.topic": "schema-changes.inventory"
    19. }
    20. }
    21. EOF

    Kafka Connect’s log file should contain messages regarding execution of the initial snapshot:

    1. oc logs $(oc get pods -o name -l strimzi.io/name=debezium-connect)
  3. Read change events for the customers table from the corresponding Kafka topic:

    1. oc exec -it broker-kafka-0 -- /opt/kafka/bin/kafka-console-consumer.sh \
    2. --bootstrap-server localhost:9092 \
    3. --from-beginning \
    4. --property print.key=true \
    5. --topic dbserver1.inventory.customers

    You should see an output like the following (formatted for the sake of readability):

    1. # Message 1
    2. {
    3. "id": 1001
    4. }
    5. # Message 1 Value
    6. {
    7. "before": null,
    8. "after": {
    9. "id": 1001,
    10. "first_name": "Sally",
    11. "last_name": "Thomas",
    12. "email": "sally.thomas@acme.com"
    13. },
    14. "source": {
    15. "version": "1.4.2.Final",
    16. "connector": "mysql",
    17. "name": "dbserver1",
    18. "server_id": 0,
    19. "ts_sec": 0,
    20. "gtid": null,
    21. "file": "mysql-bin.000003",
    22. "pos": 154,
    23. "row": 0,
    24. "snapshot": true,
    25. "thread": null,
    26. "db": "inventory",
    27. "table": "customers"
    28. },
    29. "op": "c",
    30. "ts_ms": 1509530901446
    31. }
    32. # Message 2 Key
    33. {
    34. "id": 1002
    35. }
    36. # Message 2 Value
    37. {
    38. "before": null,
    39. "after": {
    40. "id": 1002,
    41. "first_name": "George",
    42. "last_name": "Bailey",
    43. "email": "gbailey@foobar.com"
    44. },
    45. "source": {
    46. "version": "1.4.2.Final",
    47. "connector": "mysql",
    48. "name": "dbserver1",
    49. "server_id": 0,
    50. "ts_sec": 0,
    51. "gtid": null,
    52. "file": "mysql-bin.000003",
    53. "pos": 154,
    54. "row": 0,
    55. "snapshot": true,
    56. "thread": null,
    57. "db": "inventory",
    58. "table": "customers"
    59. },
    60. "op": "c",
    61. "ts_ms": 1509530901446
    62. }
    63. ...
  4. Modify some records in the customers table of the database:

    1. oc exec -it $(oc get pods -o custom-columns=NAME:.metadata.name --no-headers -l app=mysql) \
    2. -- bash -c 'mysql -u $MYSQL_USER -p$MYSQL_PASSWORD inventory'
    3. # For example, run UPDATE customers SET email="sally.thomas@example.com" WHERE ID = 1001;

    You should now see additional change messages in the consumer started previously.

If you have any questions or requests related to running Debezium on Kubernetes or OpenShift, let us know in our user group or in the Debezium developer’s chat.