Deploying Debezium on OpenShift

Prerequisites

To keep containers separated from other workloads on the cluster, create a dedicated project for Debezium. In the remainder of this document, the debezium-example namespace will be used:

  1. $ oc new-project debezium-example

Deploying Strimzi Operator

For the Debezium deployment we will use the Strimzi project, which manages the Kafka deployment on OpenShift clusters. The simplest way for installing Strimzi is to install the Strimzi operator from OperatorHub. Navigate to the “OperatorHub” tab in the OpenShift UI, select “Strimzi” and click the “install” button.

openshift strimzi operator

If you prefer command line tools, you can install the Strimzi operator this way as well:

  1. $ cat << EOF | oc create -f -
  2. apiVersion: operators.coreos.com/v1alpha1
  3. kind: Subscription
  4. metadata:
  5. name: my-strimzi-kafka-operator
  6. namespace: openshift-operators
  7. spec:
  8. channel: stable
  9. name: strimzi-kafka-operator
  10. source: operatorhubio-catalog
  11. sourceNamespace: olm
  12. EOF

Creating Secrets for the Database

Later on, when deploying Debezium Kafka connector, we will need to provide username and password for the connector to be able to connect to the database. For security reasons, it’s a good practice not to provide the credentials directly, but keep them in a separate secured place. OpenShift provides Secret object for this purpose. Besides creating Secret object itself, we have to also create a role and a role binding so that Kafka can access the credentials.

Let’s create Secret object first:

  1. $ cat << EOF | oc create -f -
  2. apiVersion: v1
  3. kind: Secret
  4. metadata:
  5. name: debezium-secret
  6. namespace: debezium-example
  7. type: Opaque
  8. data:
  9. username: ZGViZXppdW0=
  10. password: ZGJ6
  11. EOF

The username and password contain base64-encoded credentials (debezium/dbz) for connecting to the MySQL database, which we will deploy later.

Now, we can create a role, which refers secret created in the previous step:

  1. $ cat << EOF | oc create -f -
  2. apiVersion: rbac.authorization.k8s.io/v1
  3. kind: Role
  4. metadata:
  5. name: connector-configuration-role
  6. namespace: debezium-example
  7. rules:
  8. - apiGroups: [""]
  9. resources: ["secrets"]
  10. resourceNames: ["debezium-secret"]
  11. verbs: ["get"]
  12. EOF

We also have to bind this role to the Kafka Connect cluster service account so that Kafka Connect can access the secret:

  1. $ cat << EOF | oc create -f -
  2. apiVersion: rbac.authorization.k8s.io/v1
  3. kind: RoleBinding
  4. metadata:
  5. name: connector-configuration-role-binding
  6. namespace: debezium-example
  7. subjects:
  8. - kind: ServiceAccount
  9. name: debezium-connect-cluster-connect
  10. namespace: debezium-example
  11. roleRef:
  12. kind: Role
  13. name: connector-configuration-role
  14. apiGroup: rbac.authorization.k8s.io
  15. EOF

The service account will be create by Strimzi once we deploy Kafka Connect. The name of the service account take form $KafkaConnectName-connect. Later on, we will create Kafka Connect cluster named debezium-connect-cluster and therefore we used debezium-connect-cluster-connect here as a subjects.name.

Deploying Apache Kafka

Next, deploy a (single-node) Kafka cluster:

  1. $ cat << EOF | oc create -n debezium-example -f -
  2. apiVersion: kafka.strimzi.io/v1beta2
  3. kind: Kafka
  4. metadata:
  5. name: debezium-cluster
  6. spec:
  7. kafka:
  8. replicas: 1
  9. listeners:
  10. - name: plain
  11. port: 9092
  12. type: internal
  13. tls: false
  14. - name: tls
  15. port: 9093
  16. type: internal
  17. tls: true
  18. authentication:
  19. type: tls
  20. - name: external
  21. port: 9094
  22. type: nodeport
  23. tls: false
  24. storage:
  25. type: jbod
  26. volumes:
  27. - id: 0
  28. type: persistent-claim
  29. size: 100Gi
  30. deleteClaim: false
  31. config:
  32. offsets.topic.replication.factor: 1
  33. transaction.state.log.replication.factor: 1
  34. transaction.state.log.min.isr: 1
  35. default.replication.factor: 1
  36. min.insync.replicas: 1
  37. zookeeper:
  38. replicas: 1
  39. storage:
  40. type: persistent-claim
  41. size: 100Gi
  42. deleteClaim: false
  43. entityOperator:
  44. topicOperator: {}
  45. userOperator: {}
  46. EOF
  • Wait until it’s ready:
  1. $ oc wait kafka/debezium-cluster --for=condition=Ready --timeout=300s

Deploying a Data Source

As a data source, MySQL will be used in the following. Besides running a pod with MySQL, an appropriate service which will point to the pod with DB itself is needed. It can be created e.g. as follows:

  1. $ cat << EOF | oc create -f -
  2. apiVersion: v1
  3. kind: Service
  4. metadata:
  5. name: mysql
  6. spec:
  7. ports:
  8. - port: 3306
  9. selector:
  10. app: mysql
  11. clusterIP: None
  12. ---
  13. apiVersion: apps/v1
  14. kind: Deployment
  15. metadata:
  16. name: mysql
  17. spec:
  18. selector:
  19. matchLabels:
  20. app: mysql
  21. strategy:
  22. type: Recreate
  23. template:
  24. metadata:
  25. labels:
  26. app: mysql
  27. spec:
  28. containers:
  29. - image: quay.io/debezium/example-mysql:2.1
  30. name: mysql
  31. env:
  32. - name: MYSQL_ROOT_PASSWORD
  33. value: debezium
  34. - name: MYSQL_USER
  35. value: mysqluser
  36. - name: MYSQL_PASSWORD
  37. value: mysqlpw
  38. ports:
  39. - containerPort: 3306
  40. name: mysql
  41. EOF

Deploying a Debezium Connector

To deploy a Debezium connector, you need to deploy a Kafka Connect cluster with the required connector plug-in(s), before instantiating the actual connector itself. As the first step, a container image for Kafka Connect with the plug-in has to be created. If you already have a container image built and available in the registry, you can skip this step. In this document, the MySQL connector will be used as an example.

Creating Kafka Connect Cluster

Again, we will use Strimzi for creating the Kafka Connect cluster. Strimzi also can be used for building and pushing the required container image for us. In fact, both tasks can be merged together and instructions for building the container image can be provided directly within the KafkaConnect object specification:

  1. $ cat << EOF | oc create -f -
  2. apiVersion: kafka.strimzi.io/v1beta2
  3. kind: KafkaConnect
  4. metadata:
  5. name: debezium-connect-cluster
  6. annotations:
  7. strimzi.io/use-connector-resources: "true"
  8. spec:
  9. version: 3.1.0
  10. replicas: 1
  11. bootstrapServers: debezium-cluster-kafka-bootstrap:9092
  12. config:
  13. config.providers: secrets
  14. config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider
  15. group.id: connect-cluster
  16. offset.storage.topic: connect-cluster-offsets
  17. config.storage.topic: connect-cluster-configs
  18. status.storage.topic: connect-cluster-status
  19. # -1 means it will use the default replication factor configured in the broker
  20. config.storage.replication.factor: -1
  21. offset.storage.replication.factor: -1
  22. status.storage.replication.factor: -1
  23. build:
  24. output:
  25. type: docker
  26. image: image-registry.openshift-image-registry.svc:5000/debezium-example/debezium-connect-mysql:latest
  27. plugins:
  28. - name: debezium-mysql-connector
  29. artifacts:
  30. - type: tgz
  31. url: https://repo1.maven.org/maven2/io/debezium/debezium-connector-mysql/{debezium-version}/debezium-connector-mysql-{debezium-version}-plugin.tar.gz
  32. EOF

Here we took the advantage of the OpenShift built-in registry, already running as a service on the OpenShift cluster.

For simplicity, we’ve skipped the checksum validation for the downloaded artifact. If you want to be sure the artifact was correctly downloaded, specify its checksum via the sha512sum attribute. See the Strimzi documentation for more details.

If you already have a suitable container image either in the local or a remote registry (such as quay.io or DockerHub), you can use this simplified version:

  1. $ cat << EOF | oc create -f -
  2. apiVersion: kafka.strimzi.io/v1beta2
  3. kind: KafkaConnect
  4. metadata:
  5. name: debezium-connect-cluster
  6. annotations:
  7. strimzi.io/use-connector-resources: "true"
  8. spec:
  9. version: 3.1.0
  10. image: 10.110.154.103/debezium-connect-mysql:latest
  11. replicas: 1
  12. bootstrapServers: debezium-cluster-kafka-bootstrap:9092
  13. config:
  14. config.providers: secrets
  15. config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider
  16. group.id: connect-cluster
  17. offset.storage.topic: connect-cluster-offsets
  18. config.storage.topic: connect-cluster-configs
  19. status.storage.topic: connect-cluster-status
  20. # -1 means it will use the default replication factor configured in the broker
  21. config.storage.replication.factor: -1
  22. offset.storage.replication.factor: -1
  23. status.storage.replication.factor: -1
  24. EOF

You can also note, that we have configured the secret provider to use Strimzi secret provider Strimzi secret provider will create service account for this Kafka Connect cluster (and which we have already bound to the appropriate role), and allow Kafka Connect to access our Secret object.

Before creating a Debezium connector, check that all pods are already running:

openshift pods

Creating a Debezium Connector

To create a Debezium connector, you just need to create a KafkaConnector with the appropriate configuration, MySQL in this case:

  1. $ cat << EOF | oc create -f -
  2. apiVersion: kafka.strimzi.io/v1beta2
  3. kind: KafkaConnector
  4. metadata:
  5. name: debezium-connector-mysql
  6. labels:
  7. strimzi.io/cluster: debezium-connect-cluster
  8. spec:
  9. class: io.debezium.connector.mysql.MySqlConnector
  10. tasksMax: 1
  11. config:
  12. tasks.max: 1
  13. database.hostname: mysql
  14. database.port: 3306
  15. database.user: ${secrets:debezium-example/debezium-secret:username}
  16. database.password: ${secrets:debezium-example/debezium-secret:password}
  17. database.server.id: 184054
  18. topic.prefix: mysql
  19. database.include.list: inventory
  20. schema.history.internal.kafka.bootstrap.servers: debezium-cluster-kafka-bootstrap:9092
  21. schema.history.internal.kafka.topic: schema-changes.inventory
  22. EOF

As you can note, we didn’t use plain text user name and password in the connector configuration, but refer to Secret object we created previously.

Verifying the Deployment

To verify the everything works fine, you can e.g. start watching mysql.inventory.customers Kafka topic:

  1. $ oc run -n debezium-example -it --rm --image=quay.io/debezium/tooling:1.2 --restart=Never watcher -- kcat -b debezium-cluster-kafka-bootstrap:9092 -C -o beginning -t mysql.inventory.customers

Connect to the MySQL database:

  1. $ oc run -n debezium-example -it --rm --image=mysql:8.0 --restart=Never --env MYSQL_ROOT_PASSWORD=debezium mysqlterm -- mysql -hmysql -P3306 -uroot -pdebezium

Do some changes in the customers table:

  1. sql> update customers set first_name="Sally Marie" where id=1001;

You now should be able to observe the change events on the Kafka topic:

  1. {
  2. ...
  3. "payload": {
  4. "before": {
  5. "id": 1001,
  6. "first_name": "Sally",
  7. "last_name": "Thomas",
  8. "email": "sally.thomas@acme.com"
  9. },
  10. "after": {
  11. "id": 1001,
  12. "first_name": "Sally Marie",
  13. "last_name": "Thomas",
  14. "email": "sally.thomas@acme.com"
  15. },
  16. "source": {
  17. "version": "{debezium-version}",
  18. "connector": "mysql",
  19. "name": "mysql",
  20. "ts_ms": 1646300467000,
  21. "snapshot": "false",
  22. "db": "inventory",
  23. "sequence": null,
  24. "table": "customers",
  25. "server_id": 223344,
  26. "gtid": null,
  27. "file": "mysql-bin.000003",
  28. "pos": 401,
  29. "row": 0,
  30. "thread": null,
  31. "query": null
  32. },
  33. "op": "u",
  34. "ts_ms": 1646300467746,
  35. "transaction": null
  36. }
  37. }

If you have any questions or requests related to running Debezium on Kubernetes or OpenShift, then please let us know in our user group or in the Debezium developer’s chat.