Running on Kubernetes

This document describes how to run HStreamDB on Kubernetes using the specs that we provide. It assumes basic Kubernetes knowledge. By the end of this section, you’ll have a fully running HStreamDB cluster on Kubernetes that is ready to receive reads and writes, process data, and so on.

Building your Kubernetes Cluster

The first step is to have a running Kubernetes cluster. You can use a managed cluster (provided by your cloud provider), a self-hosted cluster, or a local Kubernetes cluster created with a tool like minikube. Make sure that kubectl points to the cluster you’re planning to use.

You also need a StorageClass named hstream-store. You can create it with kubectl or, if your cloud provider offers one, through its web console.
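As a minimal sketch of creating the StorageClass with kubectl, the snippet below writes a manifest to a file. Only the name hstream-store comes from this guide. The file name, the reclaim policy, the binding mode, and above all the provisioner (here the minikube hostpath provisioner) are assumptions for illustration, so substitute whatever your cluster provides.

```shell
# Write a StorageClass manifest named hstream-store.
# ASSUMPTION: the provisioner below is the minikube hostpath provisioner;
# replace it with the provisioner your cluster actually offers.
cat > hstream-store-sc.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hstream-store
provisioner: k8s.io/minikube-hostpath
reclaimPolicy: Delete
volumeBindingMode: Immediate
EOF

# Then create it and confirm it exists (both need access to a cluster):
#   kubectl apply -f hstream-store-sc.yaml
#   kubectl get storageclass hstream-store
```

The apply step is left as a comment so that the manifest can be reviewed before anything is sent to the cluster.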

Install ZooKeeper

HStreamDB depends on ZooKeeper for storing query information and some internal storage configuration, so we need to provision a ZooKeeper ensemble that HStreamDB can access. For this demo, we will use Helm (a package manager for Kubernetes) to install ZooKeeper. After installing Helm, run:

  helm repo add bitnami https://charts.bitnami.com/bitnami
  helm repo update
  helm install zookeeper bitnami/zookeeper \
    --set image.tag=3.6.3 \
    --set replicaCount=3 \
    --set persistence.storageClass=hstream-store \
    --set persistence.size=20Gi

  NAME: zookeeper
  LAST DEPLOYED: Tue Jul 6 10:51:37 2021
  NAMESPACE: test
  STATUS: deployed
  REVISION: 1
  TEST SUITE: None
  NOTES:
  ** Please be patient while the chart is being deployed **

  ZooKeeper can be accessed via port 2181 on the following DNS name from within your cluster:

      zookeeper.svc.cluster.local

  To connect to your ZooKeeper server run the following commands:

      export POD_NAME=$(kubectl get pods -l "app.kubernetes.io/name=zookeeper,app.kubernetes.io/instance=zookeeper,app.kubernetes.io/component=zookeeper" -o jsonpath="{.items[0].metadata.name}")
      kubectl exec -it $POD_NAME -- zkCli.sh

  To connect to your ZooKeeper server from outside the cluster execute the following commands:

      kubectl port-forward svc/zookeeper 2181:2181 &
      zkCli.sh 127.0.0.1:2181

  WARNING: Rolling tag detected (bitnami/zookeeper:3.6.3), please note that it is strongly recommended to avoid using rolling tags in a production environment.
  +info https://docs.bitnami.com/containers/how-to/understand-rolling-tags-containers/

This installs a three-node ZooKeeper ensemble. Wait until all three pods are marked as ready:

  kubectl get pods

  NAME          READY   STATUS    RESTARTS   AGE
  zookeeper-0   1/1     Running   0          22h
  zookeeper-1   1/1     Running   0          4d22h
  zookeeper-2   1/1     Running   0          16m
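Instead of polling kubectl get pods by hand, the wait can be scripted. The sketch below wraps kubectl wait in a hypothetical helper, wait_for_zookeeper. The label selector is the one printed in the chart notes, while the helper name and the default timeout of 300s are assumptions made for this example.

```shell
# Block until every ZooKeeper pod reports Ready, or give up after a timeout.
# The label selector is taken from the Helm chart notes; the default
# timeout of 300s is an arbitrary choice for this sketch.
wait_for_zookeeper() {
  wait_timeout="${1:-300s}"
  kubectl wait pod \
    -l app.kubernetes.io/name=zookeeper \
    --for=condition=Ready \
    --timeout="$wait_timeout"
}

# Usage (needs kubectl pointed at your cluster):
#   wait_for_zookeeper        # default timeout
#   wait_for_zookeeper 10m    # custom timeout
```

Note that kubectl wait only considers pods that already exist when it is invoked, so run it after the chart has created all three pods.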

Configuring and Starting HStreamDB

Once all the ZooKeeper pods are ready, we can start installing the HStreamDB cluster.

Fetching The K8s Specs

  git clone git@github.com:hstreamdb/hstream.git
  cd hstream/k8s

Update Configuration

If you installed ZooKeeper a different way, make sure to update the ZooKeeper connection string in the storage config file config.json and in the server service file hstream-server.yaml.

It should look something like this:

  $ cat config.json | grep -A 2 zookeeper
    "zookeeper": {
      "zookeeper_uri": "ip://zookeeper-0.zookeeper-headless:2181,zookeeper-1.zookeeper-headless:2181,zookeeper-2.zookeeper-headless:2181",
      "timeout": "30s"
    }
  $ cat hstream-server.yaml | grep -A 1 zkuri
    - "--zkuri"
    - "zookeeper-0.zookeeper-headless:2181,zookeeper-1.zookeeper-headless:2181,zookeeper-2.zookeeper-headless:2181"
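The connection string follows a regular pattern: one host:port entry per ZooKeeper pod, joined by commas. If your ensemble has a different size or name, a small helper can generate the string rather than typing it by hand. The function name zk_connect_string and its optional arguments are hypothetical; the defaults reproduce the values shown above.

```shell
# Build "zookeeper-0.<svc>:2181,zookeeper-1.<svc>:2181,..." for N replicas.
# Arguments: replica count, StatefulSet name, headless service name, port.
zk_connect_string() {
  replicas="$1"
  name="${2:-zookeeper}"
  service="${3:-zookeeper-headless}"
  port="${4:-2181}"
  result=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Append a comma before every entry except the first.
    result="${result}${result:+,}${name}-${i}.${service}:${port}"
    i=$((i + 1))
  done
  printf '%s\n' "$result"
}

# Prints the same value as the --zkuri argument shown above.
zk_connect_string 3
```

For config.json, prefix the generated string with ip:// as in the zookeeper_uri value above.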

Tips

The ZooKeeper connection string in the storage config file and the one in the service file can differ, but in a normal setup they are the same.

By default, this spec installs a three-node HStream server cluster and a four-node storage cluster. If you want a bigger cluster, edit hstream-server.yaml and logdevice-statefulset.yaml and increase the number of replicas to the number of nodes you want in the cluster. Also by default, we attach 40GB of persistent storage to the nodes; if you want more, change the size under the volumeClaimTemplates section.
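Changing the replica count can be done in any editor. As one possible shortcut, the hypothetical set_replicas helper below rewrites the replicas field with sed. It assumes each manifest contains a single line of the form "replicas: <number>", which has not been verified against the actual spec files, so review the resulting diff before applying it.

```shell
# Rewrite every "replicas: <number>" line in a manifest to a new count.
# ASSUMPTION: the manifest contains exactly one replicas field; check the
# result with "git diff" before applying it.
set_replicas() {
  file="$1"
  count="$2"
  sed -E "s/^([[:space:]]*replicas:)[[:space:]]*[0-9]+/\1 ${count}/" "$file" > "${file}.tmp" &&
    mv "${file}.tmp" "$file"
}

# Usage, run from hstream/k8s (the counts are examples only):
#   set_replicas hstream-server.yaml 5
#   set_replicas logdevice-statefulset.yaml 6
```

Writing to a temporary file and moving it into place avoids relying on sed -i, whose syntax differs between GNU and BSD sed.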

Starting the Cluster

  kubectl apply -k .

When you run kubectl get pods, you should see something like this:

  NAME                                                 READY   STATUS    RESTARTS   AGE
  hstream-server-deployment-765c84c489-94nqd           1/1     Running   0          6d18h
  hstream-server-deployment-765c84c489-jrm5p           1/1     Running   0          6d18h
  hstream-server-deployment-765c84c489-jxsjd           1/1     Running   0          6d18h
  logdevice-0                                          1/1     Running   0          6d18h
  logdevice-1                                          1/1     Running   0          6d18h
  logdevice-2                                          1/1     Running   0          6d18h
  logdevice-3                                          1/1     Running   0          6d18h
  logdevice-admin-server-deployment-5c5fb9f8fb-27jlk   1/1     Running   0          6d18h
  zookeeper-0                                          1/1     Running   0          6d22h
  zookeeper-1                                          1/1     Running   0          10d
  zookeeper-2                                          1/1     Running   0          6d
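To block until the workloads are up rather than re-running kubectl get pods, kubectl rollout status can be used. The sketch below is a hypothetical helper. The workload names statefulset/logdevice and deployment/hstream-server-deployment are inferred from the pod names in the listing, not read from the spec files, and the 10m timeout is an arbitrary choice.

```shell
# Wait until the storage nodes and the HStream servers have rolled out.
# ASSUMPTION: the workload names are inferred from the pod names above
# (logdevice-N and hstream-server-deployment-...); adjust if yours differ.
wait_for_hstream() {
  kubectl rollout status statefulset/logdevice --timeout=10m &&
    kubectl rollout status deployment/hstream-server-deployment --timeout=10m
}

# Usage (needs kubectl pointed at your cluster):
#   wait_for_hstream && echo "cluster pods are ready"
```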

Bootstrapping the Storage Cluster

Once all the logdevice pods are running and ready, you’ll need to bootstrap the cluster to enable all the nodes. To do that, run:

  kubectl run hadmin -it --rm --restart=Never --image=hstreamdb/hstream -- \
    hadmin --host logdevice-admin-server-service \
    nodes-config \
    bootstrap --metadata-replicate-across 'node:3'

This starts a hadmin pod that connects to the admin server and invokes the nodes-config bootstrap command, which sets the cluster’s metadata replication property so that metadata is replicated across three different nodes. On success, you should see something like:

  Successfully bootstrapped the cluster
  pod "hadmin" deleted

Managing the Storage Cluster

To manage the storage cluster, first start a hadmin pod with an interactive shell:

  kubectl run hadmin -it --rm --restart=Never --image=hstreamdb/hstream -- bash

Now you can run hadmin to manage the cluster:

  hadmin --help

To check the state of the cluster, you can then run:

  hadmin --host logdevice-admin-server-service status

  +----+-------------+-------+---------------+
  | ID | NAME        | STATE | HEALTH STATUS |
  +----+-------------+-------+---------------+
  | 0  | logdevice-0 | ALIVE | HEALTHY       |
  | 1  | logdevice-1 | ALIVE | HEALTHY       |
  | 2  | logdevice-2 | ALIVE | HEALTHY       |
  | 3  | logdevice-3 | ALIVE | HEALTHY       |
  +----+-------------+-------+---------------+
  Took 2.567s
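For scripted health checks, the status table can be parsed instead of read by eye. The sketch below defines a hypothetical all_nodes_healthy filter that reads the table on stdin and succeeds only if every node row shows ALIVE and HEALTHY. It assumes the column order in the sample output above (ID, NAME, STATE, HEALTH STATUS); whether hadmin offers a machine-readable output format is not covered here.

```shell
# Read a "hadmin status" table on stdin and succeed only if every node row
# reports ALIVE and HEALTHY. Rows are recognised by a numeric ID column.
# ASSUMPTION: columns are ID | NAME | STATE | HEALTH STATUS, as shown above.
all_nodes_healthy() {
  awk -F'|' '
    {
      id = $2; state = $4; health = $5
      # Strip the padding spaces around each cell.
      gsub(/ /, "", id); gsub(/ /, "", state); gsub(/ /, "", health)
    }
    id ~ /^[0-9]+$/ {
      nodes++
      if (state != "ALIVE" || health != "HEALTHY") bad++
    }
    # Fail when no node rows were seen or when any row was not healthy.
    END { exit ((nodes > 0 && bad == 0) ? 0 : 1) }
  '
}

# Usage, from inside the hadmin pod:
#   hadmin --host logdevice-admin-server-service status | all_nodes_healthy \
#     && echo "all storage nodes healthy"
```

Comparing the cells exactly, rather than searching for a substring, keeps a value such as UNHEALTHY from being mistaken for HEALTHY.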