Deploying a TDengine Cluster in Kubernetes

TDengine is a cloud-native time-series database that can be deployed on Kubernetes. This document gives a step-by-step description of how you can use YAML files to create a TDengine cluster and introduces common operations for TDengine in a Kubernetes environment.

Prerequisites

Before deploying TDengine on Kubernetes, ensure that the following prerequisites are met:

  • The steps in this document are compatible with Kubernetes v1.5 and later versions.
  • minikube, kubectl, and helm are installed and configured.
  • Kubernetes is installed and deployed, and can be accessed and used normally. Update any container registries or other services as necessary.

You can download the configuration files in this document from GitHub.

Configure the service

Create a service configuration file named taosd-service.yaml. Record the value of metadata.name (in this example, taosd) for use in the next step. Add the ports required by TDengine:

  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: "taosd"
    labels:
      app: "tdengine"
  spec:
    ports:
      - name: tcp6030
        protocol: "TCP"
        port: 6030
      - name: tcp6041
        protocol: "TCP"
        port: 6041
    selector:
      app: "tdengine"

Configure the service as a StatefulSet

Configure the TDengine service as a StatefulSet. Create the tdengine.yaml file and set replicas to 3. In this example, the time zone is set to Asia/Shanghai and each node is allocated 10 GiB of storage from the standard storage class. You can change the configuration based on your environment and business requirements.

  ---
  apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    name: "tdengine"
    labels:
      app: "tdengine"
  spec:
    serviceName: "taosd"
    replicas: 3
    updateStrategy:
      type: RollingUpdate
    selector:
      matchLabels:
        app: "tdengine"
    template:
      metadata:
        name: "tdengine"
        labels:
          app: "tdengine"
      spec:
        containers:
          - name: "tdengine"
            image: "tdengine/tdengine:3.0.0.0"
            imagePullPolicy: "IfNotPresent"
            ports:
              - name: tcp6030
                protocol: "TCP"
                containerPort: 6030
              - name: tcp6041
                protocol: "TCP"
                containerPort: 6041
            env:
              # POD_NAME for FQDN config
              - name: POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              # SERVICE_NAME and NAMESPACE for FQDN resolution
              - name: SERVICE_NAME
                value: "taosd"
              - name: STS_NAME
                value: "tdengine"
              - name: STS_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
              # TZ for time zone settings; we recommend always setting it.
              - name: TZ
                value: "Asia/Shanghai"
              # Variables with the TAOS_ prefix are written to taos.cfg:
              # the prefix is stripped and the name is converted to camelCase.
              - name: TAOS_SERVER_PORT
                value: "6030"
              # Must be set if you want a cluster.
              - name: TAOS_FIRST_EP
                value: "$(STS_NAME)-0.$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local:$(TAOS_SERVER_PORT)"
              # TAOS_FQDN should always be set in a Kubernetes environment.
              - name: TAOS_FQDN
                value: "$(POD_NAME).$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local"
            volumeMounts:
              - name: taosdata
                mountPath: /var/lib/taos
            readinessProbe:
              exec:
                command:
                  - taos-check
              initialDelaySeconds: 5
              timeoutSeconds: 5000
            livenessProbe:
              exec:
                command:
                  - taos-check
              initialDelaySeconds: 15
              periodSeconds: 20
    volumeClaimTemplates:
      - metadata:
          name: taosdata
        spec:
          accessModes:
            - "ReadWriteOnce"
          storageClassName: "standard"
          resources:
            requests:
              storage: "10Gi"
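
Kubernetes expands the $(VAR) references in the env section before the container starts, so each pod receives its own TAOS_FQDN while every pod shares the same TAOS_FIRST_EP. As a rough illustration, the shell sketch below reproduces that expansion by hand. The function names taos_fqdn and taos_first_ep are hypothetical helpers introduced here for illustration; they are not part of TDengine or kubectl.

```shell
# Hypothetical helper: builds the value that TAOS_FQDN expands to.
# $1 = pod name, $2 = service name, $3 = namespace
taos_fqdn() {
  printf '%s.%s.%s.svc.cluster.local\n' "$1" "$2" "$3"
}

# Hypothetical helper: builds the value that TAOS_FIRST_EP expands to.
# $1 = StatefulSet name, $2 = service name, $3 = namespace, $4 = server port
taos_first_ep() {
  printf '%s-0.%s.%s.svc.cluster.local:%s\n' "$1" "$2" "$3" "$4"
}

# Values for the second pod of this example in the default namespace
taos_fqdn tdengine-1 taosd default         # prints tdengine-1.taosd.default.svc.cluster.local
taos_first_ep tdengine taosd default 6030  # prints tdengine-0.taosd.default.svc.cluster.local:6030
```

Because the first endpoint always points at the pod with ordinal 0, that pod acts as the entry point that the other dnodes contact when they join the cluster.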

Use kubectl to deploy TDengine

Run the following commands:

  kubectl apply -f taosd-service.yaml
  kubectl apply -f tdengine.yaml
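
kubectl apply returns as soon as the objects are accepted, which can be before the pods are running. If you script the deployment, one option is to block until the StatefulSet reports that its rollout is complete. The sketch below wraps kubectl rollout status in a hypothetical helper named wait_for_tdengine; the 300s default timeout is an arbitrary assumption, not a recommended value.

```shell
# Hypothetical helper: block until the tdengine StatefulSet has rolled out.
# $1 = optional timeout in kubectl duration syntax (assumed default: 300s)
wait_for_tdengine() {
  kubectl rollout status statefulset/tdengine --timeout="${1:-300s}"
}

# Usage: wait_for_tdengine 600s
```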

The preceding configuration generates a TDengine cluster with three nodes in which dnodes are automatically configured. You can run the show dnodes command to query the nodes in the cluster:

  kubectl exec -i -t tdengine-0 -- taos -s "show dnodes"
  kubectl exec -i -t tdengine-1 -- taos -s "show dnodes"
  kubectl exec -i -t tdengine-2 -- taos -s "show dnodes"

The output is as follows:

  taos> show dnodes
   id |            endpoint            | vnodes | support_vnodes | status  |       create_time       | note |
  ===========================================================================================================
    1 | tdengine-0.taosd.default.sv... |      0 |            256 | ready   | 2022-08-10 13:14:57.285 |      |
    2 | tdengine-1.taosd.default.sv... |      0 |            256 | ready   | 2022-08-10 13:15:11.302 |      |
    3 | tdengine-2.taosd.default.sv... |      0 |            256 | ready   | 2022-08-10 13:15:23.290 |      |
  Query OK, 3 rows in database (0.003655s)
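
Running the same query against every pod can be written as a loop. The following is a minimal sketch, assuming the pod names follow the tdengine-<ordinal> pattern used in this document; show_dnodes_all is a hypothetical helper name.

```shell
# Hypothetical helper: run "show dnodes" on every pod of the StatefulSet.
# $1 = number of replicas (3 in this document)
show_dnodes_all() {
  i=0
  while [ "$i" -lt "$1" ]; do
    echo "--- tdengine-$i ---"
    # -i and -t are omitted because a script usually has no terminal attached.
    kubectl exec "tdengine-$i" -- taos -s "show dnodes"
    i=$((i + 1))
  done
}

# Usage: show_dnodes_all 3
```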

Enable port forwarding

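The kubectl port forwarding feature allows applications on your local machine to access the TDengine cluster running on Kubernetes. The following command forwards local port 6041 to port 6041 on the tdengine-0 pod:

  kubectl port-forward tdengine-0 6041:6041 &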

Use curl to verify that the TDengine REST API is working on port 6041:

  $ curl -u root:taosdata -d "show databases" 127.0.0.1:6041/rest/sql
  Handling connection for 6041
  {"code":0,"column_meta":[["name","VARCHAR",64],["create_time","TIMESTAMP",8],["vgroups","SMALLINT",2],["ntables","BIGINT",8],["replica","TINYINT",1],["strict","VARCHAR",4],["duration","VARCHAR",10],["keep","VARCHAR",32],["buffer","INT",4],["pagesize","INT",4],["pages","INT",4],["minrows","INT",4],["maxrows","INT",4],["comp","TINYINT",1],["precision","VARCHAR",2],["status","VARCHAR",10],["retention","VARCHAR",60],["single_stable","BOOL",1],["cachemodel","VARCHAR",11],["cachesize","INT",4],["wal_level","TINYINT",1],["wal_fsync_period","INT",4],["wal_retention_period","INT",4],["wal_retention_size","BIGINT",8],["wal_roll_period","INT",4],["wal_segment_size","BIGINT",8]],"data":[["information_schema",null,null,16,null,null,null,null,null,null,null,null,null,null,null,"ready",null,null,null,null,null,null,null,null,null,null],["performance_schema",null,null,10,null,null,null,null,null,null,null,null,null,null,null,"ready",null,null,null,null,null,null,null,null,null,null]],"rows":2}
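
In the response above, "code":0 indicates that the statement succeeded. A script can use that field for a quick health check. The following sketch defines a hypothetical rest_ok helper that reads a response on standard input. It uses a simple pattern match rather than a JSON parser, so treat it as a smoke test only.

```shell
# Hypothetical helper: succeed if a TDengine REST response on stdin has "code":0.
rest_ok() {
  grep -Eq '"code"[[:space:]]*:[[:space:]]*0[,}]'
}

# Usage, with the port forwarding from this section still running:
#   curl -s -u root:taosdata -d "show databases" 127.0.0.1:6041/rest/sql | rest_ok \
#     && echo "REST API is up"
```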

Enable the dashboard for visualization

The minikube dashboard command enables visualized cluster management.

  $ minikube dashboard
  * Verifying dashboard health ...
  * Launching proxy ...
  * Verifying proxy health ...
  * Opening http://127.0.0.1:46617/api/v1/namespaces/kubernetes-dashboard/services/http:kubernetes-dashboard:/proxy/ in your default browser...
  http://127.0.0.1:46617/api/v1/namespaces/kubernetes-dashboard/services/http:kubernetes-dashboard:/proxy/

In some public clouds, minikube cannot be remotely accessed if it is bound to 127.0.0.1. In this case, use the kubectl proxy command to map the port to 0.0.0.0. Then, you can access the dashboard by using a web browser to open the dashboard URL above on the public IP address and port of the virtual machine.

  $ kubectl proxy --accept-hosts='^.*$' --address='0.0.0.0'

Scaling Out Your Cluster

New pods join the TDengine cluster automatically when you increase the number of replicas in the StatefulSet:

  kubectl scale statefulsets tdengine --replicas=4

The preceding command increases the number of replicas to 4. After running this command, query the pod status:

  kubectl get pods -l app=tdengine

The output is as follows:

  NAME         READY   STATUS    RESTARTS   AGE
  tdengine-0   1/1     Running   0          161m
  tdengine-1   1/1     Running   0          161m
  tdengine-2   1/1     Running   0          32m
  tdengine-3   1/1     Running   0          32m

All pods are in the Running state. Once the READY column of the new pod shows 1/1, you can check the dnode status:

  kubectl exec -i -t tdengine-3 -- taos -s "show dnodes"

The following output shows that the TDengine cluster has been expanded to four dnodes:

  taos> show dnodes
   id |            endpoint            | vnodes | support_vnodes | status  |       create_time       | note |
  ===========================================================================================================
    1 | tdengine-0.taosd.default.sv... |      0 |            256 | ready   | 2022-08-10 13:14:57.285 |      |
    2 | tdengine-1.taosd.default.sv... |      0 |            256 | ready   | 2022-08-10 13:15:11.302 |      |
    3 | tdengine-2.taosd.default.sv... |      0 |            256 | ready   | 2022-08-10 13:15:23.290 |      |
    4 | tdengine-3.taosd.default.sv... |      0 |            256 | ready   | 2022-08-10 13:33:16.039 |      |
  Query OK, 4 rows in database (0.008377s)
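
Instead of polling kubectl get pods by hand, a script can wait for the newest pod to become ready before it queries the dnodes. The sketch below is one way to do this with kubectl wait; wait_newest_pod_ready is a hypothetical helper name and the 120s default timeout is an assumption. The helper expects the pod to exist already, as kubectl wait may return an error for a pod that has not been created yet.

```shell
# Hypothetical helper: wait until the newest pod of the StatefulSet is ready.
# $1 = current replica count; $2 = optional timeout (assumed default: 120s)
wait_newest_pod_ready() {
  # Pod ordinals start at 0, so the newest pod is tdengine-(replicas - 1).
  kubectl wait --for=condition=Ready "pod/tdengine-$(( $1 - 1 ))" --timeout="${2:-120s}"
}

# Usage after scaling out to 4 replicas: wait_newest_pod_ready 4
```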

Scaling In Your Cluster

When you scale in a TDengine cluster, your data is migrated to different nodes. You must run the drop dnode command in TDengine to remove dnodes before scaling in your Kubernetes environment.

Note: In a Kubernetes StatefulSet, the pods with the highest ordinals (the newest pods) are always removed first. For this reason, when you scale in your TDengine cluster, ensure that you drop the newest dnodes.

  $ kubectl exec -i -t tdengine-0 -- taos -s "drop dnode 4"
  $ kubectl exec -it tdengine-0 -- taos -s "show dnodes"
  taos> show dnodes
   id |            endpoint            | vnodes | support_vnodes | status  |       create_time       | note |
  ===========================================================================================================
    1 | tdengine-0.taosd.default.sv... |      0 |            256 | ready   | 2022-08-10 13:14:57.285 |      |
    2 | tdengine-1.taosd.default.sv... |      0 |            256 | ready   | 2022-08-10 13:15:11.302 |      |
    3 | tdengine-2.taosd.default.sv... |      0 |            256 | ready   | 2022-08-10 13:15:23.290 |      |
  Query OK, 3 rows in database (0.004861s)

Verify that the dnode has been successfully removed: dnode 4 should no longer appear in the show dnodes output. Then run the following command to remove the pod:

  kubectl scale statefulsets tdengine --replicas=3

The newest pod in the StatefulSet is removed. Run the kubectl get pods -l app=tdengine command to query the pod status:

  $ kubectl get pods -l app=tdengine
  NAME         READY   STATUS    RESTARTS   AGE
  tdengine-0   1/1     Running   0          4m7s
  tdengine-1   1/1     Running   0          3m55s
  tdengine-2   1/1     Running   0          2m28s

After the pod has been removed, manually delete the PersistentVolumeClaim (PVC). Otherwise, future scale-outs will attempt to use existing data.

  $ kubectl delete pvc taosdata-tdengine-3

Your cluster has now been safely scaled in, and you can scale it out again as necessary.

  $ kubectl scale statefulsets tdengine --replicas=4
  statefulset.apps/tdengine scaled
  $ kubectl get pods -l app=tdengine
  NAME         READY   STATUS              RESTARTS   AGE
  tdengine-0   1/1     Running             0          35m
  tdengine-1   1/1     Running             0          34m
  tdengine-2   1/1     Running             0          12m
  tdengine-3   0/1     ContainerCreating   0          4s
  $ kubectl get pods -l app=tdengine
  NAME         READY   STATUS    RESTARTS   AGE
  tdengine-0   1/1     Running   0          35m
  tdengine-1   1/1     Running   0          34m
  tdengine-2   1/1     Running   0          12m
  tdengine-3   0/1     Running   0          7s
  $ kubectl exec -it tdengine-0 -- taos -s "show dnodes"
  taos> show dnodes
   id |            endpoint            | vnodes | support_vnodes | status  |       create_time       |   offline reason   |
  =========================================================================================================================
    1 | tdengine-0.taosd.default.sv... |      0 |              4 | ready   | 2022-07-25 17:38:49.012 |                    |
    2 | tdengine-1.taosd.default.sv... |      1 |              4 | ready   | 2022-07-25 17:39:01.517 |                    |
    5 | tdengine-2.taosd.default.sv... |      0 |              4 | ready   | 2022-07-25 18:01:36.479 |                    |
    6 | tdengine-3.taosd.default.sv... |      0 |              4 | ready   | 2022-07-25 18:13:54.411 |                    |
  Query OK, 4 row(s) in set (0.001348s)
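
The scale-in steps in this section (drop the dnode, reduce the replica count, delete the PVC) can be collected into one function. The following is a sketch under stated assumptions rather than a hardened script: scale_in_one is a hypothetical helper name, the 120s timeout is arbitrary, and the function does not confirm that the dnode has disappeared from show dnodes, which you should still check yourself. The dnode ID is passed separately because, as the output above shows, dnode IDs do not necessarily match pod ordinals.

```shell
# Hypothetical helper: remove the newest dnode and its pod, then free its storage.
# $1 = dnode id as shown by "show dnodes" (not necessarily the pod ordinal + 1)
# $2 = replica count after the scale-in
scale_in_one() {
  pod="tdengine-$2"   # pod with the highest ordinal, removed by the scale-in
  kubectl exec tdengine-0 -- taos -s "drop dnode $1" || return 1
  kubectl scale statefulsets tdengine --replicas="$2" || return 1
  kubectl wait --for=delete "pod/$pod" --timeout=120s || return 1
  # The PVC name is <volumeClaimTemplate name>-<pod name>.
  kubectl delete pvc "taosdata-$pod"
}

# Usage for the example in this section (dnode 4, scaling from 4 to 3 replicas):
#   scale_in_one 4 3
```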

Remove a TDengine Cluster

To fully remove a TDengine cluster, you must delete its statefulset, svc, configmap, and pvc entries:

  kubectl delete statefulset -l app=tdengine
  kubectl delete svc -l app=tdengine
  kubectl delete pvc -l app=tdengine
  kubectl delete configmap taoscfg
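
To confirm that nothing is left behind, you can list the labeled objects again. The sketch below uses a hypothetical helper named tdengine_leftovers; empty output suggests that the StatefulSet, Service, and PVCs carrying the app=tdengine label are gone. It does not cover the taoscfg ConfigMap, which is deleted by name above.

```shell
# Hypothetical helper: print the names of any remaining labeled objects.
tdengine_leftovers() {
  kubectl get statefulset,svc,pvc -l app=tdengine -o name
}

# Usage: [ -z "$(tdengine_leftovers)" ] && echo "TDengine cluster removed"
```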

Troubleshooting

Error 1

If you remove a pod without first running drop dnode, some TDengine nodes will go offline.

  $ kubectl exec -it tdengine-0 -- taos -s "show dnodes"
  taos> show dnodes
   id |            endpoint            | vnodes | support_vnodes | status  |       create_time       |   offline reason   |
  =========================================================================================================================
    1 | tdengine-0.taosd.default.sv... |      0 |              4 | ready   | 2022-07-25 17:38:49.012 |                    |
    2 | tdengine-1.taosd.default.sv... |      1 |              4 | ready   | 2022-07-25 17:39:01.517 |                    |
    5 | tdengine-2.taosd.default.sv... |      0 |              4 | offline | 2022-07-25 18:01:36.479 | status msg timeout |
    6 | tdengine-3.taosd.default.sv... |      0 |              4 | offline | 2022-07-25 18:13:54.411 | status msg timeout |
  Query OK, 4 row(s) in set (0.001323s)

Error 2

If the number of nodes after a scale-in is less than the value of the replica parameter, the cluster will go down. To reproduce this error, create a database with replica set to 2 and add data:

  kubectl exec -i -t tdengine-0 -- \
    taos -s \
    "create database if not exists test replica 2;
     use test;
     create table if not exists t1(ts timestamp, n int);
     insert into t1 values(now, 1)(now+1s, 2);"

Scale in to one node:

  kubectl scale statefulsets tdengine --replicas=1

In the TDengine CLI, you can see that no database operations succeed:

  taos> show dnodes;
   id |           end_point            | vnodes | cores | status  | role |       create_time       |   offline reason   |
  =======================================================================================================================
    1 | tdengine-0.taosd.default.sv... |      2 |    40 | ready   | any  | 2021-06-01 15:55:52.562 |                    |
    2 | tdengine-1.taosd.default.sv... |      1 |    40 | offline | any  | 2021-06-01 15:56:07.212 | status msg timeout |
  Query OK, 2 row(s) in set (0.000845s)
  taos> show dnodes;
   id |           end_point            | vnodes | cores | status  | role |       create_time       |   offline reason   |
  =======================================================================================================================
    1 | tdengine-0.taosd.default.sv... |      2 |    40 | ready   | any  | 2021-06-01 15:55:52.562 |                    |
    2 | tdengine-1.taosd.default.sv... |      1 |    40 | offline | any  | 2021-06-01 15:56:07.212 | status msg timeout |
  Query OK, 2 row(s) in set (0.000837s)
  taos> use test;
  Database changed.
  taos> insert into t1 values(now, 3);
  DB error: Unable to resolve FQDN (0.013874s)
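
A simple guard in your scale-in script can prevent this situation. The sketch below assumes that you already know the largest replica value used by any database in the cluster and pass it in by hand; can_scale_in is a hypothetical helper name, and the function performs only the comparison described above.

```shell
# Hypothetical helper: refuse a scale-in that would leave fewer nodes than replicas.
# $1 = node count after the scale-in
# $2 = largest replica value among your databases (looked up by you)
can_scale_in() {
  if [ "$1" -lt "$2" ]; then
    echo "refusing to scale in: $1 node(s) is less than replica $2" >&2
    return 1
  fi
  return 0
}

# Usage for the example above (replica 2, scaling in to 1 node):
#   can_scale_in 1 2 && kubectl scale statefulsets tdengine --replicas=1
```

With the values from this example, the check fails and the kubectl scale command is never run.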