Deploying a TDengine Cluster in Kubernetes
TDengine is a cloud-native time-series database that can be deployed on Kubernetes. This document gives a step-by-step description of how you can use YAML files to create a TDengine cluster and introduces common operations for TDengine in a Kubernetes environment.
Prerequisites
Before deploying TDengine on Kubernetes, perform the following:
- Current steps are compatible with Kubernetes v1.5 and later version.
- Install and configure minikube, kubectl, and helm.
- Install and deploy Kubernetes and ensure that it can be accessed and used normally. Update any container registries or other services as necessary.
You can download the configuration files in this document from GitHub.
Configure the service
Create a service configuration file named taosd-service.yaml. Record the value of metadata.name (in this example, taos) for use in the next step. Add the ports required by TDengine:
---apiVersion: v1kind: Servicemetadata:name: "taosd"labels:app: "tdengine"spec:ports:- name: tcp6030- protocol: "TCP"port: 6030- name: tcp6041- protocol: "TCP"port: 6041selector:app: "tdengine"
Configure the service as StatefulSet
Configure the TDengine service as a StatefulSet. Create the tdengine.yaml file and set replicas to 3. In this example, the region is set to Asia/Shanghai and 10 GB of standard storage are allocated per node. You can change the configuration based on your environment and business requirements.
---apiVersion: apps/v1kind: StatefulSetmetadata:name: "tdengine"labels:app: "tdengine"spec:serviceName: "taosd"replicas: 3updateStrategy:type: RollingUpdateselector:matchLabels:app: "tdengine"template:metadata:name: "tdengine"labels:app: "tdengine"spec:containers:- name: "tdengine"image: "tdengine/tdengine:3.0.0.0"imagePullPolicy: "IfNotPresent"ports:- name: tcp6030- protocol: "TCP"containerPort: 6030- name: tcp6041- protocol: "TCP"containerPort: 6041env:# POD_NAME for FQDN config- name: POD_NAMEvalueFrom:fieldRef:fieldPath: metadata.name# SERVICE_NAME and NAMESPACE for fqdn resolve- name: SERVICE_NAMEvalue: "taosd"- name: STS_NAMEvalue: "tdengine"- name: STS_NAMESPACEvalueFrom:fieldRef:fieldPath: metadata.namespace# TZ for timezone settings, we recommend to always set it.- name: TZvalue: "Asia/Shanghai"# TAOS_ prefix will configured in taos.cfg, strip prefix and camelCase.- name: TAOS_SERVER_PORTvalue: "6030"# Must set if you want a cluster.- name: TAOS_FIRST_EPvalue: "$(STS_NAME)-0.$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local:$(TAOS_SERVER_PORT)"# TAOS_FQDN should always be set in k8s env.- name: TAOS_FQDNvalue: "$(POD_NAME).$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local"volumeMounts:- name: taosdatamountPath: /var/lib/taosreadinessProbe:exec:command:- taos-checkinitialDelaySeconds: 5timeoutSeconds: 5000livenessProbe:exec:command:- taos-checkinitialDelaySeconds: 15periodSeconds: 20volumeClaimTemplates:- metadata:name: taosdataspec:accessModes:- "ReadWriteOnce"storageClassName: "standard"resources:requests:storage: "10Gi"
Use kubectl to deploy TDengine
Run the following commands:
kubectl apply -f taosd-service.yamlkubectl apply -f tdengine.yaml
The preceding configuration generates a TDengine cluster with three nodes in which dnodes are automatically configured. You can run the show dnodes command to query the nodes in the cluster:
kubectl exec -i -t tdengine-0 -- taos -s "show dnodes"kubectl exec -i -t tdengine-1 -- taos -s "show dnodes"kubectl exec -i -t tdengine-2 -- taos -s "show dnodes"
The output is as follows:
taos> show dnodesid | endpoint | vnodes | support_vnodes | status | create_time | note |============================================================================================================================================1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-08-10 13:14:57.285 | |2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-08-10 13:15:11.302 | |3 | tdengine-2.taosd.default.sv... | 0 | 256 | ready | 2022-08-10 13:15:23.290 | |Query OK, 3 rows in database (0.003655s)
Enable port forwarding
The kubectl port forwarding feature allows applications to access the TDengine cluster running on Kubernetes.
kubectl port-forward tdengine-0 6041:6041 &
Use curl to verify that the TDengine REST API is working on port 6041:
$ curl -u root:taosdata -d "show databases" 127.0.0.1:6041/rest/sqlHandling connection for 6041{"code":0,"column_meta":[["name","VARCHAR",64],["create_time","TIMESTAMP",8],["vgroups","SMALLINT",2],["ntables","BIGINT",8],["replica","TINYINT",1],["strict","VARCHAR",4],["duration","VARCHAR",10],["keep","VARCHAR",32],["buffer","INT",4],["pagesize","INT",4],["pages","INT",4],["minrows","INT",4],["maxrows","INT",4],["comp","TINYINT",1],["precision","VARCHAR",2],["status","VARCHAR",10],["retention","VARCHAR",60],["single_stable","BOOL",1],["cachemodel","VARCHAR",11],["cachesize","INT",4],["wal_level","TINYINT",1],["wal_fsync_period","INT",4],["wal_retention_period","INT",4],["wal_retention_size","BIGINT",8],["wal_roll_period","INT",4],["wal_segment_size","BIGINT",8]],"data":[["information_schema",null,null,16,null,null,null,null,null,null,null,null,null,null,null,"ready",null,null,null,null,null,null,null,null,null,null],["performance_schema",null,null,10,null,null,null,null,null,null,null,null,null,null,null,"ready",null,null,null,null,null,null,null,null,null,null]],"rows":2}
Enable the dashboard for visualization
The minikube dashboard command enables visualized cluster management.
$ minikube dashboard* Verifying dashboard health ...* Launching proxy ...* Verifying proxy health ...* Opening http://127.0.0.1:46617/api/v1/namespaces/kubernetes-dashboard/services/http:kubernetes-dashboard:/proxy/ in your default browser...http://127.0.0.1:46617/api/v1/namespaces/kubernetes-dashboard/services/http:kubernetes-dashboard:/proxy/
In some public clouds, minikube cannot be remotely accessed if it is bound to 127.0.0.1. In this case, use the kubectl proxy command to map the port to 0.0.0.0. Then, you can access the dashboard by using a web browser to open the dashboard URL above on the public IP address and port of the virtual machine.
$ kubectl proxy --accept-hosts='^.*$' --address='0.0.0.0'
Scaling Out Your Cluster
TDengine clusters can scale automatically:
kubectl scale statefulsets tdengine --replicas=4
The preceding command increases the number of replicas to 4. After running this command, query the pod status:
kubectl get pods -l app=tdengine
The output is as follows:
NAME READY STATUS RESTARTS AGEtdengine-0 1/1 Running 0 161mtdengine-1 1/1 Running 0 161mtdengine-2 1/1 Running 0 32mtdengine-3 1/1 Running 0 32m
The status of all pods is Running. Once the pod status changes to Ready, you can check the dnode status:
kubectl exec -i -t tdengine-3 -- taos -s "show dnodes"
The following output shows that the TDengine cluster has been expanded to 4 replicas:
taos> show dnodesid | endpoint | vnodes | support_vnodes | status | create_time | note |============================================================================================================================================1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-08-10 13:14:57.285 | |2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-08-10 13:15:11.302 | |3 | tdengine-2.taosd.default.sv... | 0 | 256 | ready | 2022-08-10 13:15:23.290 | |4 | tdengine-3.taosd.default.sv... | 0 | 256 | ready | 2022-08-10 13:33:16.039 | |Query OK, 4 rows in database (0.008377s)
Scaling In Your Cluster
When you scale in a TDengine cluster, your data is migrated to different nodes. You must run the drop dnodes command in TDengine to remove dnodes before scaling in your Kubernetes environment.
Note: In a Kubernetes StatefulSet service, the newest pods are always removed first. For this reason, when you scale in your TDengine cluster, ensure that you drop the newest dnodes.
$ kubectl exec -i -t tdengine-0 -- taos -s "drop dnode 4"
$ kubectl exec -it tdengine-0 -- taos -s "show dnodes"taos> show dnodesid | endpoint | vnodes | support_vnodes | status | create_time | note |============================================================================================================================================1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-08-10 13:14:57.285 | |2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-08-10 13:15:11.302 | |3 | tdengine-2.taosd.default.sv... | 0 | 256 | ready | 2022-08-10 13:15:23.290 | |Query OK, 3 rows in database (0.004861s)
Verify that the dnode have been successfully removed by running the kubectl exec -i -t tdengine-0 -- taos -s "show dnodes" command. Then run the following command to remove the pod:
kubectl scale statefulsets tdengine --replicas=3
The newest pod in the deployment is removed. Run the kubectl get pods -l app=tdengine command to query the pod status:
$ kubectl get pods -l app=tdengineNAME READY STATUS RESTARTS AGEtdengine-0 1/1 Running 0 4m7stdengine-1 1/1 Running 0 3m55stdengine-2 1/1 Running 0 2m28s
After the pod has been removed, manually delete the PersistentVolumeClaim (PVC). Otherwise, future scale-outs will attempt to use existing data.
$ kubectl delete pvc taosdata-tdengine-3
Your cluster has now been safely scaled in, and you can scale it out again as necessary.
$ kubectl scale statefulsets tdengine --replicas=4statefulset.apps/tdengine scaledit@k8s-2:~/TDengine-Operator/src/tdengine$ kubectl get pods -l app=tdengineNAME READY STATUS RESTARTS AGEtdengine-0 1/1 Running 0 35mtdengine-1 1/1 Running 0 34mtdengine-2 1/1 Running 0 12mtdengine-3 0/1 ContainerCreating 0 4sit@k8s-2:~/TDengine-Operator/src/tdengine$ kubectl get pods -l app=tdengineNAME READY STATUS RESTARTS AGEtdengine-0 1/1 Running 0 35mtdengine-1 1/1 Running 0 34mtdengine-2 1/1 Running 0 12mtdengine-3 0/1 Running 0 7sit@k8s-2:~/TDengine-Operator/src/tdengine$ kubectl exec -it tdengine-0 -- taos -s "show dnodes"taos> show dnodesid | endpoint | vnodes | support_vnodes | status | create_time | offline reason |======================================================================================================================================1 | tdengine-0.taosd.default.sv... | 0 | 4 | ready | 2022-07-25 17:38:49.012 | |2 | tdengine-1.taosd.default.sv... | 1 | 4 | ready | 2022-07-25 17:39:01.517 | |5 | tdengine-2.taosd.default.sv... | 0 | 4 | ready | 2022-07-25 18:01:36.479 | |6 | tdengine-3.taosd.default.sv... | 0 | 4 | ready | 2022-07-25 18:13:54.411 | |Query OK, 4 row(s) in set (0.001348s)
Remove a TDengine Cluster
To fully remove a TDengine cluster, you must delete its statefulset, svc, configmap, and pvc entries:
kubectl delete statefulset -l app=tdenginekubectl delete svc -l app=tdenginekubectl delete pvc -l app=tdenginekubectl delete configmap taoscfg
Troubleshooting
Error 1
If you remove a pod without first running drop dnode, some TDengine nodes will go offline.
$ kubectl exec -it tdengine-0 -- taos -s "show dnodes"taos> show dnodesid | endpoint | vnodes | support_vnodes | status | create_time | offline reason |======================================================================================================================================1 | tdengine-0.taosd.default.sv... | 0 | 4 | ready | 2022-07-25 17:38:49.012 | |2 | tdengine-1.taosd.default.sv... | 1 | 4 | ready | 2022-07-25 17:39:01.517 | |5 | tdengine-2.taosd.default.sv... | 0 | 4 | offline | 2022-07-25 18:01:36.479 | status msg timeout |6 | tdengine-3.taosd.default.sv... | 0 | 4 | offline | 2022-07-25 18:13:54.411 | status msg timeout |Query OK, 4 row(s) in set (0.001323s)
Error 2
If the number of nodes after a scale-in is less than the value of the replica parameter, the cluster will go down:
Create a database with replica set to 2 and add data.
kubectl exec -i -t tdengine-0 -- \taos -s \"create database if not exists test replica 2;use test;create table if not exists t1(ts timestamp, n int);insert into t1 values(now, 1)(now+1s, 2);"
Scale in to one node:
kubectl scale statefulsets tdengine --replicas=1
In the TDengine CLI, you can see that no database operations succeed:
taos> show dnodes;id | end_point | vnodes | cores | status | role | create_time | offline reason |======================================================================================================================================1 | tdengine-0.taosd.default.sv... | 2 | 40 | ready | any | 2021-06-01 15:55:52.562 | |2 | tdengine-1.taosd.default.sv... | 1 | 40 | offline | any | 2021-06-01 15:56:07.212 | status msg timeout |Query OK, 2 row(s) in set (0.000845s)taos> show dnodes;id | end_point | vnodes | cores | status | role | create_time | offline reason |======================================================================================================================================1 | tdengine-0.taosd.default.sv... | 2 | 40 | ready | any | 2021-06-01 15:55:52.562 | |2 | tdengine-1.taosd.default.sv... | 1 | 40 | offline | any | 2021-06-01 15:56:07.212 | status msg timeout |Query OK, 2 row(s) in set (0.000837s)taos> use test;Database changed.taos> insert into t1 values(now, 3);DB error: Unable to resolve FQDN (0.013874s)