In Kubernetes, a Service is a resource that groups a set of pods and provides a stable access point to them. When a cluster is deployed through Doris-Operator, the corresponding Service resources are generated automatically according to the spec.*Spec.service configuration. ClusterIP, LoadBalancer and NodePort modes are currently supported, covering users’ access needs in different scenarios.

Access within the Kubernetes cluster

ClusterIP mode

By default, Doris on Kubernetes is accessible inside the cluster through ClusterIP Services. Doris-Operator creates a Service for each of the FE and BE components, which users can use on demand. The Services are named {clusterName}-{fe/be}-service. Use the following command to view the Service of each component:

    $ kubectl -n {namespace} get service

Replace {namespace} with the namespace specified during deployment. Take the default sample Doris cluster as an example:

    apiVersion: doris.selectdb.com/v1
    kind: DorisCluster
    metadata:
      labels:
        app.kubernetes.io/name: doriscluster
        app.kubernetes.io/instance: doriscluster-sample
        app.kubernetes.io/part-of: doris-operator
      name: doriscluster-sample
    spec:
      feSpec:
        replicas: 3
        limits:
          cpu: 6
          memory: 12Gi
        requests:
          cpu: 6
          memory: 12Gi
        image: selectdb/doris.fe-ubuntu:2.0.2
      beSpec:
        replicas: 3
        limits:
          cpu: 8
          memory: 16Gi
        requests:
          cpu: 8
          memory: 16Gi
        image: selectdb/doris.be-ubuntu:2.0.2

The kubectl get service command shows the following:

    $ kubectl get service
    NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                               AGE
    doriscluster-sample-be-internal   ClusterIP   None             <none>        9050/TCP                              12h
    doriscluster-sample-be-service    ClusterIP   172.20.217.234   <none>        9060/TCP,8040/TCP,9050/TCP,8060/TCP   12h
    doriscluster-sample-fe-internal   ClusterIP   None             <none>        9030/TCP                              12h
    doriscluster-sample-fe-service    ClusterIP   172.20.183.136   <none>        8030/TCP,9020/TCP,9030/TCP,9010/TCP   12h

The output lists two kinds of Service for each of FE and BE. A Service with the suffix -internal is used by Doris for internal communication and is not available externally. A Service with the suffix -service is for users. In this example, the CLUSTER-IP of doriscluster-sample-fe-service and its PORT(S) can be used inside the K8s cluster to access the services on FE’s different ports, and doriscluster-sample-be-service with its PORT(S) can be used to access BE’s services.
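The naming rule makes the in-cluster address of a component predictable. The following is a minimal Python sketch, not part of Doris-Operator: the helper names are hypothetical, the namespace doris is assumed for illustration, and the DNS name assumes the default Kubernetes cluster domain cluster.local.

```python
def service_name(cluster_name: str, component: str) -> str:
    """Return the user-facing Service name: {clusterName}-{fe/be}-service."""
    if component not in ("fe", "be"):
        raise ValueError("component must be 'fe' or 'be'")
    return f"{cluster_name}-{component}-service"


def in_cluster_host(cluster_name: str, component: str, namespace: str) -> str:
    """Return the Service DNS name, assuming the default 'cluster.local' domain."""
    name = service_name(cluster_name, component)
    return f"{name}.{namespace}.svc.cluster.local"


if __name__ == "__main__":
    # "doris" is an assumed namespace; use the one chosen at deployment.
    host = in_cluster_host("doriscluster-sample", "fe", "doris")
    # 9030 is the FE port used for mysql connections in this document.
    print(f"mysql -h {host} -P 9030 -uroot")
```

From a pod inside the cluster, either this DNS name or the CLUSTER-IP shown by kubectl get service can serve as the host.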

Access outside the Kubernetes cluster

LoadBalancer Mode

If the cluster is created on a cloud platform, LoadBalancer mode is recommended for accessing the FE and BE services in the cluster. ClusterIP mode is used by default. To use LoadBalancer mode, add the following configuration to the spec of each component:

    service:
      type: LoadBalancer

Using the default configuration as a blueprint, we set LoadBalancer as the access mode of FE and BE on the cloud platform. The deployment configuration is as follows:

    apiVersion: doris.selectdb.com/v1
    kind: DorisCluster
    metadata:
      labels:
        app.kubernetes.io/name: doriscluster
        app.kubernetes.io/instance: doriscluster-sample
        app.kubernetes.io/part-of: doris-operator
      name: doriscluster-sample
    spec:
      feSpec:
        replicas: 3
        service:
          type: LoadBalancer
        limits:
          cpu: 6
          memory: 12Gi
        requests:
          cpu: 6
          memory: 12Gi
        image: selectdb/doris.fe-ubuntu:2.0.2
      beSpec:
        replicas: 3
        service:
          type: LoadBalancer
        limits:
          cpu: 8
          memory: 16Gi
        requests:
          cpu: 8
          memory: 16Gi
        image: selectdb/doris.be-ubuntu:2.0.2

The kubectl get service command then shows the corresponding Services as follows:

    $ kubectl get service
    NAME                              TYPE           CLUSTER-IP       EXTERNAL-IP                                                               PORT(S)                                                       AGE
    doriscluster-sample-be-internal   ClusterIP      None             <none>                                                                    9050/TCP                                                      14h
    doriscluster-sample-be-service    LoadBalancer   172.20.217.234   a46bbcd6998c7436bab8ee8fba9f5e81-808549982.us-east-1.elb.amazonaws.com    9060:32060/TCP,8040:30615/TCP,9050:31742/TCP,8060:31127/TCP   14h
    doriscluster-sample-fe-internal   ClusterIP      None             <none>                                                                    9030/TCP                                                      14h
    doriscluster-sample-fe-service    LoadBalancer   172.20.183.136   ac48284932b044251bfac389b453118f-1412731848.us-east-1.elb.amazonaws.com   8030:32213/TCP,9020:31080/TCP,9030:31433/TCP,9010:30585/TCP   14h

The EXTERNAL-IP and the Service ports (the number before the colon in PORT(S)) can be used outside K8s to access the components’ services inside K8s. For example, to connect a mysql client to FE’s port 9030, use the following command:

    $ mysql -h ac48284932b044251bfac389b453118f-1412731848.us-east-1.elb.amazonaws.com -P 9030 -uroot

NodePort Mode

In a private network environment, NodePort mode is recommended for accessing services inside K8s from outside. Using the default configuration as a blueprint, configure NodePort access in the private network as follows:

    apiVersion: doris.selectdb.com/v1
    kind: DorisCluster
    metadata:
      labels:
        app.kubernetes.io/name: doriscluster
        app.kubernetes.io/instance: doriscluster-sample
        app.kubernetes.io/part-of: doris-operator
      name: doriscluster-sample
    spec:
      feSpec:
        replicas: 3
        service:
          type: NodePort
        limits:
          cpu: 6
          memory: 12Gi
        requests:
          cpu: 6
          memory: 12Gi
        image: selectdb/doris.fe-ubuntu:2.0.2
      beSpec:
        replicas: 3
        service:
          type: NodePort
        limits:
          cpu: 8
          memory: 16Gi
        requests:
          cpu: 8
          memory: 16Gi
        image: selectdb/doris.be-ubuntu:2.0.2

After deployment, view the corresponding Services with the kubectl get service command:

    $ kubectl get service
    NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                                       AGE
    kubernetes                        ClusterIP   10.152.183.1     <none>        443/TCP                                                       169d
    doriscluster-sample-fe-internal   ClusterIP   None             <none>        9030/TCP                                                      2d
    doriscluster-sample-fe-service    NodePort    10.152.183.58    <none>        8030:31041/TCP,9020:30783/TCP,9030:31545/TCP,9010:31610/TCP   2d
    doriscluster-sample-be-internal   ClusterIP   None             <none>        9050/TCP                                                      2d
    doriscluster-sample-be-service    NodePort    10.152.183.244   <none>        9060:30940/TCP,8040:32713/TCP,9050:30621/TCP,8060:30926/TCP   2d

The output above shows the ports that can be used outside K8s. Use the following command to list the hosts managed by K8s:

    $ kubectl get nodes -owide
    NAME             STATUS   ROLES    AGE    VERSION                     INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION       CONTAINER-RUNTIME
    vm-10-7-centos   Ready    <none>   88d    v1.23.17-2+40cc20cc310518   10.16.10.7    <none>        TencentOS Server 3.1 (Final)   5.4.119-19.0009.25   containerd://1.5.13
    vm-10-8-centos   Ready    <none>   169d   v1.23.17-2+40cc20cc310518   10.16.10.8    <none>        TencentOS Server 3.1 (Final)   5.4.119-19-0009.3    containerd://1.5.13

In a private network environment, use a K8s host’s IP and the mapped port to access services inside K8s. For example, use a host’s IP and the port mapped to FE’s 9030 (31545) to connect with mysql:

    $ mysql -h 10.16.10.8 -P 31545 -uroot
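Picking the mapped port out of the PORT(S) column by eye is error-prone when a Service exposes several ports. As a minimal sketch (the helper name parse_port_mappings is hypothetical, not part of kubectl or Doris-Operator), the column can be parsed into a lookup table in Python:

```python
from typing import Dict


def parse_port_mappings(ports_column: str) -> Dict[int, int]:
    """Map each Service port to its NodePort from a PORT(S) value.

    Entries look like "9030:31545/TCP". ClusterIP entries such as
    "9030/TCP" carry no NodePort and are skipped.
    """
    mappings: Dict[int, int] = {}
    for entry in ports_column.split(","):
        spec, _, _protocol = entry.strip().partition("/")
        port, separator, node_port = spec.partition(":")
        if separator:
            mappings[int(port)] = int(node_port)
    return mappings


if __name__ == "__main__":
    # PORT(S) value of doriscluster-sample-fe-service from the output above.
    fe_ports = "8030:31041/TCP,9020:30783/TCP,9030:31545/TCP,9010:31610/TCP"
    node_port = parse_port_mappings(fe_ports)[9030]
    print(f"mysql -h 10.16.10.8 -P {node_port} -uroot")
```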

In addition, you can specify the nodePort yourself according to your platform’s needs. The port must fall within the range configured for the Kubernetes cluster (default: 30000-32767), and every Node proxies that port (the same port on each Node) to the Service. If no nodePort is specified, as in the example above, Kubernetes automatically allocates a random port from that range.

    apiVersion: doris.selectdb.com/v1
    kind: DorisCluster
    metadata:
      labels:
        app.kubernetes.io/name: doriscluster
        app.kubernetes.io/instance: doriscluster-sample
        app.kubernetes.io/part-of: doris-operator
      name: doriscluster-sample
    spec:
      feSpec:
        replicas: 3
        service:
          type: NodePort
          servicePorts:
          - nodePort: 31001
            targetPort: 8030
          - nodePort: 31002
            targetPort: 9020
          - nodePort: 31003
            targetPort: 9030
          - nodePort: 31004
            targetPort: 9010
        limits:
          cpu: 6
          memory: 12Gi
        requests:
          cpu: 6
          memory: 12Gi
        image: selectdb/doris.fe-ubuntu:2.0.2
      beSpec:
        replicas: 3
        service:
          type: NodePort
          servicePorts:
          - nodePort: 31005
            targetPort: 9060
          - nodePort: 31006
            targetPort: 8040
          - nodePort: 31007
            targetPort: 9050
          - nodePort: 31008
            targetPort: 8060
        limits:
          cpu: 8
          memory: 16Gi
        requests:
          cpu: 8
          memory: 16Gi
        image: selectdb/doris.be-ubuntu:2.0.2
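Before applying a configuration with fixed nodePort values, it can help to check them against the allowed range and against each other, since a NodePort can only be held by one Service port in a cluster. The following Python sketch is only an illustration: check_node_ports is a hypothetical helper, and the range is the Kubernetes default, which a cluster administrator may have changed.

```python
from typing import Iterable, List

# Kubernetes default NodePort range; a cluster may be configured differently.
NODE_PORT_RANGE = range(30000, 32768)


def check_node_ports(node_ports: Iterable[int]) -> List[str]:
    """Return a list of problems found in the chosen nodePort values."""
    problems: List[str] = []
    seen = set()
    for port in node_ports:
        if port not in NODE_PORT_RANGE:
            problems.append(f"{port} is outside 30000-32767")
        if port in seen:
            problems.append(f"{port} is used more than once")
        seen.add(port)
    return problems


if __name__ == "__main__":
    # nodePort values of FE and BE from the configuration above.
    chosen = [31001, 31002, 31003, 31004, 31005, 31006, 31007, 31008]
    print(check_node_ports(chosen) or "all nodePort values look valid")
```

This check does not detect ports already taken by other Services in the cluster; Kubernetes rejects those when the Service is created.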

After deployment, view the corresponding Services with the kubectl get service command. The access method is the same as described above:

    $ kubectl get service
    NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                                       AGE
    kubernetes                        ClusterIP   10.152.183.1     <none>        443/TCP                                                       169d
    doriscluster-sample-fe-internal   ClusterIP   None             <none>        9030/TCP                                                      2d
    doriscluster-sample-fe-service    NodePort    10.152.183.67    <none>        8030:31001/TCP,9020:31002/TCP,9030:31003/TCP,9010:31004/TCP   2d
    doriscluster-sample-be-internal   ClusterIP   None             <none>        9050/TCP                                                      2d
    doriscluster-sample-be-service    NodePort    10.152.183.24    <none>        9060:31005/TCP,8040:31006/TCP,9050:31007/TCP,8060:31008/TCP   2d

Doris data exchange

Stream load

Stream load is a synchronous import method. Users import local files or data streams into Doris by sending requests over HTTP. In a regular deployment, users generally submit the request to FE, and FE forwards it to a BE through an HTTP redirect. In a Kubernetes-based deployment, however, it is recommended to submit the import request directly to BE’s Service, which load-balances it to a BE pod according to Kubernetes rules. The two approaches have the same effect. When Flink or Spark uses the official connector, the write request can also be submitted to the BE Service.
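As a rough illustration of submitting directly to the BE Service, the sketch below builds a Stream load request with Python’s standard library and does not send it. Several things here are assumptions: the database demo_db, table demo_table and label demo_label are hypothetical, root with an empty password matches the -uroot examples in this document, the Service name is resolved from inside the cluster, and port 8040 is taken from the BE Service listing above.

```python
import base64
import urllib.request
from typing import Optional


def build_stream_load_request(be_host: str, db: str, table: str, data: bytes,
                              user: str = "root", password: str = "",
                              label: Optional[str] = None,
                              port: int = 8040) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT Stream load request for a BE address."""
    url = f"http://{be_host}:{port}/api/{db}/{table}/_stream_load"
    request = urllib.request.Request(url, data=data, method="PUT")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Expect", "100-continue")
    request.add_header("column_separator", ",")
    if label is not None:
        request.add_header("label", label)
    return request


if __name__ == "__main__":
    # Hypothetical database, table and label names; replace with real ones.
    req = build_stream_load_request("doriscluster-sample-be-service",
                                    "demo_db", "demo_table",
                                    b"1,alice\n2,bob\n", label="demo_label")
    print(req.get_method(), req.full_url)
```

Sending the request with urllib.request.urlopen(req) is left out on purpose; the response handling depends on the Doris version and should be checked against the Stream load documentation.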

ErrorURL

When import methods such as Stream load and Routine load encounter errors such as an incorrect data format, they print an errorURL or tracking_url in the returned structure or in the log. Visiting this link shows the cause of the import error. However, in a cluster deployed on Kubernetes, this URL is only accessible from inside the container of the specific BE node.

The following scenario takes the errorURL returned by Doris as an example: http://doriscluster-sample-be-2.doriscluster-sample-be-internal.doris.svc.cluster.local:8040/api/_load_error_log?file=__shard_1/error_log_insert_stmt_af474190276a2e9c-49bb9d175b8e968e_af474190276a2e9c_49bb9d175b8e968e

1. Access from inside the Kubernetes cluster

Obtain the address of BE’s Service or pod through the kubectl get service or kubectl get pod -o wide command, replace the domain name and port in the original URL, and access it again.

For example:

    $ kubectl get pod -o wide
    NAME                       READY   STATUS    RESTARTS   AGE   IP           NODE                     NOMINATED NODE   READINESS GATES
    doriscluster-sample-be-0   1/1     Running   0          9h    10.0.2.105   10-0-2-47.ec2.internal   <none>           <none>
    doriscluster-sample-be-1   1/1     Running   0          9h    10.0.2.104   10-0-2-5.ec2.internal    <none>           <none>
    doriscluster-sample-be-2   1/1     Running   0          9h    10.0.2.103   10-0-2-6.ec2.internal    <none>           <none>
    doriscluster-sample-fe-0   1/1     Running   0          9h    10.0.2.102   10-0-2-47.ec2.internal   <none>           <none>
    doriscluster-sample-fe-1   1/1     Running   0          9h    10.0.2.101   10-0-2-5.ec2.internal    <none>           <none>
    doriscluster-sample-fe-2   1/1     Running   0          9h    10.0.2.100   10-0-2-6.ec2.internal    <none>           <none>

The errorURL above points to doriscluster-sample-be-2, whose pod IP is 10.0.2.103, so it becomes: http://10.0.2.103:8040/api/_load_error_log?file=__shard_1/error_log_insert_stmt_af474190276a2e9c-49bb9d175b8e968e_af474190276a2e9c_49bb9d175b8e968e
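The replacement can also be done programmatically. The sketch below uses Python’s urllib.parse; rewrite_error_url is a hypothetical helper written for this illustration, not a Doris or kubectl feature.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def rewrite_error_url(error_url: str, host: str, port: Optional[int] = None) -> str:
    """Replace the host (and optionally the port) of an errorURL.

    The path and query string are kept unchanged. When no port is
    given, the port of the original URL is reused.
    """
    parts = urlsplit(error_url)
    new_port = port if port is not None else parts.port
    netloc = host if new_port is None else f"{host}:{new_port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    error_url = (
        "http://doriscluster-sample-be-2.doriscluster-sample-be-internal"
        ".doris.svc.cluster.local:8040/api/_load_error_log"
        "?file=__shard_1/error_log_insert_stmt_af474190276a2e9c-"
        "49bb9d175b8e968e_af474190276a2e9c_49bb9d175b8e968e"
    )
    # 10.0.2.103 is the IP of doriscluster-sample-be-2 in the pod list above.
    print(rewrite_error_url(error_url, "10.0.2.103"))
```

Passing a port as the third argument covers the cases where the port changes as well, for example a NodePort.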

2. Access from outside the Kubernetes cluster in NodePort mode

Obtaining error report details from outside Kubernetes requires an additional bridge. If Doris was deployed with NodePort mode Services, create a new Service to obtain the error report details. Prepare the Service template be_streamload_errror_service.yaml:

    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app.doris.service/role: debug
        app.kubernetes.io/component: be
      name: doriscluster-detail-error
    spec:
      externalTrafficPolicy: Cluster
      internalTrafficPolicy: Cluster
      ipFamilies:
      - IPv4
      ipFamilyPolicy: SingleStack
      ports:
      - name: webserver-port
        port: 8040
        protocol: TCP
        targetPort: 8040
      selector:
        app.kubernetes.io/component: be
        statefulset.kubernetes.io/pod-name: ${podName}
      sessionAffinity: None
      type: NodePort

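Replace ${podName} with the first label of the host name in the errorURL (doriscluster-sample-be-2) and deploy the Service with kubectl apply -n doris -f be_streamload_errror_service.yaml. Then use the following command to view the mapping of the Service’s port 8040 on the host machine: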

    $ kubectl get service -n doris doriscluster-detail-error
    NAME                        TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
    doriscluster-detail-error   NodePort   10.152.183.35   <none>        8040:31201/TCP   32s

Use the IP of any host and the NodePort (31201) obtained above in place of the original host and port in the errorURL to obtain the detailed error report. Note that the host IP is a node address, not the CLUSTER-IP of the Service. With host 10.16.10.8 from the kubectl get nodes output above, the errorURL becomes: http://10.16.10.8:31201/api/_load_error_log?file=__shard_1/error_log_insert_stmt_af474190276a2e9c-49bb9d175b8e968e_af474190276a2e9c_49bb9d175b8e968e

3. Access from outside the Kubernetes cluster in LoadBalancer mode

Obtaining error report details from outside Kubernetes requires an additional bridge. If Doris was deployed with LoadBalancer mode Services, create a new Service to obtain the error report details. Prepare the Service template be_streamload_errror_service.yaml:

    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app.doris.service/role: debug
        app.kubernetes.io/component: be
      name: doriscluster-detail-error
    spec:
      externalTrafficPolicy: Cluster
      internalTrafficPolicy: Cluster
      ipFamilies:
      - IPv4
      ipFamilyPolicy: SingleStack
      ports:
      - name: webserver-port
        port: 8040
        protocol: TCP
        targetPort: 8040
      selector:
        app.kubernetes.io/component: be
        statefulset.kubernetes.io/pod-name: ${podName}
      sessionAffinity: None
      type: LoadBalancer

Replace ${podName} with the first (leftmost) label of the host name in the errorURL: doriscluster-sample-be-2.
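The pod name can be read off the errorURL mechanically. The following Python sketch shows one way to extract it and fill in the ${podName} placeholder; both helper names are hypothetical, and the sketch uses plain string replacement rather than any templating feature of kubectl.

```python
from urllib.parse import urlsplit


def pod_name_from_error_url(error_url: str) -> str:
    """Return the first DNS label of the errorURL host, which is the BE pod name."""
    hostname = urlsplit(error_url).hostname or ""
    return hostname.split(".", 1)[0]


def render_template(template: str, pod_name: str) -> str:
    """Fill the ${podName} placeholder of the Service template."""
    return template.replace("${podName}", pod_name)


if __name__ == "__main__":
    error_url = (
        "http://doriscluster-sample-be-2.doriscluster-sample-be-internal"
        ".doris.svc.cluster.local:8040/api/_load_error_log"
        "?file=__shard_1/error_log_insert_stmt_af474190276a2e9c-"
        "49bb9d175b8e968e_af474190276a2e9c_49bb9d175b8e968e"
    )
    selector_line = "statefulset.kubernetes.io/pod-name: ${podName}"
    print(render_template(selector_line, pod_name_from_error_url(error_url)))
```

In practice the whole content of be_streamload_errror_service.yaml would be read, passed through render_template and written back before it is applied.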

In the namespace where Doris is deployed (generally doris by default; the following operations use doris), use the following command to deploy the Service prepared above:

    $ kubectl apply -n doris -f be_streamload_errror_service.yaml

Use the following command to view the EXTERNAL-IP and the port mapping of the Service:

    $ kubectl get service -n doris doriscluster-detail-error
    NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP                                                               PORT(S)          AGE
    doriscluster-detail-error   LoadBalancer   172.20.183.136   ac4828493dgrftb884g67wg4tb68gyut-1137856348.us-east-1.elb.amazonaws.com   8040:32003/TCP   14s

Use the EXTERNAL-IP and the port (8040) in place of the original host and port in the errorURL to obtain the detailed error report. The errorURL above becomes: http://ac4828493dgrftb884g67wg4tb68gyut-1137856348.us-east-1.elb.amazonaws.com:8040/api/_load_error_log?file=__shard_1/error_log_insert_stmt_af474190276a2e9c-49bb9d175b8e968e_af474190276a2e9c_49bb9d175b8e968e

Note: After using either of the above methods for access from outside the cluster, it is recommended to delete the Service. The reference command is as follows:

    $ kubectl delete service -n doris doriscluster-detail-error