This document describes how to upgrade an Apache Doris cluster deployed with Doris-Operator.

As with conventionally deployed clusters, a Doris cluster deployed by Doris-Operator must be upgraded in a rolling fashion, BE nodes first and then FE nodes. Doris-Operator provides this rolling upgrade capability on top of Kubernetes’ Performing a Rolling Update.

Things to note before upgrading

  • Perform the upgrade during off-peak periods.
  • During the rolling upgrade, connections to a node that is being shut down will fail, and so will the requests that use them. For workloads affected by this, it is recommended to add retry capabilities to the client.
  • Before upgrading, read the General Upgrade Manual to understand the principles and precautions of the upgrade.
  • The compatibility of data and metadata cannot be verified before upgrading. Therefore, avoid upgrading a cluster that holds single-replica data or has only a single FE FOLLOWER node.
  • Nodes are restarted during the upgrade, which may trigger unnecessary cluster balancing and replica repair. Disable both first with the following commands:

    admin set frontend config("disable_balance" = "true");
    admin set frontend config("disable_colocate_balance" = "true");
    admin set frontend config("disable_tablet_scheduler" = "true");

  • When upgrading Doris, do not upgrade across two or more key node versions in one step. To upgrade across multiple key node versions, first upgrade to the nearest key node version, then continue through the later ones in sequence. Non-key node versions can be skipped. For details, refer to Upgrade Version Instructions.
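
The client-side retry recommended above could look roughly like the following. This is a minimal Python sketch, not part of Doris or Doris-Operator: call_with_retry and its parameters are hypothetical names, and the exception types to retry depend on the MySQL driver in use (pass them through retriable). It assumes the wrapped request is safe to repeat.

```python
import random
import time


def call_with_retry(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                    retriable=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Run operation(), retrying on connection-type errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retriable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Exponential backoff capped at max_delay, with jitter so that
            # many clients do not reconnect at the same instant.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
```

A caller would wrap each query in a function and pass it as operation. The sleep parameter exists only so the waiting can be replaced in tests.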

Upgrade operation

The order of node types in the upgrade process is as follows. If a certain type of node does not exist, it will be skipped:

    cn/be -> fe -> broker

It is recommended to modify the image of each cluster component in this order, applying the configuration after each change. Start the rolling upgrade of the next component type only after the current one is fully upgraded and its status has returned to normal.
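
The ordering rule above, including skipping component types that are not deployed, can be sketched as follows. This is an illustrative Python helper, not part of Doris-Operator. The keys beSpec and feSpec come from this document; cnSpec and brokerSpec are assumed by analogy, and treating "cn/be" as a single stage is also an assumption.

```python
# Upgrade stages in the order given above; "cn/be" is treated as one stage.
# cnSpec and brokerSpec are assumed field names, by analogy with the
# spec.beSpec and spec.feSpec fields used in this document.
UPGRADE_STAGES = [("cnSpec", "beSpec"), ("feSpec",), ("brokerSpec",)]


def plan_upgrade(cluster_spec):
    """Return the component spec keys to upgrade, in order, skipping absent ones."""
    plan = []
    for stage in UPGRADE_STAGES:
        plan.extend(key for key in stage if key in cluster_spec)
    return plan
```

Here cluster_spec stands for the spec section of the DorisCluster resource loaded as a dictionary.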

Upgrade BE

If you have kept the cluster’s DorisCluster resource file (the custom resource type defined by Doris-Operator, abbreviated dcr), you can upgrade by modifying that file and running the kubectl apply command.

  1. Modify spec.beSpec.image

    Change selectdb/doris.be-ubuntu:2.0.4 to selectdb/doris.be-ubuntu:2.1.0

    $ vim doriscluster-sample.yaml

  2. Save the changes and apply them to start the upgrade:

    $ kubectl apply -f doriscluster-sample.yaml -n doris
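
For scripted upgrades, the same image change can probably be made without opening an editor by sending a JSON merge patch with kubectl patch. The Python sketch below only builds the command line and does not run it. image_patch_command is a hypothetical helper, and using kubectl patch here is an assumption that is not taken from the Doris-Operator documentation.

```python
import json
import shlex


def image_patch_command(cluster, namespace, spec_key, image):
    """Build a kubectl command that sets spec.<spec_key>.image via a JSON merge patch."""
    patch = {"spec": {spec_key: {"image": image}}}
    args = ["kubectl", "patch", "dcr", cluster, "-n", namespace,
            "--type", "merge", "-p", json.dumps(patch)]
    # shlex.join quotes the JSON payload so the result can be pasted into a shell.
    return shlex.join(args)
```

For example, image_patch_command("doriscluster-sample", "doris", "beSpec", "selectdb/doris.be-ubuntu:2.1.0") yields a command equivalent to the BE edit described above.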

Alternatively, modify the resource directly with kubectl edit dcr.

  1. List the dcr resources in the ‘doris’ namespace to obtain the name of the cluster to update:

    $ kubectl get dcr -n doris
    NAME                  FESTATUS    BESTATUS    CNSTATUS
    doriscluster-sample   available   available

  2. Edit the resource. The change takes effect when you save and exit:

    $ kubectl edit dcr doriscluster-sample -n doris

    In the text editor, find `spec.beSpec.image` and change `selectdb/doris.be-ubuntu:2.0.4` to `selectdb/doris.be-ubuntu:2.1.0`.

  3. View the upgrade progress and results:

    $ kubectl get pod -n doris

When all Pods are rebuilt and enter the Running state, the upgrade is complete.
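
The Running check can be automated by parsing the text printed by kubectl get pod. The sketch below is a hypothetical Python helper that assumes the default column layout (NAME, READY, STATUS, RESTARTS, AGE). It confirms only that every listed Pod is Running and fully ready, not that the Pod was actually rebuilt with the new image.

```python
def all_pods_running(kubectl_output):
    """Return True if every pod listed by `kubectl get pod` is Running and fully ready."""
    lines = [line for line in kubectl_output.strip().splitlines() if line.strip()]
    if len(lines) < 2:
        return False  # header only (or nothing): no pods to confirm
    header = lines[0].split()
    ready_col, status_col = header.index("READY"), header.index("STATUS")
    for line in lines[1:]:
        fields = line.split()
        ready, total = fields[ready_col].split("/")
        if fields[status_col] != "Running" or ready != total:
            return False
    return True
```

A script could poll this function on the output of kubectl get pod -n doris before moving on to the next component type.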

Upgrade FE

If you have kept the cluster’s DorisCluster resource file (the custom resource type defined by Doris-Operator, abbreviated dcr), you can upgrade by modifying that file and running the kubectl apply command.

  1. Modify spec.feSpec.image

    Change selectdb/doris.fe-ubuntu:2.0.4 to selectdb/doris.fe-ubuntu:2.1.0

    $ vim doriscluster-sample.yaml

  2. Save the changes and apply them to start the upgrade:

    $ kubectl apply -f doriscluster-sample.yaml -n doris

Alternatively, modify the resource directly with kubectl edit dcr.

  1. Edit the resource. The change takes effect when you save and exit:

    $ kubectl edit dcr doriscluster-sample -n doris

    In the text editor, find `spec.feSpec.image` and change `selectdb/doris.fe-ubuntu:2.0.4` to `selectdb/doris.fe-ubuntu:2.1.0`.

  2. View the upgrade progress and results:

    $ kubectl get pod -n doris

When all Pods are rebuilt and enter the Running state, the upgrade is complete.

After the upgrade is completed

Verify cluster node status

Connect to Doris with mysql-client, following the method described in the Access Doris Cluster document. Then use SQL statements such as show frontends and show backends to view the version and status of each component.

    mysql> show frontends\G;
    *************************** 1. row ***************************
    Name: fe_13c132aa_3281_4f4f_97e8_655d01287425
    Host: doriscluster-sample-fe-0.doriscluster-sample-fe-internal.doris.svc.cluster.local
    EditLogPort: 9010
    HttpPort: 8030
    QueryPort: 9030
    RpcPort: 9020
    ArrowFlightSqlPort: -1
    Role: FOLLOWER
    IsMaster: false
    ClusterId: 1779160761
    Join: true
    Alive: true
    ReplayedJournalId: 2422
    LastStartTime: 2024-02-19 06:38:47
    LastHeartbeat: 2024-02-19 09:31:33
    IsHelper: true
    ErrMsg:
    Version: doris-2.1.0
    CurrentConnected: Yes
    *************************** 2. row ***************************
    Name: fe_f1a9d008_d110_4780_8e60_13d392faa54e
    Host: doriscluster-sample-fe-2.doriscluster-sample-fe-internal.doris.svc.cluster.local
    EditLogPort: 9010
    HttpPort: 8030
    QueryPort: 9030
    RpcPort: 9020
    ArrowFlightSqlPort: -1
    Role: FOLLOWER
    IsMaster: true
    ClusterId: 1779160761
    Join: true
    Alive: true
    ReplayedJournalId: 2423
    LastStartTime: 2024-02-19 06:37:35
    LastHeartbeat: 2024-02-19 09:31:33
    IsHelper: true
    ErrMsg:
    Version: doris-2.1.0
    CurrentConnected: No
    *************************** 3. row ***************************
    Name: fe_e42bf9da_006f_4302_b861_770d2c955a47
    Host: doriscluster-sample-fe-1.doriscluster-sample-fe-internal.doris.svc.cluster.local
    EditLogPort: 9010
    HttpPort: 8030
    QueryPort: 9030
    RpcPort: 9020
    ArrowFlightSqlPort: -1
    Role: FOLLOWER
    IsMaster: false
    ClusterId: 1779160761
    Join: true
    Alive: true
    ReplayedJournalId: 2422
    LastStartTime: 2024-02-19 06:38:17
    LastHeartbeat: 2024-02-19 09:31:33
    IsHelper: true
    ErrMsg:
    Version: doris-2.1.0
    CurrentConnected: No
    3 rows in set (0.02 sec)

If the Alive status of the FE node is true and the Version value is the new version, the FE node is upgraded successfully.

    mysql> show backends\G;
    *************************** 1. row ***************************
    BackendId: 10002
    Host: doriscluster-sample-be-0.doriscluster-sample-be-internal.doris.svc.cluster.local
    HeartbeatPort: 9050
    BePort: 9060
    HttpPort: 8040
    BrpcPort: 8060
    ArrowFlightSqlPort: -1
    LastStartTime: 2024-02-19 06:37:56
    LastHeartbeat: 2024-02-19 09:32:43
    Alive: true
    SystemDecommissioned: false
    TabletNum: 14
    DataUsedCapacity: 0.000
    TrashUsedCapcacity: 0.000
    AvailCapacity: 12.719 GB
    TotalCapacity: 295.167 GB
    UsedPct: 95.69 %
    MaxDiskUsedPct: 95.69 %
    RemoteUsedCapacity: 0.000
    Tag: {"location" : "default"}
    ErrMsg:
    Version: doris-2.1.0
    Status: {"lastSuccessReportTabletsTime":"2024-02-19 09:31:48","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
    HeartbeatFailureCounter: 0
    NodeRole: mix
    *************************** 2. row ***************************
    BackendId: 10003
    Host: doriscluster-sample-be-1.doriscluster-sample-be-internal.doris.svc.cluster.local
    HeartbeatPort: 9050
    BePort: 9060
    HttpPort: 8040
    BrpcPort: 8060
    ArrowFlightSqlPort: -1
    LastStartTime: 2024-02-19 06:37:35
    LastHeartbeat: 2024-02-19 09:32:43
    Alive: true
    SystemDecommissioned: false
    TabletNum: 8
    DataUsedCapacity: 0.000
    TrashUsedCapcacity: 0.000
    AvailCapacity: 12.719 GB
    TotalCapacity: 295.167 GB
    UsedPct: 95.69 %
    MaxDiskUsedPct: 95.69 %
    RemoteUsedCapacity: 0.000
    Tag: {"location" : "default"}
    ErrMsg:
    Version: doris-2.1.0
    Status: {"lastSuccessReportTabletsTime":"2024-02-19 09:31:43","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
    HeartbeatFailureCounter: 0
    NodeRole: mix
    *************************** 3. row ***************************
    BackendId: 11024
    Host: doriscluster-sample-be-2.doriscluster-sample-be-internal.doris.svc.cluster.local
    HeartbeatPort: 9050
    BePort: 9060
    HttpPort: 8040
    BrpcPort: 8060
    ArrowFlightSqlPort: -1
    LastStartTime: 2024-02-19 08:50:36
    LastHeartbeat: 2024-02-19 09:32:43
    Alive: true
    SystemDecommissioned: false
    TabletNum: 0
    DataUsedCapacity: 0.000
    TrashUsedCapcacity: 0.000
    AvailCapacity: 12.719 GB
    TotalCapacity: 295.167 GB
    UsedPct: 95.69 %
    MaxDiskUsedPct: 95.69 %
    RemoteUsedCapacity: 0.000
    Tag: {"location" : "default"}
    ErrMsg:
    Version: doris-2.1.0
    Status: {"lastSuccessReportTabletsTime":"2024-02-19 09:32:04","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false}
    HeartbeatFailureCounter: 0
    NodeRole: mix
    3 rows in set (0.01 sec)

If the Alive status of the BE node is true and the Version value is the new version, the BE node is upgraded successfully.
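
The Alive and Version checks for both show frontends and show backends can be scripted by parsing the vertical output shown above. The following Python sketch uses hypothetical helper names (parse_vertical, not_upgraded) and assumes the text was captured from mysql-client in the same format as the samples in this document.

```python
def parse_vertical(output):
    """Parse mysql-client vertical-format output into a list of {column: value} dicts."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("***") and "row" in line:
            rows.append({})  # a row separator starts a new record
        elif rows and ":" in line:
            # Split on the first colon only, since values such as
            # timestamps and the Status JSON contain colons themselves.
            key, _, value = line.partition(":")
            rows[-1][key.strip()] = value.strip()
    return rows


def not_upgraded(rows, expected_version):
    """Return the Host of every node that is not alive or not on expected_version."""
    return [row.get("Host") for row in rows
            if row.get("Alive") != "true" or row.get("Version") != expected_version]
```

An empty list from not_upgraded(rows, "doris-2.1.0") corresponds to the success condition described above.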

Restore cluster replica synchronization and balancing

After confirming that the status of each node is correct, execute the following SQL to restore cluster balancing and replica repair:

    admin set frontend config("disable_balance" = "false");
    admin set frontend config("disable_colocate_balance" = "false");
    admin set frontend config("disable_tablet_scheduler" = "false");
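
Because the same three switches are turned off before the upgrade and back on afterwards, a script can generate both sets of statements from one list so that nothing is left disabled. The helper below is a hypothetical Python sketch that only builds the SQL text. The statements still have to be run through a MySQL client connection.

```python
# The three FE config switches named in this document.
SCHEDULER_SWITCHES = ("disable_balance", "disable_colocate_balance", "disable_tablet_scheduler")


def scheduler_statements(disabled):
    """Build the admin statements that pause (True) or resume (False) balancing and repair."""
    value = "true" if disabled else "false"
    return [f'admin set frontend config("{name}" = "{value}");' for name in SCHEDULER_SWITCHES]
```

Calling scheduler_statements(True) before the upgrade and scheduler_statements(False) after verification reproduces the two command lists in this document.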