serviceGroup

Edge application management - provides closed-loop, per-region management of edge applications together with gray release (canary) capability.

Background

Edge characteristics

  • In edge computing scenarios, a single cluster often manages multiple edge sites, each containing one or more compute nodes.
  • At the same time, each site is expected to run a group of services with business-logic relationships between them; the services inside a site form a complete unit of functionality that can serve users on its own.
  • Because of network constraints, services with such relationships should not, or cannot, call each other across sites.

Use case

serviceGroup makes it easy to deploy a group of services into each of the machine rooms or regions belonging to one cluster, so that requests between those services complete inside the local machine room or region instead of crossing region boundaries.

Native Kubernetes cannot control which specific nodes a Deployment's pods are created on; this can only be achieved indirectly through carefully planned node affinity. Once the number of edge sites and the number of services to deploy grow large, management and deployment become extremely complex, bordering on only theoretically possible.

At the same time, confining service-to-service calls to a given scope would require creating a dedicated Service for every Deployment. The management workload is huge, highly error-prone, and easily causes production incidents.

serviceGroup is designed for exactly this scenario. Using the three SuperEdge-specific Kubernetes resources provided by ServiceGroup (DeploymentGrid, StatefulSetGrid, and ServiceGrid), users can conveniently deploy services into these node groups and control service traffic, while also guaranteeing the replica count, and therefore the disaster tolerance, of the services in each region.

Key concepts

Overall architecture

serviceGroup - Figure 1

NodeUnit

  • A NodeUnit usually consists of one or more compute instances located in the same edge site; the nodes of one NodeUnit must be reachable from each other over the site's internal network.
  • The services of a ServiceGroup run within a single NodeUnit.
  • ServiceGroup lets users set the number of pods a service runs inside a NodeUnit.
  • ServiceGroup confines calls between its services to the local NodeUnit.

NodeGroup

  • A NodeGroup contains one or more NodeUnits.
  • It guarantees that the ServiceGroup's services are deployed on every NodeUnit in the group.
  • When a NodeUnit is added to the cluster, the ServiceGroup's services are deployed to the new NodeUnit automatically.

ServiceGroup

  • A ServiceGroup contains one or more business services. It fits scenarios where: 1) services need to be deployed and shipped as one package; 2) services need to run in every NodeUnit with a guaranteed pod count; or 3) calls between services must stay inside the local NodeUnit and traffic must not be forwarded to other NodeUnits.
  • Note: ServiceGroup is an abstract resource; multiple ServiceGroups can be created in one cluster.

Resource types involved

DeploymentGrid

The DeploymentGrid format is similar to Deployment: its template field is the original Deployment's spec. The distinctive field is gridUniqKey, which names the label key used for grouping nodes:

    apiVersion: superedge.io/v1
    kind: DeploymentGrid
    metadata:
      name:
      namespace:
    spec:
      gridUniqKey: <NodeLabel Key>
      <deployment-template>

StatefulSetGrid

The StatefulSetGrid format is similar to StatefulSet: its template field is the original StatefulSet's spec. The distinctive field is gridUniqKey, which names the label key used for grouping nodes:

    apiVersion: superedge.io/v1
    kind: StatefulSetGrid
    metadata:
      name:
      namespace:
    spec:
      gridUniqKey: <NodeLabel Key>
      <statefulset-template>

ServiceGrid

The ServiceGrid format is similar to Service: its template field is the original Service's spec. The distinctive field is gridUniqKey, which names the label key used for grouping nodes:

    apiVersion: superedge.io/v1
    kind: ServiceGrid
    metadata:
      name:
      namespace:
    spec:
      gridUniqKey: <NodeLabel Key>
      <service-template>

Procedure

Take deploying echo-service at the edge as an example. We want to deploy the echo-service service into several node groups, which requires the following steps.

Choose a unique ServiceGroup identifier

This step is logical planning only and involves no actual operation. The UniqKey we will use to mark the serviceGroup about to be created is: zone.

Group the edge nodes

This step uses kubectl to label the edge nodes.

For example, pick Node12 and Node14 and label them zone=nodeunit1; pick Node21 and Node23 and label them zone=nodeunit2.

Note: in the step above, the label key must equal the ServiceGroup's UniqKey. The value is the unique key of the NodeUnit; nodes with the same value belong to the same NodeUnit.

If there are multiple ServiceGroups in the same cluster, assign each ServiceGroup a different UniqKey.
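For the example nodes above, the labeling step can be sketched with kubectl (run against a live cluster; node names are the sample nodes from this walkthrough):

```shell
# Label the nodes of the first edge site: the key (zone) is the
# ServiceGroup UniqKey, the value names the NodeUnit.
kubectl label node node12 node14 zone=nodeunit1
# Label the nodes of the second edge site.
kubectl label node node21 node23 zone=nodeunit2
# Verify the grouping: list nodes that carry the zone label.
kubectl get nodes -l zone --show-labels
```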

Stateless ServiceGroup

Deploy the DeploymentGrid

    apiVersion: superedge.io/v1
    kind: DeploymentGrid
    metadata:
      name: deploymentgrid-demo
      namespace: default
    spec:
      gridUniqKey: zone
      template:
        replicas: 2
        selector:
          matchLabels:
            appGrid: echo
        strategy: {}
        template:
          metadata:
            creationTimestamp: null
            labels:
              appGrid: echo
          spec:
            containers:
            - image: superedge/echoserver:2.2
              name: echo
              ports:
              - containerPort: 8080
                protocol: TCP
              env:
              - name: NODE_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: spec.nodeName
              - name: POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              - name: POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
              - name: POD_IP
                valueFrom:
                  fieldRef:
                    fieldPath: status.podIP
              resources: {}

Deploy the ServiceGrid

    apiVersion: superedge.io/v1
    kind: ServiceGrid
    metadata:
      name: servicegrid-demo
      namespace: default
    spec:
      gridUniqKey: zone
      template:
        selector:
          appGrid: echo
        ports:
        - protocol: TCP
          port: 80
          targetPort: 8080

Because gridUniqKey is set to zone, we used zone as the label key when grouping the nodes. With three node groups, simply label them zone: zone-0, zone: zone-1, and zone: zone-2 respectively. At that point, each node group has its own echo-service Deployment and pods, and accessing the common service name inside a group sends requests only to the nodes of that group:

    [~]# kubectl get dg
    NAME                  AGE
    deploymentgrid-demo   85s
    [~]# kubectl get deploy
    NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
    deploymentgrid-demo-zone-0   2/2     2            2           85s
    deploymentgrid-demo-zone-1   2/2     2            2           85s
    deploymentgrid-demo-zone-2   2/2     2            2           85s
    [~]# kubectl get svc
    NAME                   TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
    kubernetes             ClusterIP   172.19.0.1     <none>        443/TCP   87m
    servicegrid-demo-svc   ClusterIP   172.19.0.177   <none>        80/TCP    80s
    # execute on zone-0 nodeunit
    [~]# curl 172.19.0.177|grep "node name"
    node name: node0
    ...
    # execute on zone-1 nodeunit
    [~]# curl 172.19.0.177|grep "node name"
    node name: node1
    ...
    # execute on zone-2 nodeunit
    [~]# curl 172.19.0.177|grep "node name"
    node name: node2
    ...

In addition, if a node group is added to the cluster only after the DeploymentGrid and ServiceGrid have been deployed, this feature automatically creates the specified Deployment inside the new node group.
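For instance, bringing a hypothetical third site online (nodes node31 and node32 are made-up names) is just a labeling operation; the grid controller then reconciles the new NodeUnit:

```shell
# A new edge site joins the cluster after the grids already exist:
kubectl label node node31 node32 zone=zone-3
# The controller creates the per-unit workload for the new group, e.g.:
kubectl get deploy deploymentgrid-demo-zone-3
```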

Stateful ServiceGroup

Deploy the StatefulSetGrid

    apiVersion: superedge.io/v1
    kind: StatefulSetGrid
    metadata:
      name: statefulsetgrid-demo
      namespace: default
    spec:
      gridUniqKey: zone
      template:
        selector:
          matchLabels:
            appGrid: echo
        serviceName: "servicegrid-demo-svc"
        replicas: 3
        template:
          metadata:
            labels:
              appGrid: echo
          spec:
            terminationGracePeriodSeconds: 10
            containers:
            - image: superedge/echoserver:2.2
              name: echo
              ports:
              - containerPort: 8080
                protocol: TCP
              env:
              - name: NODE_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: spec.nodeName
              - name: POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              - name: POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
              - name: POD_IP
                valueFrom:
                  fieldRef:
                    fieldPath: status.podIP
              resources: {}

Note: set serviceName in the template to the name of the Service that is about to be created.

Deploy the ServiceGrid

    apiVersion: superedge.io/v1
    kind: ServiceGrid
    metadata:
      name: servicegrid-demo
      namespace: default
    spec:
      gridUniqKey: zone
      template:
        selector:
          appGrid: echo
        ports:
        - protocol: TCP
          port: 80
          targetPort: 8080

Because gridUniqKey is set to zone, we used zone as the label key when grouping the nodes. With three node groups, simply label them zone: zone-0, zone: zone-1, and zone: zone-2 respectively. At that point, each node group has its own echo-service StatefulSet and pods, and accessing the common service name inside a group sends requests only to the nodes of that group:

    [~]# kubectl get ssg
    NAME                   AGE
    statefulsetgrid-demo   21h
    [~]# kubectl get statefulset
    NAME                          READY   AGE
    statefulsetgrid-demo-zone-0   3/3     21h
    statefulsetgrid-demo-zone-1   3/3     21h
    statefulsetgrid-demo-zone-2   3/3     21h
    [~]# kubectl get svc
    NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
    kubernetes             ClusterIP   192.168.0.1     <none>        443/TCP   22h
    servicegrid-demo-svc   ClusterIP   192.168.21.99   <none>        80/TCP    21h
    # execute on zone-0 nodeunit
    [~]# curl 192.168.21.99|grep "node name"
    node name: node0
    ...
    # execute on zone-1 nodeunit
    [~]# curl 192.168.21.99|grep "node name"
    node name: node1
    ...
    # execute on zone-2 nodeunit
    [~]# curl 192.168.21.99|grep "node name"
    node name: node2
    ...

Note: when accessing the group-local workload through the Service inside a NodeUnit, the corresponding clusterIP must not be set to None; closed-loop access is not yet supported for that case.

Besides accessing the StatefulSet workload through the Service, StatefulSetGrid also supports access through a headless service, as shown below:

serviceGroup - Figure 2

StatefulSetGrid provides a unified headless service access form that hides the NodeUnit, as follows:

    {StatefulSetGrid}-{0..N-1}.{ServiceGrid}-svc.ns.svc.cluster.local

Each such address maps to a concrete pod of the local NodeUnit:

    {StatefulSetGrid}-{NodeUnit}-{0..N-1}.{ServiceGrid}-svc.ns.svc.cluster.local

Through the same headless service, each NodeUnit reaches only the pods inside its own group. That is, NodeUnit zone-1 reaches the pods of statefulsetgrid-demo-zone-1 (a StatefulSet), while NodeUnit zone-2 reaches the pods of statefulsetgrid-demo-zone-2:
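The naming rule above can be sketched as plain string composition (names taken from the demo; the actual resolution is performed by the in-cluster DNS):

```shell
# Compose both forms of the headless-service domain name for one pod.
ssg="statefulsetgrid-demo"   # {StatefulSetGrid}
unit="zone-1"                # {NodeUnit} the client runs in
ordinal=0                    # {0..N-1}
svc="servicegrid-demo-svc"   # {ServiceGrid}-svc
ns="default"

# Unified form used by clients in any NodeUnit:
unified="${ssg}-${ordinal}.${svc}.${ns}.svc.cluster.local"
# Concrete pod it resolves to for clients inside zone-1:
actual="${ssg}-${unit}-${ordinal}.${svc}.${ns}.svc.cluster.local"

echo "${unified} -> ${actual}"
```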

    # execute on zone-0 nodeunit
    [~]# curl statefulsetgrid-demo-0.servicegrid-demo-svc.default.svc.cluster.local|grep "pod name"
    pod name: statefulsetgrid-demo-zone-0-0
    [~]# curl statefulsetgrid-demo-1.servicegrid-demo-svc.default.svc.cluster.local|grep "pod name"
    pod name: statefulsetgrid-demo-zone-0-1
    [~]# curl statefulsetgrid-demo-2.servicegrid-demo-svc.default.svc.cluster.local|grep "pod name"
    pod name: statefulsetgrid-demo-zone-0-2
    ...
    # execute on zone-1 nodeunit
    [~]# curl statefulsetgrid-demo-0.servicegrid-demo-svc.default.svc.cluster.local|grep "pod name"
    pod name: statefulsetgrid-demo-zone-1-0
    [~]# curl statefulsetgrid-demo-1.servicegrid-demo-svc.default.svc.cluster.local|grep "pod name"
    pod name: statefulsetgrid-demo-zone-1-1
    [~]# curl statefulsetgrid-demo-2.servicegrid-demo-svc.default.svc.cluster.local|grep "pod name"
    pod name: statefulsetgrid-demo-zone-1-2
    ...
    # execute on zone-2 nodeunit
    [~]# curl statefulsetgrid-demo-0.servicegrid-demo-svc.default.svc.cluster.local|grep "pod name"
    pod name: statefulsetgrid-demo-zone-2-0
    [~]# curl statefulsetgrid-demo-1.servicegrid-demo-svc.default.svc.cluster.local|grep "pod name"
    pod name: statefulsetgrid-demo-zone-2-1
    [~]# curl statefulsetgrid-demo-2.servicegrid-demo-svc.default.svc.cluster.local|grep "pod name"
    pod name: statefulsetgrid-demo-zone-2-2
    ...

Gray release by NodeUnit

Both DeploymentGrid and StatefulSetGrid support gray release on a per-NodeUnit basis.

Key fields

The fields related to gray release are: autoDeleteUnusedTemplate, templatePool, templates, and defaultTemplateName.

  • templatePool: the set of templates available for gray release.
  • templates: the mapping from a NodeUnit to the template in templatePool that it uses; a NodeUnit with no entry uses the template named by defaultTemplateName.
  • defaultTemplateName: the template used by default; if unset or set to "default", spec.template is used.
  • autoDeleteUnusedTemplate: defaults to false; if set to true, templates in templatePool that are referenced neither in templates nor in spec.template are deleted automatically.

Create workloads with the same template

This is exactly the same as the DeploymentGrid and StatefulSetGrid examples above: if the gray release feature is not needed, no extra fields have to be added.

Create workloads with different templates

    apiVersion: superedge.io/v1
    kind: DeploymentGrid
    metadata:
      name: deploymentgrid-demo
      namespace: default
    spec:
      defaultTemplateName: test1
      gridUniqKey: zone
      template:
        replicas: 1
        selector:
          matchLabels:
            appGrid: echo
        strategy: {}
        template:
          metadata:
            creationTimestamp: null
            labels:
              appGrid: echo
          spec:
            containers:
            - image: superedge/echoserver:2.2
              name: echo
              ports:
              - containerPort: 8080
                protocol: TCP
              env:
              - name: NODE_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: spec.nodeName
              - name: POD_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.name
              - name: POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
              - name: POD_IP
                valueFrom:
                  fieldRef:
                    fieldPath: status.podIP
              resources: {}
      templatePool:
        test1:
          replicas: 2
          selector:
            matchLabels:
              appGrid: echo
          strategy: {}
          template:
            metadata:
              creationTimestamp: null
              labels:
                appGrid: echo
            spec:
              containers:
              - image: superedge/echoserver:2.2
                name: echo
                ports:
                - containerPort: 8080
                  protocol: TCP
                env:
                - name: NODE_NAME
                  valueFrom:
                    fieldRef:
                      fieldPath: spec.nodeName
                - name: POD_NAME
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.name
                - name: POD_NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
                resources: {}
        test2:
          replicas: 3
          selector:
            matchLabels:
              appGrid: echo
          strategy: {}
          template:
            metadata:
              creationTimestamp: null
              labels:
                appGrid: echo
            spec:
              containers:
              - image: superedge/echoserver:2.3
                name: echo
                ports:
                - containerPort: 8080
                  protocol: TCP
                env:
                - name: NODE_NAME
                  valueFrom:
                    fieldRef:
                      fieldPath: spec.nodeName
                - name: POD_NAME
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.name
                - name: POD_NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
                resources: {}
      templates:
        zone1: test1
        zone2: test2

In this example, NodeUnit zone1 uses the test1 template and NodeUnit zone2 uses the test2 template, while all remaining NodeUnits use the template named by defaultTemplateName, in this case test1.
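The selection behavior can be sketched as a small lookup (zone and template names come from the example spec; zone42 is a made-up unit with no templates entry, and the real reconciliation is done by the DeploymentGrid controller):

```shell
# Which template does a given NodeUnit get?
default_template="test1"     # spec.defaultTemplateName

resolve_template() {
  case "$1" in
    zone1) echo "test1" ;;                # spec.templates: zone1 -> test1
    zone2) echo "test2" ;;                # spec.templates: zone2 -> test2
    *)     echo "${default_template}" ;;  # no entry: fall back to default
  esac
}

resolve_template zone1    # prints test1
resolve_template zone2    # prints test2
resolve_template zone42   # prints test1 (no entry in templates)
```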


Last modified June 15, 2021: initial commit (974355a)