This document describes how to check the status of your HPAs after scaling them up or down with your load testing tool. For information on how to check the status from the Rancher UI (at least version 2.3.x), refer to Managing HPAs with the Rancher UI.

    For HPA to work correctly, the containers in your service deployments must define resource requests. Follow this hello-world example to test whether HPA is working correctly.

    1. Configure kubectl to connect to your Kubernetes cluster.

    2. Copy the hello-world deployment manifest below.

      Hello World Manifest

      apiVersion: apps/v1beta2
      kind: Deployment
      metadata:
        labels:
          app: hello-world
        name: hello-world
        namespace: default
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: hello-world
        strategy:
          rollingUpdate:
            maxSurge: 1
            maxUnavailable: 0
          type: RollingUpdate
        template:
          metadata:
            labels:
              app: hello-world
          spec:
            containers:
            - image: rancher/hello-world
              imagePullPolicy: Always
              name: hello-world
              resources:
                requests:
                  cpu: 500m
                  memory: 64Mi
              ports:
              - containerPort: 80
                protocol: TCP
            restartPolicy: Always
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: hello-world
        namespace: default
      spec:
        ports:
        - port: 80
          protocol: TCP
          targetPort: 80
        selector:
          app: hello-world

    3. Deploy it to your cluster.

      # kubectl create -f <HELLO_WORLD_MANIFEST>

    4. Copy one of the HPAs below based on the metric type you’re using:

      Hello World HPA: Resource Metrics

      apiVersion: autoscaling/v2beta1
      kind: HorizontalPodAutoscaler
      metadata:
        name: hello-world
        namespace: default
      spec:
        scaleTargetRef:
          apiVersion: extensions/v1beta1
          kind: Deployment
          name: hello-world
        minReplicas: 1
        maxReplicas: 10
        metrics:
        - type: Resource
          resource:
            name: cpu
            targetAverageUtilization: 50
        - type: Resource
          resource:
            name: memory
            targetAverageValue: 100Mi

      Hello World HPA: Custom Metrics

      apiVersion: autoscaling/v2beta1
      kind: HorizontalPodAutoscaler
      metadata:
        name: hello-world
        namespace: default
      spec:
        scaleTargetRef:
          apiVersion: extensions/v1beta1
          kind: Deployment
          name: hello-world
        minReplicas: 1
        maxReplicas: 10
        metrics:
        - type: Resource
          resource:
            name: cpu
            targetAverageUtilization: 50
        - type: Resource
          resource:
            name: memory
            targetAverageValue: 100Mi
        - type: Pods
          pods:
            metricName: cpu_system
            targetAverageValue: 20m

    5. View the HPA info and description. Confirm that metric data is shown.

      Resource Metrics

      1. Enter the following commands.

        # kubectl get hpa
        NAME         REFERENCE               TARGETS                    MINPODS  MAXPODS  REPLICAS  AGE
        hello-world  Deployment/hello-world  1253376 / 100Mi, 0% / 50%  1        10       1         6m

        # kubectl describe hpa
        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Mon, 23 Jul 2018 20:21:16 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 1253376 / 100Mi
          resource cpu on pods (as a percentage of request): 0% (0) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    ReadyForNewScale    the last scale time was sufficiently old as to warrant a new scale
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events: <none>
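
The memory row mixes units: the current value is reported in bytes, while the target uses the binary suffix `Mi` (1 Mi = 1024 × 1024 bytes). As a rough illustration, the shell sketch below converts a `Mi` quantity to bytes so the two sides can be compared directly. `mi_to_bytes` is a hypothetical helper name, not a kubectl feature, and it only handles whole-number `Mi` values.

```shell
# Hypothetical helper (not part of kubectl): convert a whole-number
# Mi quantity such as "100Mi" to bytes. 1 Mi = 1024 * 1024 bytes.
mi_to_bytes() {
  quantity="${1%Mi}"                  # strip the "Mi" suffix
  echo $(( quantity * 1024 * 1024 ))
}

target_bytes=$(mi_to_bytes 100Mi)     # target shown in the sample output
current_bytes=1253376                 # current value shown in the sample output

if [ "$current_bytes" -lt "$target_bytes" ]; then
  echo "memory below target: $current_bytes < $target_bytes"
fi
```

With the sample values, current usage is a small fraction of the target, so memory alone would not trigger a scale-up.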

      Custom Metrics

      1. Enter the following command.

        # kubectl describe hpa

        You should receive the output that follows.

        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Tue, 24 Jul 2018 18:36:28 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 3514368 / 100Mi
          "cpu_system" on pods: 0 / 20m
          resource cpu on pods (as a percentage of request): 0% (0) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    ReadyForNewScale    the last scale time was sufficiently old as to warrant a new scale
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events: <none>

    6. Generate a load for the service to test that your pods autoscale as intended. You can use any load-testing tool (Hey, Gatling, etc.), but we’re using Hey.
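
A minimal sketch of a load run with Hey, assuming the `hey` binary is installed and the hello-world Service is reachable at the URL you pass in. The wrapper name `run_load`, the `HEY_BIN` override, and the default duration and concurrency are illustrative assumptions; `-z` (duration) and `-c` (concurrent workers) are Hey's own flags.

```shell
# Hypothetical wrapper around Hey. Arguments:
#   $1 = URL of the service under test (required)
#   $2 = test duration, passed to hey -z (assumed default: 2m)
#   $3 = concurrent workers, passed to hey -c (assumed default: 50)
# HEY_BIN can point at a different binary; it defaults to "hey".
run_load() {
  url="$1"
  duration="${2:-2m}"
  workers="${3:-50}"
  "${HEY_BIN:-hey}" -z "$duration" -c "$workers" "$url"
}
```

For example, `run_load http://<SERVICE_ADDRESS>/ 5m 100` would send requests for five minutes from 100 workers. Raise the duration or concurrency until the CPU target in the HPA is exceeded.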

    7. Test that pod autoscaling works as intended.

      To Test Autoscaling Using Resource Metrics:

      Upscale to 2 Pods: CPU Usage Up to Target

      Use your load testing tool to scale up to two pods based on CPU Usage.

      1. View your HPA.

        # kubectl describe hpa

        You should receive output similar to what follows.

        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Mon, 23 Jul 2018 22:22:04 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 10928128 / 100Mi
          resource cpu on pods (as a percentage of request): 56% (280m) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age  From                       Message
          ----    ------             ---  ----                       -------
          Normal  SuccessfulRescale  13s  horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target

      2. Enter the following command to confirm you’ve scaled to two pods.

        # kubectl get pods

        You should receive output similar to what follows:

        NAME                          READY  STATUS   RESTARTS  AGE
        hello-world-54764dfbf8-k8ph2  1/1    Running  0         1m
        hello-world-54764dfbf8-q6l4v  1/1    Running  0         3h
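
The CPU line in the describe output reports usage relative to the container's CPU request, which is why the Deployment must declare resource requests. With the 500m request from the hello-world manifest, 280m of usage works out to 56%. The sketch below reproduces that arithmetic; `cpu_utilization_pct` is a hypothetical helper, and truncating to a whole percent is an assumption chosen to match the sample outputs.

```shell
# Hypothetical helper: CPU usage as a percentage of the CPU request.
# Both arguments are in millicores (for example, 280 means 280m).
# Integer division truncates the result to a whole percent.
cpu_utilization_pct() {
  usage_m="$1"
  request_m="$2"
  echo $(( usage_m * 100 / request_m ))
}

cpu_utilization_pct 280 500   # prints 56, above the 50% target
cpu_utilization_pct 333 500   # prints 66
```

Without a CPU request there is no denominator for this calculation, so a utilization target cannot be evaluated.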

      Upscale to 3 Pods: CPU Usage Up to Target

      Use your load testing tool to upscale to 3 pods based on CPU usage with horizontal-pod-autoscaler-upscale-delay set to 3 minutes.

      1. Enter the following command.

        # kubectl describe hpa

        You should receive output similar to what follows.

        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Mon, 23 Jul 2018 22:22:04 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 9424896 / 100Mi
          resource cpu on pods (as a percentage of request): 66% (333m) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 3
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age  From                       Message
          ----    ------             ---  ----                       -------
          Normal  SuccessfulRescale  4m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  16s  horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target

      2. Enter the following command to confirm three pods are running.

        # kubectl get pods

        You should receive output similar to what follows.

        NAME                          READY  STATUS   RESTARTS  AGE
        hello-world-54764dfbf8-f46kh  0/1    Running  0         1m
        hello-world-54764dfbf8-k8ph2  1/1    Running  0         5m
        hello-world-54764dfbf8-q6l4v  1/1    Running  0         3h
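
The replica counts in these events follow the core formula from the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with no change when the ratio is within the default tolerance of 0.1. The sketch below is a simplified model of that formula only; `desired_replicas` is a hypothetical helper, and the real controller also accounts for pod readiness, missing metrics, and scaling delays.

```shell
# Hypothetical helper modeling the core HPA formula.
# Arguments: current replicas, current metric value, target metric value,
# then optional minReplicas (default 1) and maxReplicas (default 10).
desired_replicas() {
  awk -v replicas="$1" -v current="$2" -v target="$3" \
      -v min="${4:-1}" -v max="${5:-10}" 'BEGIN {
    tolerance = 0.1                      # default controller tolerance
    ratio = current / target
    diff = (ratio > 1) ? ratio - 1 : 1 - ratio
    if (diff <= tolerance) { print replicas; exit }   # close enough: no change
    raw = replicas * ratio
    desired = int(raw)
    if (desired < raw) desired++         # round up (ceiling)
    if (desired < min) desired = min     # clamp to minReplicas
    if (desired > max) desired = max     # clamp to maxReplicas
    print desired
  }'
}

desired_replicas 1 56 50   # 1 pod at 56% against a 50% target: prints 2
desired_replicas 2 66 50   # 2 pods at 66% against a 50% target: prints 3
```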

      Downscale to 1 Pod: All Metrics Below Target

      Use your load testing tool to scale down to 1 pod when all metrics are below target for horizontal-pod-autoscaler-downscale-delay (5 minutes by default).

      1. Enter the following command.

        # kubectl describe hpa

        You should receive output similar to what follows.

        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Mon, 23 Jul 2018 22:22:04 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 10070016 / 100Mi
          resource cpu on pods (as a percentage of request): 0% (0) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 1
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age  From                       Message
          ----    ------             ---  ----                       -------
          Normal  SuccessfulRescale  10m  horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  6m   horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  1s   horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
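
The upscale and downscale delays act as cool-down windows. In the simplified model sketched below, a rescale is allowed only once the configured delay has passed since the previous rescale. This is an illustration under that assumption, not the controller's actual code; `rescale_allowed` is a hypothetical helper, and the exact behavior depends on your Kubernetes version.

```shell
# Hypothetical helper: succeeds when the cool-down window has elapsed.
#   $1 = seconds since the previous rescale
#   $2 = configured delay in seconds (300 for a 5-minute downscale delay)
rescale_allowed() {
  seconds_since_last="$1"
  delay_seconds="$2"
  [ "$seconds_since_last" -ge "$delay_seconds" ]
}

# About 6 minutes after the previous rescale, with a 5-minute delay.
if rescale_allowed 360 300; then
  echo "downscale allowed"
fi

# Only 2 minutes after the previous rescale.
if ! rescale_allowed 120 300; then
  echo "downscale blocked"
fi
```

This matches the sample events: the scale-down to one pod appears roughly six minutes after the previous rescale, which is past the five-minute default.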

      **To Test Autoscaling Using Custom Metrics:**

      Upscale to 2 Pods: CPU Usage Up to Target

      Use your load testing tool to upscale to two pods based on CPU usage.

      1. Enter the following command.

        ```
        # kubectl describe hpa
        ```

        You should receive output similar to what follows.

        ```
        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 8159232 / 100Mi
          "cpu_system" on pods: 7m / 20m
          resource cpu on pods (as a percentage of request): 64% (321m) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age  From                       Message
          ----    ------             ---  ----                       -------
          Normal  SuccessfulRescale  16s  horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
        ```

      2. Enter the following command to confirm two pods are running.

        ```
        # kubectl get pods
        ```

        You should receive output similar to what follows.

        ```
        NAME                          READY  STATUS   RESTARTS  AGE
        hello-world-54764dfbf8-5pfdr  1/1    Running  0         3s
        hello-world-54764dfbf8-q6l82  1/1    Running  0         6h
        ```

      Upscale to 3 Pods: CPU Usage Up to Target

      Use your load testing tool to scale up to three pods when `cpu_system` usage is up to target.

      1. Enter the following command.

        ```
        # kubectl describe hpa
        ```

        You should receive output similar to what follows:

        ```
        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 8374272 / 100Mi
          "cpu_system" on pods: 27m / 20m
          resource cpu on pods (as a percentage of request): 71% (357m) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 3
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age  From                       Message
          ----    ------             ---  ----                       -------
          Normal  SuccessfulRescale  3m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  3s   horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
        ```

      2. Enter the following command to confirm three pods are running.

        ```
        # kubectl get pods
        ```

        You should receive output similar to what follows:

        ```
        NAME                          READY  STATUS   RESTARTS  AGE
        hello-world-54764dfbf8-5pfdr  1/1    Running  0         3m
        hello-world-54764dfbf8-m2hrl  1/1    Running  0         1s
        hello-world-54764dfbf8-q6l82  1/1    Running  0         6h
        ```
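
When an HPA lists several metrics, Kubernetes computes a proposed replica count for each metric and uses the largest one. The sketch below illustrates that selection with the values from the sample output (two current replicas, CPU at 71% against a 50% target, `cpu_system` at 27m against 20m, memory at 8374272 bytes against 100Mi). The helper names are hypothetical, and the per-metric formula is the simplified ceil(currentReplicas × current / target), without tolerance or readiness handling.

```shell
# Hypothetical helper: proposed replica count for a single metric.
# Arguments: current replicas, current metric value, target metric value.
proposed_replicas() {
  awk -v replicas="$1" -v current="$2" -v target="$3" 'BEGIN {
    raw = replicas * current / target
    n = int(raw)
    if (n < raw) n++                 # round up (ceiling)
    print n
  }'
}

# Hypothetical helper: largest of the integers passed as arguments.
max_of() {
  largest="$1"
  shift
  for n in "$@"; do
    if [ "$n" -gt "$largest" ]; then
      largest="$n"
    fi
  done
  echo "$largest"
}

from_cpu=$(proposed_replicas 2 71 50)                  # 3
from_cpu_system=$(proposed_replicas 2 27 20)           # 3
from_memory=$(proposed_replicas 2 8374272 104857600)   # 1

max_of "$from_cpu" "$from_cpu_system" "$from_memory"   # prints 3
```

Memory alone would propose a single replica here, but the two CPU-related metrics both propose three, so three wins.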

      Upscale to 4 Pods: CPU Usage Up to Target

      Use your load testing tool to upscale to four pods based on CPU usage. `horizontal-pod-autoscaler-upscale-delay` is set to three minutes by default.

      1. Enter the following command.

        ```
        # kubectl describe hpa
        ```

        You should receive output similar to what follows.

        ```
        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 8374272 / 100Mi
          "cpu_system" on pods: 27m / 20m
          resource cpu on pods (as a percentage of request): 71% (357m) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 3
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age  From                       Message
          ----    ------             ---  ----                       -------
          Normal  SuccessfulRescale  5m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  3m   horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
          Normal  SuccessfulRescale  4s   horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
        ```

      2. Enter the following command to confirm four pods are running.

        ```
        # kubectl get pods
        ```

        You should receive output similar to what follows.

        ```
        NAME                          READY  STATUS   RESTARTS  AGE
        hello-world-54764dfbf8-2p9xb  1/1    Running  0         5m
        hello-world-54764dfbf8-5pfdr  1/1    Running  0         2m
        hello-world-54764dfbf8-m2hrl  1/1    Running  0         1s
        hello-world-54764dfbf8-q6l82  1/1    Running  0         6h
        ```

      Downscale to 1 Pod: All Metrics Below Target

      Use your load testing tool to scale down to one pod when all metrics are below target for `horizontal-pod-autoscaler-downscale-delay`.

      1. Enter the following command.

        ```
        # kubectl describe hpa
        ```

        You should receive output similar to what follows.

        ```
        Name: hello-world
        Namespace: default
        Labels: <none>
        Annotations: <none>
        CreationTimestamp: Tue, 24 Jul 2018 18:01:11 +0200
        Reference: Deployment/hello-world
        Metrics: ( current / target )
          resource memory on pods: 8101888 / 100Mi
          "cpu_system" on pods: 8m / 20m
          resource cpu on pods (as a percentage of request): 0% (0) / 50%
        Min replicas: 1
        Max replicas: 10
        Conditions:
          Type            Status  Reason              Message
          ----            ------  ------              -------
          AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 1
          ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
          ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
        Events:
          Type    Reason             Age  From                       Message
          ----    ------             ---  ----                       -------
          Normal  SuccessfulRescale  10m  horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  8m   horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
          Normal  SuccessfulRescale  5m   horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
          Normal  SuccessfulRescale  13s  horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
        ```

      2. Enter the following command to confirm a single pod is running.

        ```
        # kubectl get pods
        ```

        You should receive output similar to what follows.

        ```
        NAME                          READY  STATUS   RESTARTS  AGE
        hello-world-54764dfbf8-q6l82  1/1    Running  0         6h
        ```