Troubleshooting

Policy Tracing

If Cilium is allowing or denying connections in a way that does not align with the intent of your Cilium network policy, there is an easy way to verify whether and which policy rules apply between two endpoints. We can use the cilium policy trace command to simulate a policy decision between the source and destination endpoints.

We will use the example from the Minikube Getting Started Guide to trace the policy. In this example, there is:

  • deathstar service identified by labels: org=empire, class=deathstar. The service is backed by two pods.
  • tiefighter spaceship client pod with labels: org=empire, class=tiefighter
  • xwing spaceship client pod with labels: org=alliance, class=xwing

An L3/L4 policy is enforced on the deathstar service to allow access from all spaceships with the label org=empire. With this policy, tiefighter access is allowed but xwing access is denied. Let's use cilium policy trace to simulate the policy decision. The command can be run using pod names, labels, or Cilium security identities.

Note

If the --dport option is not specified, then L4 policy will not be consulted in this policy trace command.

Currently, there is no support for tracing L7 policies via this tool.

  # Policy trace using pod name and service labels
  $ kubectl exec -ti cilium-88k78 -n kube-system -- cilium policy trace --src-k8s-pod default:xwing -d any:class=deathstar,k8s:org=empire,k8s:io.kubernetes.pod.namespace=default --dport 80
  level=info msg="Waiting for k8s api-server to be ready..." subsys=k8s
  level=info msg="Connected to k8s api-server" ipAddr="https://10.96.0.1:443" subsys=k8s
  ----------------------------------------------------------------
  Tracing From: [k8s:class=xwing, k8s:io.cilium.k8s.policy.serviceaccount=default, k8s:io.kubernetes.pod.namespace=default, k8s:org=alliance] => To: [any:class=deathstar, k8s:org=empire, k8s:io.kubernetes.pod.namespace=default] Ports: [80/ANY]
  Resolving ingress policy for [any:class=deathstar k8s:org=empire k8s:io.kubernetes.pod.namespace=default]
  * Rule {"matchLabels":{"any:class":"deathstar","any:org":"empire","k8s:io.kubernetes.pod.namespace":"default"}}: selected
      Allows from labels {"matchLabels":{"any:org":"empire","k8s:io.kubernetes.pod.namespace":"default"}}
        Labels [k8s:class=xwing k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default k8s:org=alliance] not found
  1/1 rules selected
  Found no allow rule
  Ingress verdict: denied
  Final verdict: DENIED
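
For comparison, the same trace from the tiefighter pod (which carries org=empire) should be allowed by the policy. A minimal sketch of the command, reusing the pod and label names from above (output elided; the trace should end with a final verdict of ALLOWED):

  # Policy trace for the allowed tiefighter client (sketch; output elided)
  $ kubectl exec -ti cilium-88k78 -n kube-system -- cilium policy trace --src-k8s-pod default:tiefighter -d any:class=deathstar,k8s:org=empire,k8s:io.kubernetes.pod.namespace=default --dport 80
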
  # Get the Cilium security id
  $ kubectl exec -ti cilium-88k78 -n kube-system -- cilium endpoint list | egrep 'deathstar|xwing|tiefighter'
  ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])   IPv6                 IPv4            STATUS
             ENFORCEMENT        ENFORCEMENT
  568        Enabled            Disabled          22133      k8s:class=deathstar           f00d::a0f:0:0:238    10.15.65.193    ready
  900        Enabled            Disabled          22133      k8s:class=deathstar           f00d::a0f:0:0:384    10.15.114.17    ready
  33633      Disabled           Disabled          53208      k8s:class=xwing               f00d::a0f:0:0:8361   10.15.151.230   ready
  38654      Disabled           Disabled          22962      k8s:class=tiefighter          f00d::a0f:0:0:96fe   10.15.88.156    ready

  # Policy trace using Cilium security ids
  $ kubectl exec -ti cilium-88k78 -n kube-system -- cilium policy trace --src-identity 53208 --dst-identity 22133 --dport 80
  ----------------------------------------------------------------
  Tracing From: [k8s:class=xwing, k8s:io.cilium.k8s.policy.serviceaccount=default, k8s:io.kubernetes.pod.namespace=default, k8s:org=alliance] => To: [any:class=deathstar, k8s:org=empire, k8s:io.kubernetes.pod.namespace=default] Ports: [80/ANY]
  Resolving ingress policy for [any:class=deathstar k8s:org=empire k8s:io.kubernetes.pod.namespace=default]
  * Rule {"matchLabels":{"any:class":"deathstar","any:org":"empire","k8s:io.kubernetes.pod.namespace":"default"}}: selected
      Allows from labels {"matchLabels":{"any:org":"empire","k8s:io.kubernetes.pod.namespace":"default"}}
        Labels [k8s:class=xwing k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default k8s:org=alliance] not found
  1/1 rules selected
  Found no allow rule
  Ingress verdict: denied
  Final verdict: DENIED
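
As the note above explains, if --dport is omitted the trace only consults L3 policy. A sketch of the same identity-based trace restricted to L3 (output elided):

  # L3-only policy trace; without --dport, L4 rules are not consulted
  $ kubectl exec -ti cilium-88k78 -n kube-system -- cilium policy trace --src-identity 53208 --dst-identity 22133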

Policy Rule to Endpoint Mapping

To determine which policy rules are currently in effect for an endpoint, the data from cilium endpoint list and cilium endpoint get can be paired with the data from cilium policy get. cilium endpoint get lists the labels of each rule that applies to an endpoint. This list of labels can then be passed to cilium policy get to show the exact source policy. Note that rules without labels cannot be fetched on their own (cilium policy get with no labels returns the complete policy on the node). Rules with the same labels are returned together.

In the above example, the endpoint ID of one of the deathstar pods is 568. We can print all policies applied to it with:

  # Get a shell on the Cilium pod
  $ kubectl exec -ti cilium-88k78 -n kube-system -- /bin/bash

  # Print out the ingress labels, clean up the data, and fetch each policy
  # via each set of labels. (Note that while the structure is
  # "...l4.ingress...", it reflects all L3, L4 and L7 policy.)
  $ cilium endpoint get 568 -o jsonpath='{range ..status.policy.realized.l4.ingress[*].derived-from-rules}{@}{"\n"}{end}' | tr -d '][' | xargs -I{} bash -c 'echo "Labels: {}"; cilium policy get {}'
  Labels: k8s:io.cilium.k8s.policy.name=rule1 k8s:io.cilium.k8s.policy.namespace=default
  [
    {
      "endpointSelector": {
        "matchLabels": {
          "any:class": "deathstar",
          "any:org": "empire",
          "k8s:io.kubernetes.pod.namespace": "default"
        }
      },
      "ingress": [
        {
          "fromEndpoints": [
            {
              "matchLabels": {
                "any:org": "empire",
                "k8s:io.kubernetes.pod.namespace": "default"
              }
            }
          ],
          "toPorts": [
            {
              "ports": [
                {
                  "port": "80",
                  "protocol": "TCP"
                }
              ],
              "rules": {
                "http": [
                  {
                    "path": "/v1/request-landing",
                    "method": "POST"
                  }
                ]
              }
            }
          ]
        }
      ],
      "labels": [
        {
          "key": "io.cilium.k8s.policy.name",
          "value": "rule1",
          "source": "k8s"
        },
        {
          "key": "io.cilium.k8s.policy.namespace",
          "value": "default",
          "source": "k8s"
        }
      ]
    }
  ]
  Revision: 217

  # Repeat for egress
  $ cilium endpoint get 568 -o jsonpath='{range ..status.policy.realized.l4.egress[*].derived-from-rules}{@}{"\n"}{end}' | tr -d '][' | xargs -I{} bash -c 'echo "Labels: {}"; cilium policy get {}'
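
A rule's labels can also be passed to cilium policy get directly, which is exactly what the xargs pipeline above does for each set of labels. A minimal sketch using the labels printed above:

  # Fetch one specific rule by its labels
  $ cilium policy get k8s:io.cilium.k8s.policy.name=rule1 k8s:io.cilium.k8s.policy.namespace=default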

Troubleshooting toFQDNs rules

The effect of toFQDNs rules may change long after a policy is applied, as DNS data changes. This can make it difficult to debug unexpectedly blocked connections or transient failures. Cilium provides CLI tools to introspect the state of FQDN policy at multiple layers of the daemon:

  1. cilium policy get should show the FQDN policy that was imported:

     {
       "endpointSelector": {
         "matchLabels": {
           "any:class": "mediabot",
           "any:org": "empire",
           "k8s:io.kubernetes.pod.namespace": "default"
         }
       },
       "egress": [
         {
           "toFQDNs": [
             {
               "matchName": "api.twitter.com"
             }
           ]
         },
         {
           "toEndpoints": [
             {
               "matchLabels": {
                 "k8s:io.kubernetes.pod.namespace": "kube-system",
                 "k8s:k8s-app": "kube-dns"
               }
             }
           ],
           "toPorts": [
             {
               "ports": [
                 {
                   "port": "53",
                   "protocol": "ANY"
                 }
               ],
               "rules": {
                 "dns": [
                   {
                     "matchPattern": "*"
                   }
                 ]
               }
             }
           ]
         }
       ],
       "labels": [
         {
           "key": "io.cilium.k8s.policy.derived-from",
           "value": "CiliumNetworkPolicy",
           "source": "k8s"
         },
         {
           "key": "io.cilium.k8s.policy.name",
           "value": "fqdn",
           "source": "k8s"
         },
         {
           "key": "io.cilium.k8s.policy.namespace",
           "value": "default",
           "source": "k8s"
         },
         {
           "key": "io.cilium.k8s.policy.uid",
           "value": "fc9d6022-2ffa-4f72-b59e-b9067c3cfecf",
           "source": "k8s"
         }
       ]
     }
  2. After making a DNS request, the FQDN to IP mapping should be available via cilium fqdn cache list:

     # cilium fqdn cache list
     Endpoint   FQDN                 TTL      ExpirationTime             IPs
     2761       help.twitter.com.    604800   2019-07-16T17:57:38.179Z   104.244.42.67,104.244.42.195,104.244.42.3,104.244.42.131
     2761       api.twitter.com.     604800   2019-07-16T18:11:38.627Z   104.244.42.194,104.244.42.130,104.244.42.66,104.244.42.2
  3. If the traffic is allowed, then these IPs should have corresponding local identities via cilium identity list | grep <IP>:

     # cilium identity list | grep -A 1 104.244.42.194
     16777220   cidr:104.244.42.194/32
                reserved:world
  4. Given the identity of the traffic that should be allowed, the regular Policy Tracing steps can be used to validate that the policy is calculated correctly; see the sketch after this list.
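
For example, a minimal sketch of such a trace, assuming the CIDR identity 16777220 resolved above; <mediabot-identity> is a hypothetical placeholder for the mediabot endpoint's identity, and --dport 443 is an assumption for HTTPS traffic to api.twitter.com:

  # Sketch: trace from the mediabot endpoint to the identity of the resolved IP.
  # <mediabot-identity> is a placeholder; substitute the identity shown by
  # 'cilium endpoint list' for the mediabot pod.
  $ cilium policy trace --src-identity <mediabot-identity> --dst-identity 16777220 --dport 443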