Add Facilities to Chaos Daemon

In Develop a New Chaos, we added a new chaos type named HelloWorldChaos, which prints hello world in chaos-controller-manager. To actually run the chaos, we need to configure some facilities for Chaos Daemon so that chaos-controller-manager can select the specified Pods according to the chaos configuration and send the chaos request to the chaos-daemon instances corresponding to these Pods. Once this is done, chaos-daemon can finally run the chaos.

This guide covers the following steps:

  • Add selector for HelloWorldChaos
  • Implement the gRPC interface
  • Verify your chaos

Add selector for HelloWorldChaos

In Chaos Mesh, the spec.selector field specifies the scope of the chaos by namespace, labels, annotations, and so on. Refer to Define the Scope of Chaos Experiment for more information. To specify the Pods for HelloWorldChaos:

  1. Add the Spec field in HelloWorldChaos:

    ```
    // HelloWorldChaos is the Schema for the helloworldchaos API
    type HelloWorldChaos struct {
        metav1.TypeMeta   `json:",inline"`
        metav1.ObjectMeta `json:"metadata,omitempty"`

        // Spec defines the behavior of a pod chaos experiment
        Spec HelloWorldSpec `json:"spec"`
    }

    type HelloWorldSpec struct {
        Selector SelectorSpec `json:"selector"`
    }

    // GetSelector is a getter for Selector (for implementing SelectSpec)
    func (in *HelloWorldSpec) GetSelector() SelectorSpec {
        return in.Selector
    }
    ```

  2. Generate boilerplate functions for the Spec field. This is required to integrate the resource into Chaos Mesh.

    ```
    make generate
    ```
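
To make the idea of selecting Pods by scope more concrete, here is a minimal, self-contained sketch of how a selector could filter Pods by namespace and labels. It is not the Chaos Mesh implementation: the `SelectorSpec` and `Pod` types below are simplified, hypothetical stand-ins, and `SelectPods` is an illustrative helper rather than a Chaos Mesh function.

```go
package main

import "fmt"

// SelectorSpec is a simplified, hypothetical stand-in for the Chaos Mesh
// selector. Only namespaces and label selectors are modeled here.
type SelectorSpec struct {
	Namespaces     []string
	LabelSelectors map[string]string
}

// Pod is a hypothetical, minimal Pod representation used by this sketch.
type Pod struct {
	Name      string
	Namespace string
	Labels    map[string]string
}

// SelectPods returns the Pods that match every criterion set in the selector.
// A criterion that is left empty matches all Pods.
func SelectPods(pods []Pod, sel SelectorSpec) []Pod {
	var selected []Pod
	for _, p := range pods {
		if matchNamespace(p, sel.Namespaces) && matchLabels(p, sel.LabelSelectors) {
			selected = append(selected, p)
		}
	}
	return selected
}

// matchNamespace reports whether the Pod lives in one of the namespaces.
func matchNamespace(p Pod, namespaces []string) bool {
	if len(namespaces) == 0 {
		return true
	}
	for _, ns := range namespaces {
		if p.Namespace == ns {
			return true
		}
	}
	return false
}

// matchLabels reports whether the Pod carries every requested label value.
func matchLabels(p Pod, want map[string]string) bool {
	for k, v := range want {
		got, ok := p.Labels[k]
		if !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	pods := []Pod{
		{Name: "busybox-0", Namespace: "busybox", Labels: map[string]string{"app": "busybox"}},
		{Name: "busybox-1", Namespace: "busybox", Labels: map[string]string{"app": "busybox"}},
		{Name: "web-0", Namespace: "default", Labels: map[string]string{"app": "web"}},
	}
	sel := SelectorSpec{Namespaces: []string{"busybox"}}
	for _, p := range SelectPods(pods, sel) {
		fmt.Println(p.Namespace + "/" + p.Name)
	}
}
```

With the sample data in `main`, only the two Pods in the `busybox` namespace are selected. The real selector supports more criteria and queries the Kubernetes API instead of an in-memory slice.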

Implement the gRPC interface

In order for chaos-daemon to accept requests from chaos-controller-manager, a new gRPC interface is required for chaos-controller-manager and chaos-daemon. Take the steps below to add the gRPC interface:

  1. Add the RPC in chaosdaemon.proto.

    ```
    service chaosDaemon {
        ...

        rpc ExecHelloWorldChaos(ExecHelloWorldRequest) returns (google.protobuf.Empty) {}
    }

    message ExecHelloWorldRequest {
        string container_id = 1;
    }
    ```

    You then need to update the Go code generated from this proto file:

    ```
    make proto
    ```

  2. Implement the gRPC service in chaos-daemon.

    Add a new file named helloworld_server.go under chaosdaemon, with the content as below:

    ```
    package chaosdaemon

    import (
        "context"
        "fmt"

        "github.com/golang/protobuf/ptypes/empty"

        "github.com/chaos-mesh/chaos-mesh/pkg/bpm"
        pb "github.com/chaos-mesh/chaos-mesh/pkg/chaosdaemon/pb"
    )

    func (s *daemonServer) ExecHelloWorldChaos(ctx context.Context, req *pb.ExecHelloWorldRequest) (*empty.Empty, error) {
        log.Info("ExecHelloWorldChaos", "request", req)

        pid, err := s.crClient.GetPidFromContainerID(ctx, req.ContainerId)
        if err != nil {
            return nil, err
        }

        cmd := bpm.DefaultProcessBuilder("sh", "-c", fmt.Sprintf("echo 'hello' `hostname`")).
            SetNS(pid, bpm.UtsNS).
            SetContext(ctx).
            Build()
        out, err := cmd.Output()
        if err != nil {
            return nil, err
        }
        if len(out) != 0 {
            log.Info("cmd output", "output", string(out))
        }

        return &empty.Empty{}, nil
    }
    ```

    After `chaos-daemon` receives the `ExecHelloWorldChaos` request, it runs the command in the UTS namespace of the target container and logs `hello` followed by that container's hostname.

  3. Send gRPC requests in reconcile.

    When a CRD object changes (for example, when it is created or deleted), we need to compare the state specified in the object against the actual state, and then perform operations so that the actual cluster state reflects the specified state. This process is called reconcile.

    For HelloWorldChaos, chaos-controller-manager needs to send a chaos request to chaos-daemon in reconcile. To do this, update the file controllers/helloworldchaos/types.go created in Develop a New Chaos with the content below:

    ```
    package helloworldchaos

    import (
        "context"
        "errors"
        "fmt"

        "k8s.io/apimachinery/pkg/runtime"
        ctrl "sigs.k8s.io/controller-runtime"

        "github.com/chaos-mesh/chaos-mesh/api/v1alpha1"
        "github.com/chaos-mesh/chaos-mesh/controllers/common"
        "github.com/chaos-mesh/chaos-mesh/controllers/config"
        pb "github.com/chaos-mesh/chaos-mesh/pkg/chaosdaemon/pb"
        "github.com/chaos-mesh/chaos-mesh/pkg/router"
        ctx "github.com/chaos-mesh/chaos-mesh/pkg/router/context"
        end "github.com/chaos-mesh/chaos-mesh/pkg/router/endpoint"
        "github.com/chaos-mesh/chaos-mesh/pkg/selector"
        "github.com/chaos-mesh/chaos-mesh/pkg/utils"
    )

    type endpoint struct {
        ctx.Context
    }

    // Apply applies helloworld chaos
    func (r *endpoint) Apply(ctx context.Context, req ctrl.Request, chaos v1alpha1.InnerObject) error {
        r.Log.Info("Apply helloworld chaos")
        helloworldchaos, ok := chaos.(*v1alpha1.HelloWorldChaos)
        if !ok {
            return errors.New("chaos is not helloworldchaos")
        }

        // Select the target Pods according to spec.selector.
        pods, err := selector.SelectAndFilterPods(ctx, r.Client, r.Reader, &helloworldchaos.Spec, config.ControllerCfg.ClusterScoped, config.ControllerCfg.TargetNamespace, config.ControllerCfg.AllowedNamespaces, config.ControllerCfg.IgnoredNamespaces)
        if err != nil {
            r.Log.Error(err, "failed to select and filter pods")
            return err
        }

        // Send one request per Pod to the chaos-daemon serving that Pod.
        for _, pod := range pods {
            daemonClient, err := utils.NewChaosDaemonClient(ctx, r.Client, &pod, common.ControllerCfg.ChaosDaemonPort)
            if err != nil {
                r.Log.Error(err, "get chaos daemon client")
                return err
            }
            // Deferred calls run when Apply returns, not at the end of each iteration.
            defer daemonClient.Close()

            if len(pod.Status.ContainerStatuses) == 0 {
                return fmt.Errorf("%s %s can't get the state of container", pod.Namespace, pod.Name)
            }

            // Only the first container of the Pod is targeted.
            containerID := pod.Status.ContainerStatuses[0].ContainerID

            _, err = daemonClient.ExecHelloWorldChaos(ctx, &pb.ExecHelloWorldRequest{
                ContainerId: containerID,
            })
            if err != nil {
                return err
            }
        }

        return nil
    }

    // Recover means the reconciler recovers the chaos action
    func (r *endpoint) Recover(ctx context.Context, req ctrl.Request, chaos v1alpha1.InnerObject) error {
        return nil
    }

    // Object would return the instance of chaos
    func (r *endpoint) Object() v1alpha1.InnerObject {
        return &v1alpha1.HelloWorldChaos{}
    }

    func init() {
        router.Register("helloworldchaos", &v1alpha1.HelloWorldChaos{}, func(obj runtime.Object) bool {
            return true
        }, func(ctx ctx.Context) end.Endpoint {
            return &endpoint{
                Context: ctx,
            }
        })
    }
    ```

    > **Note:**
    >
    > In this case, the `Recover` function does nothing because `HelloWorldChaos` only prints a log message and doesn't change anything. You may need to implement the `Recover` function in your own development.
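
The reconcile idea described above (compare the specified state with the actual state, then act on the difference) can be sketched in a few lines. This is a generic illustration under stated assumptions, not the Chaos Mesh reconciler: the `Action` type and the `Reconcile` function are hypothetical names, and Pods are represented by plain strings.

```go
package main

import (
	"fmt"
	"sort"
)

// Action is a hypothetical operation a controller would perform on a Pod.
type Action struct {
	Op  string // "apply" or "recover"
	Pod string
}

// Reconcile compares the Pods that should have chaos applied (desired) with
// the Pods that currently have it (actual) and returns the actions needed to
// make the actual state match the desired state.
func Reconcile(desired, actual []string) []Action {
	want := make(map[string]bool, len(desired))
	for _, p := range desired {
		want[p] = true
	}
	have := make(map[string]bool, len(actual))
	for _, p := range actual {
		have[p] = true
	}

	var actions []Action
	for p := range want {
		if !have[p] {
			actions = append(actions, Action{Op: "apply", Pod: p})
		}
	}
	for p := range have {
		if !want[p] {
			actions = append(actions, Action{Op: "recover", Pod: p})
		}
	}

	// Sort for a deterministic result, since map iteration order is random.
	sort.Slice(actions, func(i, j int) bool {
		if actions[i].Op != actions[j].Op {
			return actions[i].Op < actions[j].Op
		}
		return actions[i].Pod < actions[j].Pod
	})
	return actions
}

func main() {
	desired := []string{"busybox-0", "busybox-1"}
	actual := []string{"busybox-1", "busybox-2"}
	for _, a := range Reconcile(desired, actual) {
		fmt.Printf("%s %s\n", a.Op, a.Pod)
	}
}
```

In this sketch, `busybox-0` needs the chaos applied and `busybox-2` needs to be recovered, while `busybox-1` already matches and is left alone. In the guide's endpoint, the `Apply` and `Recover` methods play the role of these two operations.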

Verify your chaos

Now you are all set. It’s time to verify the chaos type you just created. Take the steps below:

  1. Build the Docker image. For details, refer to Make the Docker image.

  2. Upgrade Chaos Mesh. Since we have already installed Chaos Mesh in Develop a New Chaos, we only need to restart it with the latest image:

    ```
    kubectl rollout restart deployment chaos-controller-manager -n chaos-testing
    kubectl rollout restart daemonset chaos-daemon -n chaos-testing
    ```

  3. Deploy the Pods for testing:

    ```
    kubectl apply -f https://raw.githubusercontent.com/chaos-mesh/apps/master/ping/busybox-statefulset.yaml
    ```

    This command deploys two Pods in the `busybox` namespace.

  4. Create the chaos YAML file:

    ```
    apiVersion: chaos-mesh.org/v1alpha1
    kind: HelloWorldChaos
    metadata:
      name: busybox-helloworld-chaos
    spec:
      selector:
        namespaces:
          - busybox
    ```

  5. Apply the chaos:

    ```
    kubectl apply -f /path/to/helloworld.yaml
    ```

  6. Verify your chaos. Check the following logs to confirm that the chaos works as expected:

    • Check the log of chaos-controller-manager:

      ```
      kubectl logs chaos-controller-manager-{pod-post-fix} -n chaos-testing
      ```

      The log is as follows:

      ```
      2020-09-09T09:13:36.018Z INFO controllers.HelloWorldChaos Reconciling helloworld chaos {"reconciler": "helloworldchaos"}
      2020-09-09T09:13:36.018Z INFO controllers.HelloWorldChaos Apply helloworld chaos {"reconciler": "helloworldchaos"}
      ```

    • Check the log of chaos-daemon:

      ```
      kubectl logs chaos-daemon-{pod-post-fix} -n chaos-testing
      ```

      The log is as follows:

      ```
      2020-09-09T09:13:36.036Z INFO chaos-daemon-server exec hello world chaos {"request": "container_id:\"docker://8f2918ee05ed587f7074a923cede3bbe5886277faca95d989e513f7b7e831da5\" "}
      2020-09-09T09:13:36.044Z INFO chaos-daemon-server build command {"command": "nsenter -u/proc/45664/ns/uts -- sh -c echo 'hello' `hostname`"}
      2020-09-09T09:13:36.058Z INFO chaos-daemon-server cmd output {"output": "hello busybox-1\n"}
      2020-09-09T09:13:36.064Z INFO chaos-daemon-server exec hello world chaos {"request": "container_id:\"docker://53e982ba5593fa87648edba665ba0f7da3f58df67f8b70a1354ca00447c00524\" "}
      2020-09-09T09:13:36.066Z INFO chaos-daemon-server build command {"command": "nsenter -u/proc/45620/ns/uts -- sh -c echo 'hello' `hostname`"}
      2020-09-09T09:13:36.070Z INFO chaos-daemon-server cmd output {"output": "hello busybox-0\n"}
      ```

      The log shows that `chaos-daemon` ran the command for both Pods and printed `hello` followed by each Pod's hostname.
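
If you want to check the chaos-daemon log programmatically instead of reading it by eye, the following is a minimal sketch. It assumes that each log line ends with a JSON payload, as in the sample output above, and that the command output is stored under the `output` key. `CmdOutput` is a hypothetical helper written for this guide, not part of Chaos Mesh.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// CmdOutput extracts the "output" field from a chaos-daemon log line.
// It assumes the line ends with a JSON object, as in the sample log above.
// The second return value is false if the line has no such field.
func CmdOutput(line string) (string, bool) {
	start := strings.Index(line, "{")
	if start < 0 {
		return "", false
	}
	var fields map[string]interface{}
	if err := json.Unmarshal([]byte(line[start:]), &fields); err != nil {
		return "", false
	}
	out, ok := fields["output"].(string)
	if !ok {
		return "", false
	}
	return strings.TrimSpace(out), true
}

func main() {
	// Sample lines copied from the chaos-daemon log shown above.
	lines := []string{
		`2020-09-09T09:13:36.058Z INFO chaos-daemon-server cmd output {"output": "hello busybox-1\n"}`,
		`2020-09-09T09:13:36.070Z INFO chaos-daemon-server cmd output {"output": "hello busybox-0\n"}`,
	}
	for _, line := range lines {
		if out, ok := CmdOutput(line); ok {
			fmt.Println(out)
		}
	}
}
```

Lines without an `output` field, such as the request and build command entries, are skipped, so only the greeting for each Pod remains.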