4. Stopping and restarting HAProxy

HAProxy supports a graceful and a hard stop. The hard stop is simple : when the
SIGTERM signal is sent to the haproxy process, it immediately quits and all
established connections are closed. The graceful stop is triggered when the
SIGUSR1 signal is sent to the haproxy process. It consists in only unbinding
from the listening ports, but continuing to process existing connections until
they close. Once the last connection is closed, the process leaves.
 
The hard stop method is used for the "stop" or "restart" actions of the service
management script. The graceful stop is used for the "reload" action which
tries to seamlessly reload a new configuration in a new process.
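With a SysV-style init script, this typically translates to the following
commands (the script path and supported actions vary between distributions) :

    /etc/init.d/haproxy stop      # hard stop (SIGTERM)
    /etc/init.d/haproxy restart   # hard stop, then start with the new config
    /etc/init.d/haproxy reload    # graceful reload of the configuration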
 
Both of these signals may be sent by the new haproxy process itself during a
reload or restart, so that they are sent at the latest possible moment and only
if absolutely required. This is what is performed by the "-st" (hard) and "-sf"
(graceful) options respectively.
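As a sketch, assuming the configuration and pid file paths below (they are only
placeholders), the new process may be started this way :

    # Graceful reload : once ready, the new process sends SIGUSR1 to old PIDs
    haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid \
            -sf $(cat /var/run/haproxy.pid)

    # Hard restart : same principle, but SIGTERM is sent to the old PIDs
    haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid \
            -st $(cat /var/run/haproxy.pid)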
 
In master-worker mode, there is no need to start a new haproxy process in
order to reload the configuration. The master process reacts to the SIGUSR2
signal by reexecuting itself with the -sf parameter followed by the PIDs of
the workers. The master will then parse the configuration file and fork new
workers.
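For example, assuming the master was started with "-W -p
/var/run/haproxy-master.pid" and that this pid file therefore contains the
master's PID (the file name is only an illustration), a reload boils down to :

    # Ask the master to re-execute itself and replace its workers
    kill -USR2 $(cat /var/run/haproxy-master.pid)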
 
To better understand how these signals are used, it is important to understand
the whole restart mechanism.
 
First, an existing haproxy process is running. The administrator uses a
system-specific command such as "/etc/init.d/haproxy reload" to indicate that
the new configuration file should be taken into effect. What happens then is
the following. The service script (/etc/init.d/haproxy or equivalent) first
verifies that the configuration file parses correctly using "haproxy -c". Then
it tries to start haproxy with this configuration file, using "-st" or "-sf".
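In shell terms, this roughly corresponds to the sequence below; the paths are
placeholders and real init scripts perform more error handling :

    # Reload only if the new configuration parses correctly
    haproxy -c -f /etc/haproxy/haproxy.cfg && \
        haproxy -D -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid \
                -sf $(cat /var/run/haproxy.pid)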
 
Then HAProxy tries to bind to all listening ports. If some fatal errors happen
(eg: address not present on the system, permission denied), the process quits
with an error. If a socket binding fails because a port is already in use, then
the process will first send a SIGTTOU signal to all the pids specified in the
"-st" or "-sf" pid list. This is what is called the "pause" signal. It instructs
all existing haproxy processes to temporarily stop listening to their ports so
that the new process can try to bind again. During this time, the old process
continues to process existing connections. If the binding still fails (because
for example a port is shared with another daemon), then the new process sends a
SIGTTIN signal to the old processes to instruct them to resume operations just
as if nothing had happened. The old processes will then restart listening to
the ports and continue to accept connections. Note that this mechanism is
system-dependent and some operating systems may not support it in multi-process
mode.
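This pause/resume cycle may also be reproduced by hand for testing, again
assuming a pid file at /var/run/haproxy.pid :

    # "pause" : ask the running process(es) to release their listening ports
    kill -TTOU $(cat /var/run/haproxy.pid)

    # "resume" : ask them to bind and listen again, as if nothing had happened
    kill -TTIN $(cat /var/run/haproxy.pid)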
 
If the new process manages to bind correctly to all ports, then it sends either
the SIGTERM (hard stop in case of "-st") or the SIGUSR1 (graceful stop in case
of "-sf") to all old processes to notify them that it is now in charge of
operations and that the old processes will have to leave, either immediately or
once they have finished their job.
 
It is important to note that during this timeframe, there are two small windows
of a few milliseconds each where it is possible that a few connection failures
will be noticed during high loads. Typically observed failure rates are around
1 failure during a reload operation every 10000 new connections per second,
which means that a heavily loaded site running at 30000 new connections per
second may see about 3 failed connections upon every reload. The two situations
where this happens are :
 
  - if the new process fails to bind due to the presence of the old process,
    it will first have to go through the SIGTTOU+SIGTTIN sequence, which
    typically lasts about one millisecond for a few tens of frontends, and
    during which some ports will not be bound to the old process and not yet
    bound to the new one. HAProxy works around this on systems that support
    the SO_REUSEPORT socket option, as it allows the new process to bind
    without first asking the old one to unbind. Most BSD systems have
    supported this almost forever. Linux supported it in version 2.0 and
    dropped it around 2.2, though some patches were floating around by then.
    It was reintroduced in kernel 3.9, so if you are observing a connection
    failure rate above the one mentioned above, please ensure that your
    kernel is 3.9 or newer, or that the relevant patches were backported to
    your kernel (less likely).
 
  - when the old processes close the listening ports, the kernel may not
    always redistribute any pending connection that was remaining in the
    socket's backlog. Under high loads, a SYN packet may arrive just before
    the socket is closed, and will lead to an RST packet being sent to the
    client. In some critical environments where even one drop is not
    acceptable, these are sometimes dealt with using firewall rules to block
    SYN packets during the reload, forcing the client to retransmit (see the
    example after this list). This is totally system-dependent, as some
    systems might be able to visit other listening queues and avoid this RST.
    A second case concerns the ACK from the client on a local socket that was
    in SYN_RECV state just before the close. This ACK will lead to an RST
    packet while the haproxy process is still not aware of it. This one is
    harder to get rid of, though the firewall filtering rules mentioned above
    will work well if applied one second or so before restarting the process.
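As a purely illustrative sketch of the firewall approach mentioned in the
second case above, assuming iptables and a single frontend bound to port 80
(both the tool and the port are assumptions, not requirements), the reload
could be wrapped this way :

    # Silently drop new connection attempts so that clients retransmit the SYN
    iptables -I INPUT -p tcp --dport 80 --syn -j DROP
    sleep 1
    /etc/init.d/haproxy reload
    iptables -D INPUT -p tcp --dport 80 --syn -j DROP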
 
For the vast majority of users, such drops will never ever happen since they
don't have enough load to trigger the race conditions. And for most high
traffic users, the failure rate remains well within the noise margin provided
that at least SO_REUSEPORT is properly supported on their systems.