2. Quick reminder about HAProxy’s architecture

HAProxy is a multi-threaded, event-driven, non-blocking daemon. This means it
uses event multiplexing to schedule all of its activities instead of relying on
the system to schedule between multiple activities. Most of the time it runs as
a single process, so the output of "ps aux" on a system will report only one
"haproxy" process, unless a soft reload is in progress and an older process is
finishing its job in parallel to the new one. It is thus always easy to trace
its activity using the strace utility. In order to scale with the number of
available processors, by default haproxy will start one worker thread per
processor it is allowed to run on. Unless explicitly configured differently,
the incoming traffic is spread over all these threads, all running the same
event loop. Great care is taken to limit inter-thread dependencies to the
strict minimum, so as to achieve near-linear scalability. One consequence is
that a given connection is served by a single thread. Thus, in order to use all
available processing capacity, there must be at least as many connections as
there are threads, which is almost always the case.

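The event multiplexing described above can be illustrated with a minimal
sketch. It is not HAProxy's code: it uses Python's standard "selectors" module,
and the names run_event_loop() and demo() are hypothetical. A single thread
waits at one blocking point for all of its sockets and dispatches whichever
ones are ready.

```python
import selectors
import socket


def run_event_loop(sel, max_events, timeout=1.0):
    """Dispatch readiness events from a single thread.

    Stops after max_events callbacks, or as soon as one wait times out
    with nothing to do.
    """
    handled = 0
    while handled < max_events:
        # Single blocking point for every registered socket.
        events = sel.select(timeout=timeout)
        if not events:
            break
        for key, _mask in events:
            callback = key.data  # the handler stored at registration time
            callback(key.fileobj)
            handled += 1
    return handled


def demo():
    """Multiplex three local socket pairs on one thread."""
    sel = selectors.DefaultSelector()
    received = []
    pairs = [socket.socketpair() for _ in range(3)]

    def on_readable(sock):
        received.append(sock.recv(1024))

    for reader, _writer in pairs:
        reader.setblocking(False)
        sel.register(reader, selectors.EVENT_READ, on_readable)

    # Make all three "connections" active at once.
    for i, (_reader, writer) in enumerate(pairs):
        writer.send(b"conn-%d" % i)

    run_event_loop(sel, max_events=len(pairs))

    for reader, writer in pairs:
        sel.unregister(reader)
        reader.close()
        writer.close()
    sel.close()
    return sorted(received)
```

Calling demo() services all three sockets from the same loop. In a
multi-threaded setup, each worker thread would run its own copy of such a loop
over its own set of connections.
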
HAProxy is designed to isolate itself into a chroot jail during startup, where
it cannot perform any file-system access at all. This is also true for the
libraries it depends on (e.g. libc, libssl, etc). The immediate effect is that
a running process will not be able to reload a configuration file to apply
changes; instead, a new process will be started using the updated configuration
file. Some other less obvious effects are that some timezone files or resolver
files the libc might attempt to access at run time will not be found, though
this should generally not happen as they're not needed after startup. A nice
consequence of this principle is that the HAProxy process is totally stateless,
and no cleanup is needed after it's killed, so any killing method that works
will do the right thing.

HAProxy doesn't write log files; instead, it relies on the standard syslog
protocol to send logs to a remote server (which is often located on the same
system).

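To make the syslog mechanism concrete, here is a minimal sketch of what sending
one log line over UDP looks like. It is an illustration only, not HAProxy's
implementation: the message layout is simplified (the timestamp and hostname
fields are omitted), and format_syslog(), send_syslog() and demo() are
hypothetical names. The priority value follows the usual syslog rule,
facility * 8 + severity.

```python
import socket

FACILITY_LOCAL0 = 16  # standard syslog facility code for "local0"
SEVERITY_INFO = 6     # standard syslog severity code for "info"


def format_syslog(message, tag="haproxy",
                  facility=FACILITY_LOCAL0, severity=SEVERITY_INFO):
    """Build a simplified syslog datagram: <PRI>tag: message."""
    pri = facility * 8 + severity
    return ("<%d>%s: %s" % (pri, tag, message)).encode("utf-8")


def send_syslog(address, message):
    """Send one log line to a syslog server as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_syslog(message), address)


def demo():
    """Send a log line to a throwaway local receiver and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the system pick one
        receiver.settimeout(2.0)
        send_syslog(receiver.getsockname(), "backend up")
        data, _sender = receiver.recvfrom(4096)
    return data
```

No file is opened at any point, which is why this approach keeps working from
inside a chroot jail.
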
HAProxy enforces timeouts using an internal clock that is derived from the
system's time but in which unexpected drift is corrected. This is done by
limiting the time spent waiting in poll() for an event, and measuring the time
it really took. In practice it never waits more than one second. This explains
why, when running strace over a completely idle process, periodic calls to
poll() (or any of its variants) surrounded by two gettimeofday() calls are
noticed. They are normal, completely harmless and so cheap that the load they
imply is totally undetectable at the system scale. Example:

  16:35:40.002320 gettimeofday({1442759740, 2605}, NULL) = 0
  16:35:40.002942 epoll_wait(0, {}, 200, 1000) = 0
  16:35:41.007542 gettimeofday({1442759741, 7641}, NULL) = 0
  16:35:41.007998 gettimeofday({1442759741, 8114}, NULL) = 0
  16:35:41.008391 epoll_wait(0, {}, 200, 1000) = 0
  16:35:42.011313 gettimeofday({1442759742, 11411}, NULL) = 0

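The drift correction can be sketched as follows. This is a simplified model
under stated assumptions, not HAProxy's actual algorithm: the class name
DriftCorrectedClock, its "tolerance" parameter and the choice to advance by
exactly the requested wait when a jump is detected are all hypothetical. The
idea is that, because the wait is bounded, any elapsed time that is negative
or far larger than the requested wait must come from a system clock jump
rather than from real waiting.

```python
class DriftCorrectedClock:
    """Internal clock derived from system time, with jumps absorbed."""

    def __init__(self, system_now, tolerance=1.0):
        self.now = system_now       # internal time used to enforce timeouts
        self.offset = 0.0           # correction applied to the system time
        self.tolerance = tolerance  # slack allowed beyond the requested wait

    def update(self, system_now, max_wait):
        """Refresh the internal time after a wait bounded by max_wait seconds."""
        candidate = system_now + self.offset
        elapsed = candidate - self.now
        if elapsed < 0 or elapsed > max_wait + self.tolerance:
            # The system clock went backwards or leapt forward. Assume the
            # wait lasted max_wait and remember the difference as an offset.
            self.now += max_wait
            self.offset = self.now - system_now
        else:
            self.now = candidate
        return self.now
```

For instance, starting at 100.0 with a one-second wait limit, a system time of
100.5 is accepted as is, while a system time suddenly reading 50.0 is treated
as a jump: the internal clock moves to 101.5 and keeps advancing smoothly from
there.
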
HAProxy is a TCP proxy, not a router. It deals with established connections that
have been validated by the kernel, and not with packets of any form nor with
sockets in other states (e.g. no SYN_RECV nor TIME_WAIT), though their existence
may prevent it from binding a port. It relies on the system to accept incoming
connections and to initiate outgoing connections. An immediate effect of this is
that there is no relation between packets observed on the two sides of a
forwarded connection, which may differ in size, number and even address family.
Since a connection may only be accepted from a socket in LISTEN state, all the
sockets it is listening on are necessarily visible using the "netstat" utility
to show listening sockets. Example:

  # netstat -ltnp
  Active Internet connections (only servers)
  Proto Recv-Q Send-Q Local Address     Foreign Address   State    PID/Program name
  tcp        0      0 0.0.0.0:22        0.0.0.0:*         LISTEN   1629/sshd
  tcp        0      0 0.0.0.0:80        0.0.0.0:*         LISTEN   2847/haproxy
  tcp        0      0 0.0.0.0:443       0.0.0.0:*         LISTEN   2847/haproxy
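
The proxy-versus-router distinction can be illustrated with a minimal relay
sketch. It is deliberately simplistic (one connection, no half-close handling,
no timeouts beyond an idle limit) and is not how HAProxy is implemented;
relay_one_connection() and demo() are hypothetical names. The relay only ever
sees byte streams on two independent connections: accept() returns a
connection the kernel has already established, and the outgoing connection is
a separate one the kernel initiates.

```python
import select
import socket
import threading


def relay_one_connection(listener, backend_addr):
    """Accept one established connection and copy bytes in both directions."""
    client, _peer = listener.accept()                 # handshake already done
    backend = socket.create_connection(backend_addr)  # separate connection
    peers = {client: backend, backend: client}
    try:
        while True:
            readable, _, _ = select.select(list(peers), [], [], 5.0)
            if not readable:
                return  # idle for too long
            for sock in readable:
                data = sock.recv(4096)
                if not data:
                    return  # one side closed: stop relaying
                peers[sock].sendall(data)
    finally:
        client.close()
        backend.close()


def demo():
    """Send b"hello" through the relay to a backend that upper-cases it."""
    backend_listener = socket.create_server(("127.0.0.1", 0))
    proxy_listener = socket.create_server(("127.0.0.1", 0))

    def upper_backend():
        conn, _peer = backend_listener.accept()
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:
                    break
                conn.sendall(data.upper())

    threads = [
        threading.Thread(target=upper_backend, daemon=True),
        threading.Thread(target=relay_one_connection, daemon=True,
                         args=(proxy_listener, backend_listener.getsockname())),
    ]
    for thread in threads:
        thread.start()

    reply = b""
    with socket.create_connection(proxy_listener.getsockname()) as client:
        client.sendall(b"hello")
        while len(reply) < 5:
            chunk = client.recv(4096)
            if not chunk:
                break
            reply += chunk

    for thread in threads:
        thread.join(timeout=5)
    backend_listener.close()
    proxy_listener.close()
    return reply
```

While demo() runs, both listening sockets are in LISTEN state and would be
reported by a tool such as netstat, like the haproxy entries in the example
above.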