3.1. What HAProxy is and isn’t

HAProxy is :

- a TCP proxy : it can accept a TCP connection from a listening socket,
connect to a server and attach these sockets together, allowing traffic to
flow in both directions.

- an HTTP reverse-proxy (called a "gateway" in HTTP terminology) : it presents
itself as a server, receives HTTP requests over connections accepted on a
listening TCP socket, and passes the requests from these connections to
servers using different connections.

- an SSL terminator / initiator / offloader : SSL/TLS may be used on the
connection coming from the client, on the connection going to the server,
or even on both connections.

- a TCP normalizer : since connections are terminated locally by the operating
system, the two sides are independent of each other, so abnormal traffic such
as invalid packets, flag combinations, window advertisements, sequence numbers
or incomplete connections (SYN floods) is not passed to the other side. This
protects fragile TCP stacks from protocol attacks, and also makes it possible
to tune the connection parameters with the client without having to modify
the servers' TCP stack settings.

- an HTTP normalizer : when configured to process HTTP traffic, only valid
complete requests are passed. This protects against many protocol-based
attacks. Additionally, protocol deviations that the specification tolerates
are fixed so that they do not cause problems on the servers (e.g. multi-line
headers).

- an HTTP fixing tool : it can modify / fix / add / remove / rewrite the URL
or any request or response header. This helps fix interoperability issues
in complex environments.

- a content-based switch : it can consider any element from the request to
decide which server to pass the request or connection to. Thus it is possible
to handle multiple protocols over the same port (e.g. HTTP, HTTPS, SSH).

- a server load balancer : it can load balance TCP connections and HTTP
requests. In TCP mode, load balancing decisions are taken for the whole
connection. In HTTP mode, decisions are taken per request.

- a traffic regulator : it can apply rate limiting at various points,
protect the servers against overload, adjust traffic priorities based on
the contents, and even pass such information to lower layers and outer
network components by marking packets.

- a protection against DDoS and service abuse : it can maintain a large number
of statistics per IP address, URL, cookie, etc., detect when abuse is
happening, and then take action (slow down the offenders, block them, send
them to outdated content, etc.).

- an observation point for network troubleshooting : due to the precision of
the information reported in logs, it is often used to narrow down
network-related issues.

- an HTTP compression offloader : it can compress responses that were not
compressed by the server, thus reducing the page load time for clients with
poor connectivity or on high-latency mobile networks.

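As an illustration only, several of the roles listed above might be combined in
a single configuration along the following lines. This is a minimal sketch and
not a recommended setup : every proxy name, address, port, certificate path and
threshold is an assumption made for the example, and the SSL lines assume a
build with SSL/TLS support.

```
# Hypothetical names, addresses, paths and limits throughout.
global
    log /dev/log local0

defaults
    mode http
    log global
    timeout connect 5s
    timeout client  30s
    timeout server  30s

# TCP proxy and load balancer: the balancing decision is taken once
# for the whole connection.
listen db_relay
    mode tcp
    option tcplog
    bind :3306
    balance leastconn
    server db1 192.0.2.21:3306 check
    server db2 192.0.2.22:3306 check

# HTTP reverse-proxy with SSL termination on the client side.
frontend www
    option httplog                  # detailed logs, useful for troubleshooting
    bind :80
    bind :443 ssl crt /etc/haproxy/certs/site.pem

    # HTTP fixing: add a request header, remove a response header.
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    http-response del-header Server

    # Abuse protection: track the request rate per source address and
    # reject clients exceeding an arbitrary limit.
    stick-table type ip size 100k expire 10m store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }

    # Content-based switching on the request path.
    use_backend api if { path_beg /api/ }
    default_backend static

backend api
    balance roundrobin              # in HTTP mode, decided per request
    # SSL initiation towards the servers; maxconn limits the load on each.
    server api1 192.0.2.31:8443 ssl verify required ca-file /etc/haproxy/ca.pem maxconn 100 check
    server api2 192.0.2.32:8443 ssl verify required ca-file /etc/haproxy/ca.pem maxconn 100 check

backend static
    # Compression offloading for responses the server left uncompressed.
    compression algo gzip
    compression type text/html text/plain text/css application/javascript
    server web1 192.0.2.41:8080 check
```

Such a file can be checked for syntax errors before use by running HAProxy
with the "-c" option together with "-f <file>".
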
HAProxy is not :

- an explicit HTTP proxy, i.e. the proxy that browsers use to reach the
internet. There is excellent open-source software dedicated to this task,
such as Squid. However, HAProxy can be installed in front of such a proxy to
provide load balancing and high availability.

- a caching proxy : it will return the contents received from the server as-is
and will not interfere with any caching policy. There is excellent
open-source software for this task, such as Varnish. HAProxy can be installed
in front of such a cache to provide SSL offloading and scalability through
smart load balancing.

- a data scrubber : it will modify neither request bodies nor response bodies.

- a web server : during startup, it isolates itself inside a chroot jail and
drops its privileges, so that it will not perform any file-system access
once started. As such it cannot be turned into a web server. There is
excellent open-source software for this, such as Apache or Nginx, and
HAProxy can be installed in front of them to provide load balancing and
high availability.

- a packet-based load balancer : it will see neither IP packets nor UDP
datagrams, and will not perform NAT, let alone DSR. These are tasks for lower
layers. Some kernel-based components such as IPVS (Linux Virtual Server)
already do this well and complement HAProxy perfectly.
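
The deployment in front of a cache mentioned above could be sketched as
follows. Treating "smart load balancing" as hashing on the URI, so that a given
object is always requested from the same cache, is an assumption made for this
example, as are all names, addresses and ports.

```
# Hypothetical names and addresses; a sketch, not a reference setup.
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend cache_front
    bind :80
    default_backend caches

backend caches
    # Hash the URI so that each object maps to one cache; consistent
    # hashing limits redistribution when a cache is added or removed.
    balance uri
    hash-type consistent
    server cache1 192.0.2.51:6081 check
    server cache2 192.0.2.52:6081 check
```

The "check" keyword enables health checks, so a failed cache is taken out of
the farm, which is what provides the high availability.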