3.3. Basic features

This section will enumerate a number of features that HAProxy implements, some
of which are generally expected from any modern load balancer, and some of
which are a direct benefit of HAProxy's architecture. More advanced features
will be detailed in the next section.

3.3.1. Basic features : Proxying

Proxying is the action of transferring data between a client and a server over
two independent connections. The following basic features are supported by
HAProxy regarding proxying and connection management :

  - Provide servers with a clean connection to protect them against any
    client-side defect or attack;

  - Listen to multiple IP addresses and/or ports, even port ranges;

  - Transparent accept : intercept traffic targeting any arbitrary IP address
    that doesn't even belong to the local system;

  - Server port doesn't need to be related to listening port, and may even be
    translated by a fixed offset (useful with ranges);

  - Transparent connect : spoof the client's (or any) IP address if needed
    when connecting to the server;

  - Provide a reliable return IP address to the servers in multi-site LBs;

  - Offload the servers thanks to buffers and possibly short-lived connections
    to reduce their concurrent connection count and their memory footprint;

  - Optimize TCP stacks (e.g. SACK), congestion control, and reduce RTT impacts;

  - Support different protocol families on both sides (e.g. IPv4/IPv6/Unix);

  - Timeout enforcement : HAProxy supports multiple levels of timeouts depending
    on the stage the connection is in, so that a dead client or server, or an
    attacker cannot be granted resources for too long;

  - Protocol validation : HTTP, SSL, or payload are inspected and invalid
    protocol elements are rejected, unless instructed to accept them anyway;

  - Policy enforcement : ensure that only what is allowed may be forwarded;

  - Both incoming and outgoing connections may be limited to certain network
    namespaces (Linux only), making it easy to build a cross-container,
    multi-tenant load balancer;

  - PROXY protocol presents the client's IP address to the server even for
    non-HTTP traffic. This is an HAProxy extension that has since been adopted
    by a number of third-party products, including at least the following at
    the time of writing :
      - client : haproxy, stud, stunnel, exaproxy, ELB, squid
      - server : haproxy, stud, postfix, exim, nginx, squid, node.js, varnish
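
As an illustration, a few of the features above could be combined as in the
following minimal sketch. All addresses, ports and names are placeholders
chosen for this example, and the server is assumed to accept the PROXY
protocol.

```
defaults
    mode tcp
    # separate timeouts for the different stages of a connection
    timeout connect 5s
    timeout client  1m
    timeout server  1m

listen ranged_service
    # listen on two addresses, each with a range of ten ports
    bind 192.0.2.10:8000-8009
    bind 192.0.2.11:8000-8009
    # no fixed server port : "+1000" adds 1000 to the port the client
    # connected to; "send-proxy" passes the client's address to the server
    # using the PROXY protocol
    server app1 10.0.0.11:+1000 send-proxy
```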

3.3.2. Basic features : SSL

HAProxy's SSL stack is recognized as one of the most featureful according to
Google's engineers (http://istlsfastyet.com/). The most commonly used features
making it quite complete are :

  - SNI-based multi-hosting with no limit on sites count and focus on
    performance. At least one deployment is known for running 50000 domains
    with their respective certificates;

  - support for wildcard certificates reduces the need for many certificates;

  - certificate-based client authentication with configurable policies on
    failure to present a valid certificate. This makes it possible, for
    example, to present a different server farm to regenerate the client
    certificate;

  - authentication of the backend server ensures the backend server is the real
    one and not a man in the middle;

  - authentication with the backend server lets the backend server know it's
    really the expected haproxy node that is connecting to it;

  - TLS NPN and ALPN extensions make it possible to reliably offload SPDY/HTTP2
    connections and pass them in clear text to backend servers;

  - OCSP stapling further reduces first page load time by delivering inline an
    OCSP response when the client sends a Certificate Status Request;

  - Dynamic record sizing provides both high performance and low latency, and
    significantly reduces page load time by letting the browser start to fetch
    new objects while packets are still in flight;

  - permanent access to all relevant SSL/TLS layer information for logging,
    access control, reporting etc. These elements can be embedded into HTTP
    headers or even passed as a PROXY protocol extension so that the offloaded
    server gets all the information it would have had if it performed the SSL
    termination itself;

  - Detect, log and block certain known attacks even on vulnerable SSL libs,
    such as the Heartbleed attack affecting certain versions of OpenSSL;

  - support for stateless session resumption (RFC 5077 TLS Ticket extension).
    TLS tickets can be updated from the CLI, which provides the means to
    implement Perfect Forward Secrecy by frequently rotating the tickets.
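
A possible configuration sketch for some of these features is shown below. The
file paths, addresses and the "be_enroll" farm are hypothetical, and the "h2"
ALPN token assumes a version with HTTP/2 support.

```
frontend fe_tls
    mode http
    # one certificate per file in the directory, selected through SNI;
    # client certificates are requested but not mandatory
    bind :443 ssl crt /etc/haproxy/certs/ alpn h2,http/1.1 ca-file /etc/haproxy/clients-ca.pem verify optional
    # expose TLS layer information to the offloaded server
    http-request set-header X-SSL-Protocol %[ssl_fc_protocol]
    http-request set-header X-SSL-Cipher   %[ssl_fc_cipher]
    # clients without a certificate are sent to a different farm
    use_backend be_enroll unless { ssl_c_used }
    default_backend be_app

backend be_app
    mode http
    # authenticate the server, and authenticate to it with a certificate
    server app1 10.0.0.11:8443 ssl verify required ca-file /etc/haproxy/servers-ca.pem crt /etc/haproxy/lb-client.pem

backend be_enroll
    mode http
    server enroll1 10.0.0.21:8080
```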

3.3.3. Basic features : Monitoring

HAProxy focuses a lot on availability. As such it cares about servers' state,
and about reporting its own state to other network components :

  - Servers' state is continuously monitored using per-server parameters. This
    ensures the path to the server is operational for regular traffic;

  - Health checks support separate hysteresis thresholds for up and down
    transitions in order to protect against state flapping;

  - Checks can be sent to a different address/port/protocol : this makes it
    easy to check a single service that is considered representative of multiple
    ones, for example the HTTPS port for an HTTP+HTTPS server;

  - Servers can track other servers and go down simultaneously : this ensures
    that servers hosting multiple services can fail atomically and that no one
    will be sent to a partially failed server;

  - Agents may be deployed on the server to monitor load and health : a server
    may be interested in reporting its load, operational status, administrative
    status independently from what health checks can see. By running a simple
    agent on the server, it's possible to consider the server's view of its own
    health in addition to the health checks validating the whole path;

  - Various check methods are available : TCP connect, HTTP request, SMTP hello,
    SSL hello, LDAP, SQL, Redis, send/expect scripts, all with/without SSL;

  - State change is notified in the logs and stats page with the failure reason
    (e.g. the HTTP response received at the moment the failure was detected). An
    e-mail can also be sent to a configurable address upon such a change;

  - Server state is also reported on the stats interface and can be used to take
    routing decisions so that traffic may be sent to different farms depending
    on their sizes and/or health (e.g. loss of an inter-DC link);

  - HAProxy can use health check requests to pass information to the servers,
    such as their names, weight, the number of other servers in the farm etc.
    so that servers can adjust their response and decisions based on this
    knowledge (e.g. postpone backups to keep more CPU available);

  - Servers can use health checks to report more detailed state than just on/off
    (e.g. I would like to stop, please stop sending new visitors);

  - HAProxy itself can report its state to external components such as routers
    or other load balancers, making it possible to build very complete
    multi-path and multi-layer infrastructures.
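
The check-related features above might be configured as in the following
sketch. Server names, addresses, the "/health" URI and the agent port are
assumptions made for the example.

```
backend be_web
    mode http
    option httpchk GET /health
    # 3 consecutive successes to go up, 2 consecutive failures to go down
    server web1 10.0.0.11:80 check inter 2s rise 3 fall 2
    # check the HTTPS port as representative of the HTTP service
    server web2 10.0.0.12:80 check port 443 check-ssl verify none
    # also query an agent running on the server, assumed to listen on 9999
    server web3 10.0.0.13:80 check agent-check agent-port 9999 agent-inter 5s

backend be_api
    mode http
    # no check of its own : follows the state of web1 in be_web
    server api1 10.0.0.11:8080 track be_web/web1
```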

3.3.4. Basic features : High availability

Just like any serious load balancer, HAProxy cares a lot about availability to
ensure the best global service continuity :

  - Only valid servers are used; the other ones are automatically evicted from
    load balancing farms. Under certain conditions it is still possible to
    force their use though;

  - Support for a graceful shutdown so that it is possible to take servers out
    of a farm without affecting any connection;

  - Backup servers are automatically used when active servers are down and
    replace them so that sessions are not lost when possible. This also makes
    it possible to build multiple paths to reach the same server (e.g. multiple
    interfaces);

  - Ability to return a global failed status for a farm when too many servers
    are down. This, combined with the monitoring capabilities, makes it possible
    for an upstream component to choose a different LB node for a given service;

  - Stateless design makes it easy to build clusters : by design, HAProxy does
    its best to ensure the highest service continuity without having to store
    information that could be lost in the event of a failure. This ensures that
    a takeover is as seamless as possible;

  - Integrates well with standard VRRP daemon keepalived : HAProxy easily tells
    keepalived about its state and copes very well with floating virtual IP
    addresses. Note: use only IP redundancy protocols (VRRP/CARP) rather than
    cluster-based solutions (Heartbeat, ...) as they're the ones offering the
    fastest, most seamless, and most reliable switchover.
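
A minimal sketch of backup servers and of a global failed status follows. The
names, addresses, the "/lb-status" URI and the threshold of two servers are
illustrative, and the availability of "monitor fail" depends on the version in
use.

```
frontend fe_main
    mode http
    bind :80
    # report success on this URI, or a failure when fewer than 2 servers
    # are usable, so that an upstream component can pick another LB node
    monitor-uri /lb-status
    monitor fail if { nbsrv(be_app) lt 2 }
    default_backend be_app

backend be_app
    mode http
    server app1  10.0.0.11:80 check
    server app2  10.0.0.12:80 check
    # only used once all non-backup servers are down
    server spare 10.0.0.13:80 check backup
```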

3.3.5. Basic features : Load balancing

HAProxy offers a fairly complete set of load balancing features, most of which
are unfortunately not available in a number of other load balancing products :

  - no fewer than 9 load balancing algorithms are supported, some of which apply
    to input data to offer an infinite list of possibilities. The most common
    ones are round-robin (for short connections, pick each server in turn),
    leastconn (for long connections, pick the least recently used of the servers
    with the lowest connection count), source (for SSL farms or terminal server
    farms, the server directly depends on the client's source address), URI (for
    HTTP caches, the server directly depends on the HTTP URI), hdr (the server
    directly depends on the contents of a specific HTTP header field), and first
    (for short-lived virtual machines, all connections are packed on the
    smallest possible subset of servers so that unused ones can be powered
    down);

  - all algorithms above support per-server weights so that it is possible to
    accommodate different server generations in a farm, or direct a small
    fraction of the traffic to specific servers (debug mode, running the next
    version of the software, etc);

  - dynamic weights are supported for round-robin, leastconn and consistent
    hashing; this allows server weights to be modified on the fly from the CLI
    or even by an agent running on the server;

  - slow-start is supported whenever a dynamic weight is supported; this allows
    a server to progressively take the traffic. This is an important feature
    for fragile application servers which need to compile classes at runtime
    as well as cold caches which need to fill up before being run at full
    throttle;

  - hashing can apply to various elements such as client's source address, URL
    components, query string element, header field values, POST parameter, RDP
    cookie;

  - consistent hashing protects server farms against massive redistribution when
    adding or removing servers in a farm. That's very important in large cache
    farms and it allows slow-start to be used to refill cold caches;

  - a number of internal metrics such as the number of connections per server,
    per backend, the amount of available connection slots in a backend etc make
    it possible to build very advanced load balancing strategies.
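
For example, weights, slow-start and consistent hashing could be set up along
the lines of the sketch below. Backend names, addresses and values are
placeholders.

```
backend be_web
    mode http
    balance roundrobin
    # web2 receives roughly one tenth of web1's share of the traffic;
    # web1 progressively takes its load over 30s after coming back up
    server web1 10.0.0.11:80 check weight 100 slowstart 30s
    server web2 10.0.0.12:80 check weight 10

backend be_cache
    mode http
    # hash the URI, and limit redistribution when servers come and go
    balance uri
    hash-type consistent
    server cache1 10.0.0.21:3128 check
    server cache2 10.0.0.22:3128 check
```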

3.3.6. Basic features : Stickiness

Application load balancing would be useless without stickiness. HAProxy provides
a fairly comprehensive set of possibilities to maintain a visitor on the same
server even across various events such as server addition/removal and down/up
cycles. Some methods are designed to be resistant to the distance between
multiple load balancing nodes in that they don't require any replication :

  - stickiness information can be individually matched and learned from
    different places if desired. For example a JSESSIONID cookie may be matched
    both in a cookie and in the URL. Up to 8 parallel sources can be learned at
    the same time and each of them may point to a different stick-table;

  - stickiness information can come from anything that can be seen within a
    request or response, including source address, TCP payload offset and
    length, HTTP query string elements, header field values, cookies, and so
    on;

  - stick-tables are replicated between all nodes in a multi-master fashion;

  - commonly used elements such as SSL-ID or RDP cookies (for TSE farms) are
    directly accessible to ease manipulation;

  - all sticking rules may be dynamically conditioned by ACLs;

  - it is possible to decide not to stick to certain servers, such as backup
    servers, so that when the nominal server comes back, it automatically takes
    the load back. This is often used in multi-path environments;

  - in HTTP it is often preferred not to learn anything and instead manipulate
    a cookie dedicated to stickiness. For this, it's possible to detect,
    rewrite, insert or prefix such a cookie to let the client remember what
    server was assigned;

  - the server may decide to change or clean the stickiness cookie on logout,
    so that leaving visitors are automatically unbound from the server;

  - using ACL-based rules it is also possible to selectively ignore or enforce
    stickiness regardless of the server's state; combined with advanced health
    checks, that helps admins verify that the server they're installing is up
    and running before presenting it to the whole world;

  - an innovative mechanism to set a maximum idle time and duration on cookies
    ensures that stickiness can be smoothly stopped on devices which are never
    closed (smartphones, TVs, home appliances) without having to store them on
    persistent storage;

  - multiple server entries may share the same stickiness keys so that
    stickiness is not lost in multi-path environments when one path goes down;

  - soft-stop ensures that only users with stickiness information will continue
    to reach the server they've been assigned to but no new users will go there.
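
The two main approaches, a dedicated cookie in HTTP and a learned key stored in
a stick-table, could look like the following sketch. The cookie name, the
durations, the table size and the addresses are arbitrary example values.

```
backend be_app
    mode http
    balance roundrobin
    # dedicated stickiness cookie with idle time and total lifetime limits
    cookie SRV insert indirect nocache maxidle 30m maxlife 8h
    server app1 10.0.0.11:80 check cookie a1
    server app2 10.0.0.12:80 check cookie a2

backend be_tcp
    mode tcp
    # learn the client's source address and stick it to the chosen server
    stick-table type ip size 200k expire 30m
    stick on src
    server srv1  10.0.0.21:3389 check
    server srv2  10.0.0.22:3389 check
    # never record stickiness to the backup server
    server spare 10.0.0.23:3389 check backup non-stick
```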

3.3.7. Basic features : Sampling and converting information

HAProxy supports information sampling using a wide set of "sample fetch
functions". The principle is to extract pieces of information known as samples,
for immediate use. This is used for stickiness, to build conditions, to produce
information in logs or to enrich HTTP headers.

Samples can be fetched from various sources :

  - constants : integers, strings, IP addresses, binary blocks;

  - the process : date, environment variables, server/frontend/backend/process
    state, byte/connection counts/rates, queue length, random generator, ...

  - variables : per-session, per-request, per-response variables;

  - the client connection : source and destination addresses and ports, and all
    related statistics counters;

  - the SSL client session : protocol, version, algorithm, cipher, key size,
    session ID, all client and server certificate fields, certificate serial,
    SNI, ALPN, NPN, client support for certain extensions;

  - request and response buffers contents : arbitrary payload at offset/length,
    data length, RDP cookie, decoding of SSL hello type, decoding of TLS SNI;

  - HTTP (request and response) : method, URI, path, query string arguments,
    status code, headers values, positional header value, cookies, captures,
    authentication, body elements.

A sample may then pass through a number of operators known as "converters" to
undergo some transformation. A converter consumes a sample and produces a
new one, possibly of a completely different type. For example, a converter may
be used to return only the integer length of the input string, or could turn a
string to upper case. Any arbitrary number of converters may be applied in
series to a sample before final use. Among all available sample converters, the
following ones are the most commonly used :

  - arithmetic and logic operators : they make it possible to perform advanced
    computation on input data, such as computing ratios, percentages or simply
    converting from one unit to another one;

  - IP address masks are useful when some addresses need to be grouped by larger
    networks;

  - data representation : URL-decode, base64, hex, JSON strings, hashing;

  - string conversion : extract substrings at fixed positions, fixed length,
    extract specific fields around certain delimiters, extract certain words,
    change case, apply regex-based substitution;

  - date conversion : convert to HTTP date format, convert local to UTC and
    conversely, add or remove offset;

  - lookup an entry in a stick table to find statistics or assigned server;

  - map-based key-to-value conversion from a file (mostly used for geolocation).
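
As a sketch of how fetches and converters chain together, the fragment below
applies one or more converters to three different samples. The header names,
the "q" query string argument and the variable name are hypothetical.

```
frontend fe_main
    mode http
    bind :80
    # fetch the Host header, then convert it to lower case
    http-request set-header X-Host-Lower %[req.hdr(host),lower]
    # fetch the source address, then mask it to its /24 network
    http-request set-header X-Client-Net %[src,ipmask(24)]
    # two converters in series : URL-decode the "q" argument, then keep
    # only its length as an integer
    http-request set-var(txn.qlen) urlp(q),url_dec,length
```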

3.3.8. Basic features : Maps

Maps are a powerful type of converter which consists in loading a two-column
file into memory at boot time, then looking up each input sample in the first
column and either returning the corresponding pattern from the second column if
the entry was found, or returning a default value. The output information also
being a sample, it can in turn undergo other transformations including other
map lookups. Maps are most commonly used to translate the client's IP address
to an AS number or country code since they support a longest match for network
addresses, but they can be used for various other purposes.

Part of their strength comes from being updatable on the fly either from the CLI
or from certain actions using other samples, making them capable of storing and
retrieving information between subsequent accesses. Another strength comes from
the binary tree based indexing which makes them extremely fast even when they
contain hundreds of thousands of entries, making geolocation very cheap and easy
to set up.
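
A geolocation lookup could be sketched as follows, assuming a hypothetical map
file "/etc/haproxy/geo.map" whose first column holds networks and whose second
column holds country codes. The "ZZ" default value and the header name are
also assumptions.

```
# Contents of /etc/haproxy/geo.map, one "key value" pair per line :
#
#     192.0.2.0/24     FR
#     198.51.100.0/24  DE

frontend fe_main
    mode http
    bind :80
    # longest match of the source address against the first column;
    # "ZZ" is returned when no entry matches
    http-request set-header X-Country %[src,map_ip(/etc/haproxy/geo.map,ZZ)]
```

An entry of such a map may then be changed at run time from the CLI, for
example with "set map /etc/haproxy/geo.map 192.0.2.0/24 BE".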

3.3.9. Basic features : ACLs and conditions

Most operations in HAProxy can be made conditional. Conditions are built by
combining multiple ACLs using logic operators (AND, OR, NOT). Each ACL is a
series of tests based on the following elements :

  - a sample fetch method to retrieve the element to test;

  - an optional series of converters to transform the element;

  - a list of patterns to match against;

  - a matching method to indicate how to compare the patterns with the sample.

For example, the sample may be taken from the HTTP "Host" header, it could then
be converted to lower case, then matched against a number of regex patterns
using the regex matching method.

Technically, ACLs are built on the same core as the maps, they share the exact
same internal structure, pattern matching methods and performance. The only real
difference is that instead of returning a sample, they only return "found" or
"not found". In terms of usage, ACL patterns may be declared inline in the
configuration file and do not require their own file. ACLs may be named for ease
of use or to make configurations understandable. A named ACL may be declared
multiple times and it will evaluate all definitions in turn until one matches.

About 13 different pattern matching methods are provided, among which IP address
mask, integer ranges, substrings, regex. They work like functions, and just like
with any programming language, only what is needed is evaluated, so when a
condition involving an OR is already true, next ones are not evaluated, and
similarly when a condition involving an AND is already false, the rest of the
condition is not evaluated.

There is no practical limit to the number of declared ACLs, and a handful of
commonly used ones are provided. However experience has shown that setups using
a lot of named ACLs are quite hard to troubleshoot and that sometimes using
anonymous ACLs inline is easier as it requires fewer references out of the scope
being analyzed.
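
The elements described above can be seen in the following sketch, which mixes
named and anonymous ACLs. The ACL names, networks, host pattern and backend
names are made up for the example, and the backends are not shown.

```
frontend fe_main
    mode http
    bind :80
    # named ACL declared twice : it matches if either definition matches
    acl internal src 10.0.0.0/8
    acl internal src 192.168.0.0/16
    # sample fetch (Host header), converter (lower), matching method
    # (regex) and pattern
    acl static_host req.hdr(host),lower -m reg ^static[0-9]*\.example\.com$
    # terms are ANDed by default, "!" negates, and an anonymous ACL is
    # written inline between braces
    http-request deny if { path -m beg /admin } !internal
    use_backend be_static if static_host
    default_backend be_app
```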

3.3.10. Basic features : Content switching

HAProxy implements a mechanism known as content-based switching. The principle
is that a connection or request arrives on a frontend, then the information
carried with this request or connection is processed, and at this point it is
possible to write ACL-based conditions making use of this information to
decide what backend will process the request. Thus the traffic is directed to
one backend or another based on the request's contents. The most common example
consists in using the Host header and/or elements from the path (sub-directories
or file-name extensions) to decide whether an HTTP request targets a static
object or the application, and to route static objects traffic to a backend made
of fast and light servers, and all the remaining traffic to a more complex
application server, thus constituting a fine-grained virtual hosting solution.
This is quite convenient to make multiple technologies coexist as a more global
solution.

Another use case of content-switching consists in using different load balancing
algorithms depending on various criteria. A cache may use a URI hash while an
application would use round-robin.

Last but not least, it allows multiple customers to use a small share of a
common resource by enforcing per-backend (thus per-customer) connection limits.

Content switching rules scale very well, though their performance may depend on
the number and complexity of the ACLs in use. But it is also possible to write
dynamic content switching rules where a sample value directly turns into a
backend name without making use of ACLs at all. Such configurations have
been reported to work fine at least with 300000 backends in production.
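
Both styles of rules, ACL-based and dynamic, could be sketched as below. The
backend names and the map file, whose second column is assumed to hold backend
names, are hypothetical, and the backends themselves are not shown.

```
frontend fe_acl
    mode http
    bind :80
    # static objects, recognized by path prefix or file name extension
    use_backend be_static if { path -m beg /static/ } or { path -m end .css .js .png }
    default_backend be_app

frontend fe_dynamic
    mode http
    bind :8080
    # the lower-cased Host header is looked up in a map and the result is
    # used directly as the backend name; be_default applies when no entry
    # matches
    use_backend %[req.hdr(host),lower,map(/etc/haproxy/hosts.map,be_default)]
```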

3.3.11. Basic features : Stick-tables

Stick-tables are commonly used to store stickiness information, that is, to keep
a reference to the server a certain visitor was directed to. The key is then the
identifier associated with the visitor (its source address, the SSL ID of the
connection, an HTTP or RDP cookie, the customer number extracted from the URL or
from the payload, ...) and the stored value is then the server's identifier.

Stick tables may use 3 different types of samples for their keys : integers,
strings and addresses. Only one stick-table may be declared in a proxy, and it
is designated everywhere with the proxy name. Up to 8 keys may be tracked in
parallel. The server identifier is committed during request or response
processing once both the key and the server are known.

Stick-table contents may be replicated in active-active mode with other HAProxy
nodes known as "peers" as well as with the new process during a reload operation
so that all load balancing nodes share the same information and take the same
routing decision if a client's requests are spread over multiple nodes.

Since stick-tables are indexed on what identifies a client, they are
often also used to store extra information such as per-client statistics. The
extra statistics take some extra space and need to be explicitly declared. The
type of statistics that may be stored includes the input and output bandwidth,
the number of concurrent connections, the connection rate and count over a
period, the amount and frequency of errors, some specific tags and counters,
etc. In order to support keeping such information without being forced to
stick to a given server, a special "tracking" feature is implemented and makes
it possible to track up to 3 simultaneous keys from different tables at the
same time regardless of stickiness rules. Each stored statistic may be searched,
dumped and cleared from the CLI and adds to the live troubleshooting
capabilities.

While this mechanism can be used to give preferential treatment to a returning
visitor or to adjust the delivered quality of service depending on good or bad
behavior, it is mostly used to fight against service abuse and more generally
DDoS as it makes it possible to build complex models to detect certain bad
behaviors at a high processing speed.
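
As an example of tracking without stickiness, the following sketch counts
requests per source address and rejects abusive clients. The table size, the
periods and the threshold of 100 requests are arbitrary values, and the
backend is not shown.

```
frontend fe_main
    mode http
    bind :80
    # per-source-address statistics, explicitly declared with "store"
    stick-table type ip size 100k expire 10m store conn_cur,http_req_rate(10s),http_err_rate(10s)
    # track the source address in this table, with no stickiness rule
    http-request track-sc0 src
    # reject clients exceeding 100 requests over the last 10 seconds
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
    default_backend be_app
```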

3.3.12. Basic features : Formatted strings

There are many places where HAProxy needs to manipulate character strings, such
as logs, redirects, header additions, and so on. In order to provide the
greatest flexibility, the notion of formatted strings was introduced, initially
for logging purposes, which explains why it's still called "log-format". These
strings contain escape sequences that make it possible to insert various dynamic
data including variables and sample fetch expressions into strings, and even to
adjust the encoding while the result is being turned into a string (for example,
adding quotes). This provides a powerful way to build header contents or to
customize log lines. Additionally, to keep the most common strings simple to
build, about 50 special tags are provided as shortcuts for information
commonly used in logs.
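
A short sketch of both uses follows : a customized log line built from special
tags, and a header value built from sample fetch expressions. The chosen
layout and the header name are only examples, and the quoted form assumes a
version whose configuration parser supports double quotes.

```
frontend fe_main
    mode http
    bind :80
    # special tags : client address and port, date, frontend, backend and
    # server names, status code, bytes sent; "+Q" quotes the request line
    log-format "%ci:%cp [%t] %ft %b/%s %ST %B %{+Q}r"
    # sample fetch expressions embedded in a header value
    http-request set-header X-Client "%[src]:%[src_port]"
```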

3.3.13. Basic features : HTTP rewriting and redirection

Installing a load balancer in front of an application that was never designed
for this can be a challenging task without the proper tools. One of the most
commonly requested operations in this case is to adjust request and response
headers to make the load balancer appear as the origin server and to fix hard
coded information. This comes with changing the path in requests (which is
strongly advised against), modifying the Host header field, modifying the
Location response header field for redirects, modifying the path and domain
attribute for cookies, and so on. It also happens that a number of servers are
somewhat verbose and tend to leak too much information in the response, making
them more vulnerable to targeted attacks. While it's theoretically not the role
of a load balancer to clean this up, in practice it's located at the best place
in the infrastructure to guarantee that everything is cleaned up.

Similarly, sometimes the load balancer will have to intercept some requests and
respond with a redirect to a new target URL. While some people tend to confuse
redirects and rewriting, these are two completely different concepts, since the
rewriting makes the client and the server see different things (and disagree on
the location of the page being visited) while redirects ask the client to visit
the new URL so that it sees the same location as the server.

In order to do this, HAProxy supports various possibilities for rewriting and
redirects, among which :

  - regex-based URL and header rewriting in requests and responses. Regex are
    the most commonly used tool to modify header values since they're easy to
    manipulate and well understood;

  - headers may also be appended, deleted or replaced based on formatted strings
    so that it is possible to pass information there (e.g. client side TLS
    algorithm and cipher);

  - HTTP redirects can use any 3xx code to a relative, absolute, or completely
    dynamic (formatted string) URI;

  - HTTP redirects also support some extra options such as setting or clearing
    a specific cookie, dropping the query string, appending a slash if missing,
    and so on;

  - all operations support ACL-based conditions.
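
The difference between redirects and rewriting can be illustrated with the
sketch below. Host names, paths, the certificate file and the header name are
placeholders, and the backend is not shown.

```
frontend fe_main
    mode http
    bind :80
    bind :443 ssl crt /etc/haproxy/site.pem
    # redirect : the client is asked to come back over HTTPS
    http-request redirect scheme https code 301 unless { ssl_fc }
    # redirect with extra options : drop the query string, append a slash
    http-request redirect prefix / code 301 drop-query append-slash if { path -m reg ^/article/[^/]*$ }
    # pass information to the server in a header built from a sample
    http-request set-header X-SSL-Cipher %[ssl_fc_cipher]
    # rewrite an internal host name found in Location response headers
    http-response replace-header Location ^http://app\.internal/(.*)$ https://www.example.com/\1
    # remove a verbose response header
    http-response del-header Server
    default_backend be_app
```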

3.3.14. Basic features : Server protection

HAProxy does a lot to maximize service availability, and for this it goes to
great lengths to protect servers against overloading and attacks. The first
and most important point is that only complete and valid requests are forwarded
to the servers. The initial reason is that HAProxy needs to find the protocol
elements it needs to stay synchronized with the byte stream, and the second
reason is that until the request is complete, there is no way to know if some
elements will change its semantics. The direct benefit from this is that servers
are not exposed to invalid or incomplete requests. This is a very effective
protection against slowloris attacks, which have almost no impact on HAProxy.

Another important point is that HAProxy contains buffers to store requests and
responses, and that by only sending a request to a server when it's complete and
by reading the whole response very quickly from the local network, the server
side connection is used for a very short time and this preserves server
resources as much as possible.

A direct extension to this is that HAProxy can artificially limit the number of
concurrent connections or outstanding requests to a server, which guarantees
that the server will never be overloaded even if it continuously runs at 100% of
its capacity during traffic spikes. All excess requests will simply be queued to
be processed when one slot is released. In the end, these large resource savings
most often result in server response times so much better that it ends up
actually being faster than overloading the server. Queued requests may be
redispatched to other servers, or even aborted in queue when the client aborts,
which also protects the servers against the "reload effect", where each click on
"reload" by a visitor on a slow-loading page usually induces a new request and
maintains the server in an overloaded state.

The slow-start mechanism also protects restarting servers against high traffic
levels while they're still finalizing their startup or compiling some classes.

Regarding the protocol-level protection, it is possible to relax the HTTP parser
to accept non standard-compliant but harmless requests or responses and even to
fix them. This allows bogus applications to be accessible while a fix is being
developed. In parallel, offending messages are completely captured with a
detailed report that helps developers spot the issue in the application. The
most dangerous protocol violations are properly detected and dealt with and
fixed. For example malformed requests or responses with two Content-length
headers are either fixed if the values are exactly the same, or rejected if they
differ, since it becomes a security problem. Protocol inspection is not limited
to HTTP, it is also available for other protocols like TLS or RDP.

When a protocol violation or attack is detected, there are various options to
respond to the user, such as returning the common "HTTP 400 bad request",
closing the connection with a TCP reset, or faking an error after a long delay
("tarpit") to confuse the attacker. All of these contribute to protecting the
servers by discouraging the offending client from pursuing an attack that
becomes very expensive to maintain.

HAProxy also proposes some more advanced options to protect against accidental
data leaks and session crossing. Not only can it log suspicious server responses
but it will also log and optionally block a response which might affect a given
visitor's confidentiality. One such example is a cacheable cookie appearing in a
cacheable response, which may cause an intermediary cache to deliver it
to another visitor, causing an accidental session sharing.
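
Several of these protections map to a handful of settings, as in the sketch
below. The limits and durations are arbitrary, and the file listing abusive
source addresses is hypothetical and must exist for the configuration to load.

```
defaults
    mode http
    timeout connect      5s
    timeout client       30s
    timeout server       30s
    # maximum time allowed to receive a complete request
    timeout http-request 5s
    # maximum time a request may wait in queue for a free slot
    timeout queue        10s
    # how long a tarpitted request is held before the error is returned
    timeout tarpit       10s

frontend fe_main
    bind :80
    # hold requests from listed addresses, then return an error
    http-request tarpit if { src -f /etc/haproxy/abusers.lst }
    default_backend be_app

backend be_app
    # abort queued requests whose client has already gone away
    option abortonclose
    # allow a request to go to another server after a connection failure
    option redispatch
    # at most 50 concurrent connections per server, the excess is queued
    server app1 10.0.0.11:80 check maxconn 50 slowstart 20s
    server app2 10.0.0.12:80 check maxconn 50 slowstart 20s
```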

3.3.15. Basic features : Logging

Logging is an extremely important feature for a load balancer, first because a
load balancer is often wrongly accused of causing the problems it reveals, and
second because it is placed at a critical point in an infrastructure where all
normal and abnormal activity needs to be analyzed and correlated with other
components.

HAProxy provides very detailed logs, with millisecond accuracy and the exact
connection accept time that can be searched in firewalls logs (e.g. for NAT
correlation). By default, TCP and HTTP logs are quite detailed and contain
everything needed for troubleshooting, such as source IP address and port,
frontend, backend, server, timers (request receipt duration, queue duration,
connection setup time, response headers time, data transfer time), global
process state, connection counts, queue status, retries count, detailed
stickiness actions and disconnect reasons, header captures with a safe output
encoding. It is then possible to extend or replace this format to include any
sampled data, variables, captures, resulting in very detailed information. For
example it is possible to log the number of cumulative requests or number of
different URLs visited by a client.

The log level may be adjusted per request using standard ACLs, so it is possible
to automatically silence some logs considered as pollution and instead raise
warnings when some abnormal behavior happens for a small part of the traffic
(e.g. too many URLs or HTTP errors for a source address). Administrative logs
are also emitted with their own levels to inform about the loss or recovery of a
server for example.

Each frontend and backend may use multiple independent log outputs, which eases
multi-tenancy. Logs are preferably sent over UDP, possibly JSON-encoded, and are
truncated after a configurable line length in order to guarantee delivery.
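
A logging setup along these lines might look like the following sketch. The
syslog server addresses, the facilities, the line length and the "/health"
path are assumptions made for the example.

```
global
    # send logs over UDP, truncating lines longer than 2048 bytes
    log 192.0.2.50:514 len 2048 local0

defaults
    mode http
    option httplog
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend fe_main
    bind :80
    # two independent log outputs for this frontend
    log global
    log 192.0.2.51:514 local1
    # do not log requests considered as pollution
    http-request set-log-level silent if { path /health }
    default_backend be_app

backend be_app
    server app1 10.0.0.11:80 check
```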

3.3.16. Basic features : Statistics

HAProxy provides a web-based statistics reporting interface with authentication,
security levels and scopes. It is thus possible to provide each hosted customer
with their own page showing only their own instances. This page can be located
in a hidden URL part of the regular web site so that no new port needs to be
opened. This page may also report the availability of other HAProxy nodes so
that it is easy to spot at a glance whether everything works as expected. The
view is a summary with a lot of details accessible (such as error causes, last
access and last change duration, etc), which are also accessible as a CSV table
that other tools may import to draw graphs. The page may self-refresh to be used
as a monitoring page on a large display. In administration mode, the page also
makes it possible to change server state to ease maintenance operations.
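
For instance, a per-customer statistics page could be declared as in the sketch
below. The URI, credentials, refresh interval and admin network are
placeholders which must be adapted, and the password in particular is only an
example.

```
backend be_customer1
    mode http
    server app1 10.0.0.11:80 check
    stats enable
    # served from a URL of the regular site, no extra port needed
    stats uri     /admin?stats
    stats realm   HAProxy\ Statistics
    stats auth    customer1:changeme
    # only report the current proxy on this page
    stats scope   .
    # self-refreshing page, suitable for a monitoring display
    stats refresh 10s
    # administration mode, restricted here to a hypothetical admin network
    stats admin if { src 10.0.0.0/8 }
```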