1.2. HTTP request

First, let's consider this HTTP request :

  Line     Contents
  number
     1     GET /serv/login.php?lang=en&profile=2 HTTP/1.1
     2     Host: www.mydomain.com
     3     User-agent: my small browser
     4     Accept: image/jpeg, image/gif
     5     Accept: image/png

1.2.1. The Request line

Line 1 is the "request line". It is always composed of 3 fields :

  - a METHOD      : GET
  - a URI         : /serv/login.php?lang=en&profile=2
  - a version tag : HTTP/1.1

All of them are delimited by what the standard calls LWS (linear white spaces),
which are commonly spaces, but can also be tabs or line feeds/carriage returns
followed by spaces/tabs. The method itself cannot contain any colon (':') and
is limited to alphabetic letters. All those various combinations make it
desirable that HAProxy performs the splitting itself rather than leaving it to
the user to write a complex or inaccurate regular expression.

The URI itself can have several forms :

  - A "relative URI" :

      /serv/login.php?lang=en&profile=2

    It is a complete URL without the host part. This is generally what is
    received by servers, reverse proxies and transparent proxies.

  - An "absolute URI", also called a "URL" :

      http://192.168.0.12:8080/serv/login.php?lang=en&profile=2

    It is composed of a "scheme" (the protocol name followed by '://'), a host
    name or address, optionally a colon (':') followed by a port number, then
    a relative URI beginning at the first slash ('/') after the address part.
    This is generally what proxies receive, but a server supporting HTTP/1.1
    must accept this form too.

  - a star ('*') : this form is only accepted in association with the OPTIONS
    method and is not relayable. It is used to inquire about a next hop's
    capabilities.

  - an address:port combination : 192.168.0.12:80
    This is used with the CONNECT method, which is used to establish TCP
    tunnels through HTTP proxies, generally for HTTPS, but sometimes for
    other protocols too.

In a relative URI, two sub-parts are identified. The part before the question
mark is called the "path". It is typically the relative path to static objects
on the server. The part after the question mark is called the "query string".
It is mostly used with GET requests sent to dynamic scripts and is very
specific to the language, framework or application in use.

HTTP/2 doesn't convey version information with the request, so the version is
assumed to be the same as the one of the underlying protocol (i.e. "HTTP/2").
However, HAProxy natively processes HTTP/1.x requests and headers, so requests
received over an HTTP/2 connection are transcoded to HTTP/1.1 before being
processed. This explains why they still appear as "HTTP/1.1" in HAProxy's logs
as well as in server logs.
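
To make the splitting described in this section more concrete, here is a
minimal Python sketch. It is only an illustration and not HAProxy's own parser.
The function names (split_request_line, classify_uri, split_path_query) and the
returned labels are hypothetical, and the sketch assumes that the three fields
are separated by spaces or tabs only.

```python
def split_request_line(line):
    """Split a request line into (method, uri, version)."""
    # split() without argument splits on any run of whitespace, which
    # covers the usual spaces and tabs between the three fields.
    fields = line.split()
    if len(fields) != 3:
        raise ValueError("expected 3 fields, got %d" % len(fields))
    method, uri, version = fields
    # The method is limited to alphabetic letters, so it has no colon.
    if not (method.isascii() and method.isalpha()):
        raise ValueError("invalid method: %r" % method)
    return method, uri, version


def classify_uri(method, uri):
    """Return a label for the URI form (labels are illustrative)."""
    if uri == "*":
        if method != "OPTIONS":
            raise ValueError("'*' is only accepted with OPTIONS")
        return "star"
    if method == "CONNECT":
        return "address:port"
    # Test the leading slash first so that a query string containing
    # '://' does not turn a relative URI into an absolute one.
    if uri.startswith("/"):
        return "relative"
    if "://" in uri:
        return "absolute"
    raise ValueError("unrecognized URI form: %r" % uri)


def split_path_query(uri):
    """Return (path, query string) for a relative or absolute URI."""
    if not uri.startswith("/"):
        # Absolute URI: the relative part begins at the first slash
        # found after the scheme and the address part.
        rest = uri.split("://", 1)[1]
        slash = rest.find("/")
        uri = rest[slash:] if slash >= 0 else "/"
    path, _, query = uri.partition("?")
    return path, query


method, uri, version = split_request_line(
    "GET /serv/login.php?lang=en&profile=2 HTTP/1.1")
print(method, version, classify_uri(method, uri), split_path_query(uri))
```

With the request line from section 1.2, this yields the "relative" form, with
"/serv/login.php" as the path and "lang=en&profile=2" as the query string.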

1.2.2. The request headers

The headers start at the second line. They are composed of a name at the
beginning of the line, immediately followed by a colon (':'). Traditionally,
an LWS is added after the colon but that's not required. Then come the values.
Multiple identical headers may be folded into one single line, delimiting the
values with commas, provided that their order is respected. This is commonly
encountered in the "Cookie:" field. A header may span over multiple lines if
the subsequent lines begin with an LWS. In the example in 1.2, lines 4 and 5
define a total of 3 values for the "Accept:" header.

Contrary to a common misconception, header names are not case-sensitive, and
their values are not either if they refer to other header names (such as the
"Connection:" header). In HTTP/2, header names are always sent in lower case,
as can be seen when running in debug mode.

The end of the headers is indicated by the first empty line. People often say
that it's a double line feed, which is not exact, even if a double line feed
is one valid form of empty line.

Fortunately, HAProxy takes care of all these complex combinations when indexing
headers, checking values and counting them, so there is no reason to worry
about the way they could be written, but it is important not to accuse an
application of being buggy if it does unusual, valid things.

Important note:
   As suggested by RFC 7230, HAProxy normalizes headers by replacing line breaks
   in the middle of headers by LWS in order to join multi-line headers. This
   is necessary for proper analysis and helps less capable HTTP parsers to work
   correctly and not to be fooled by such complex constructs.
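
The rules above (multi-line headers, repeated headers, case-insensitive names,
end of headers at the first empty line) can be illustrated with a short Python
sketch. It is not HAProxy's implementation. The function name parse_headers is
hypothetical, and the sketch assumes that every comma separates two values,
which is not true for all headers (dates, for instance, contain commas).

```python
def parse_headers(raw):
    """Return a dict mapping lower-cased header names to value lists."""
    unfolded = []
    for line in raw.replace("\r\n", "\n").split("\n"):
        if line == "":
            break  # the first empty line marks the end of the headers
        if line[0] in " \t" and unfolded:
            # Continuation line: join it to the previous header with a
            # single space, as the normalization described above does.
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)

    headers = {}
    for line in unfolded:
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError("missing colon in header line: %r" % line)
        # Header names are not case-sensitive: index them in lower case.
        name = name.lower()
        # Assumption: commas always delimit values (see the note above).
        values = [v.strip() for v in value.split(",") if v.strip()]
        # Repeated headers keep their values in order of appearance.
        headers.setdefault(name, []).extend(values)
    return headers


raw = ("Host: www.mydomain.com\r\n"
       "User-agent: my small browser\r\n"
       "Accept: image/jpeg, image/gif\r\n"
       "Accept: image/png\r\n"
       "\r\n")
print(parse_headers(raw)["accept"])
# prints ['image/jpeg', 'image/gif', 'image/png']
```

The output shows the 3 values that lines 4 and 5 of the example in 1.2 define
for the "Accept:" header.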