tornado.httpclient — Asynchronous HTTP client

Blocking and non-blocking HTTP client interfaces.

This module defines a common interface shared by two implementations, simple_httpclient and curl_httpclient. Applications may either instantiate their chosen implementation class directly or use the AsyncHTTPClient class from this module, which selects an implementation that can be overridden with the AsyncHTTPClient.configure method.

The default implementation is simple_httpclient, and this is expected to be suitable for most users’ needs. However, some applications may wish to switch to curl_httpclient for reasons such as the following:

- curl_httpclient has some features not found in simple_httpclient, including support for HTTP proxies and the ability to use a specified network interface.
- curl_httpclient is more likely to be compatible with sites that are not-quite-compliant with the HTTP spec, or sites that use little-exercised features of HTTP.
- curl_httpclient is faster.
- curl_httpclient was the default prior to Tornado 2.0.

Note that if you are using curl_httpclient, it is highly recommended that you use a recent version of libcurl and pycurl. Currently the minimum supported version of libcurl is 7.21.1, and the minimum version of pycurl is 7.18.2. It is highly recommended that your libcurl installation is built with asynchronous DNS resolver (threaded or c-ares), otherwise you may encounter various problems with request timeouts (for more information, see http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTCONNECTTIMEOUTMS and comments in curl_httpclient.py).

To select curl_httpclient, call AsyncHTTPClient.configure at startup:

AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient")
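Because curl_httpclient depends on pycurl, an application might prefer to select it only when pycurl can be imported. The following is a minimal sketch of that idea; the helper names pick_http_client_impl and configure_http_client are hypothetical, and the availability check is an assumption about how an application could decide, not something Tornado does for you.

```python
import importlib.util

CURL_IMPL = "tornado.curl_httpclient.CurlAsyncHTTPClient"


def pick_http_client_impl():
    """Return the class name to pass to configure(), or None for the default."""
    # find_spec returns None when the top-level module cannot be found.
    if importlib.util.find_spec("pycurl") is not None:
        return CURL_IMPL
    return None


def configure_http_client():
    # Imported lazily so pick_http_client_impl() works without Tornado installed.
    from tornado.httpclient import AsyncHTTPClient

    # Passing None keeps the default implementation (simple_httpclient).
    AsyncHTTPClient.configure(pick_http_client_impl())
```

Calling configure_http_client() once at startup, before any AsyncHTTPClient is created, keeps the choice in one place.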

HTTP client interfaces

class tornado.httpclient.HTTPClient(async_client_class=None, **kwargs)

A blocking HTTP client.

This interface is provided for convenience and testing; most applications that are running an IOLoop will want to use AsyncHTTPClient instead. Typical usage looks like this:

from tornado import httpclient

http_client = httpclient.HTTPClient()
try:
    response = http_client.fetch("http://www.google.com/")
    print(response.body)
except httpclient.HTTPError as e:
    # HTTPError is raised for non-200 responses; the response
    # can be found in e.response.
    print("Error: " + str(e))
except Exception as e:
    # Other errors are possible, such as IOError.
    print("Error: " + str(e))
http_client.close()

close()

Closes the HTTPClient, freeing any resources used.

fetch(request, **kwargs): Executes a request, returning an HTTPResponse.

The request may be either a string URL or an HTTPRequest object. If it is a string, we construct an HTTPRequest using any additional kwargs: HTTPRequest(request, **kwargs)

If an error occurs during the fetch, we raise an HTTPError unless the raise_error keyword argument is set to False.
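Since close() is what frees the blocking client's resources, it can help to guarantee the call even when fetch raises. One possible way is contextlib.closing from the standard library, sketched below. The function fetch_body and its client_factory parameter are hypothetical names introduced for illustration; the only Tornado calls used are HTTPClient(), fetch() and close().

```python
from contextlib import closing


def fetch_body(url, client_factory=None):
    """Fetch url with a blocking client and always close the client."""
    if client_factory is None:
        # Imported lazily so the helper can be exercised with a stand-in client.
        from tornado.httpclient import HTTPClient

        client_factory = HTTPClient
    # closing() calls client.close() on exit, including when fetch() raises.
    with closing(client_factory()) as client:
        return client.fetch(url).body
```

With the default factory, fetch_body("http://www.google.com/") would behave like the example above, except that an HTTPError propagates to the caller after the client has been closed.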

class tornado.httpclient.AsyncHTTPClient

A non-blocking HTTP client.

Example usage:

from tornado.httpclient import AsyncHTTPClient

def handle_response(response):
    if response.error:
        print("Error:", response.error)
    else:
        print(response.body)

http_client = AsyncHTTPClient()
http_client.fetch("http://www.google.com/", handle_response)

The constructor for this class is magic in several respects: It actually creates an instance of an implementation-specific subclass, and instances are reused as a kind of pseudo-singleton (one per IOLoop). The keyword argument force_instance=True can be used to suppress this singleton behavior. Unless force_instance=True is used, no arguments other than io_loop should be passed to the AsyncHTTPClient constructor. The implementation subclass as well as arguments to its constructor can be set with the static method configure().

All AsyncHTTPClient implementations support a defaults keyword argument, which can be used to set default values for HTTPRequest attributes. For example:

AsyncHTTPClient.configure(
    None, defaults=dict(user_agent="MyUserAgent"))
# or with force_instance:
client = AsyncHTTPClient(force_instance=True,
                         defaults=dict(user_agent="MyUserAgent"))

Changed in version 4.1: The io_loop argument is deprecated.

close(): Destroys this HTTP client, freeing any file descriptors used.

This method is not needed in normal use due to the way that AsyncHTTPClient objects are transparently reused. close() is generally only necessary when either the IOLoop is also being closed, or the force_instance=True argument was used when creating the AsyncHTTPClient.

No other methods may be called on the AsyncHTTPClient after close().

fetch(request, callback=None, raise_error=True, **kwargs): Executes a request, asynchronously returning an HTTPResponse.

The request may be either a string URL or an HTTPRequest object. If it is a string, we construct an HTTPRequest using any additional kwargs: HTTPRequest(request, **kwargs)

This method returns a Future whose result is an HTTPResponse. By default, the Future will raise an HTTPError if the request returned a non-200 response code (other errors may also be raised if the server could not be contacted). Instead, if raise_error is set to False, the response will always be returned regardless of the response code.

If a callback is given, it will be invoked with the HTTPResponse. In the callback interface, HTTPError is not automatically raised. Instead, you must check the response’s error attribute or call its rethrow method.
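The Future-based interface can also be consumed from a coroutine. The sketch below awaits fetch() with raise_error=False and then inspects the response instead of catching HTTPError. It assumes a Tornado and Python version in which native coroutines (async def) can await Tornado Futures; summarize and fetch_summary are hypothetical names, not part of the module.

```python
def summarize(response):
    """Return a one-line summary for an HTTPResponse-like object."""
    if response.error is not None:
        return "error %s: %s" % (response.code, response.error)
    # body may be empty, for example when a streaming_callback consumed it.
    return "ok %s: %d bytes" % (response.code, len(response.body or b""))


async def fetch_summary(url):
    # Imported lazily so summarize() can be exercised without Tornado installed.
    from tornado.httpclient import AsyncHTTPClient

    # raise_error=False: non-200 responses are returned rather than raised.
    response = await AsyncHTTPClient().fetch(url, raise_error=False)
    return summarize(response)
```

Under the same assumptions, the coroutine could be driven with IOLoop.current().run_sync(lambda: fetch_summary("http://www.google.com/")). Errors unrelated to the response code may still be raised.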

classmethod configure(impl, **kwargs)

Configures the AsyncHTTPClient subclass to use.

AsyncHTTPClient() actually creates an instance of a subclass. This method may be called with either a class object or the fully-qualified name of such a class (or None to use the default, SimpleAsyncHTTPClient).

If additional keyword arguments are given, they will be passed to the constructor of each subclass instance created. The keyword argument max_clients determines the maximum number of simultaneous fetch() operations that can execute in parallel on each IOLoop. Additional arguments may be supported depending on the implementation class in use.

Example:

AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient")

Request objects

class tornado.httpclient.HTTPRequest(url, method='GET', headers=None, body=None, auth_username=None, auth_password=None, auth_mode=None, connect_timeout=None, request_timeout=None, if_modified_since=None, follow_redirects=None, max_redirects=None, user_agent=None, use_gzip=None, network_interface=None, streaming_callback=None, header_callback=None, prepare_curl_callback=None, proxy_host=None, proxy_port=None, proxy_username=None, proxy_password=None, allow_nonstandard_methods=None, validate_cert=None, ca_certs=None, allow_ipv6=None, client_key=None, client_cert=None, body_producer=None, expect_100_continue=False, decompress_response=None, ssl_options=None): HTTP client request object.

All parameters except url are optional.

Parameters:

- url (string) – URL to fetch
- method (string) – HTTP method, e.g. “GET” or “POST”
- headers (HTTPHeaders or dict) – Additional HTTP headers to pass on the request
- body – HTTP request body as a string (byte or unicode; if unicode the utf-8 encoding will be used)
- body_producer – Callable used for lazy/asynchronous request bodies. It is called with one argument, a write function, and should return a Future. It should call the write function with new data as it becomes available. The write function returns a Future which can be used for flow control. Only one of body and body_producer may be specified. body_producer is not supported on curl_httpclient. When using body_producer it is recommended to pass a Content-Length in the headers as otherwise chunked encoding will be used, and many servers do not support chunked encoding on requests. New in Tornado 4.0.
- auth_username (string) – Username for HTTP authentication
- auth_password (string) – Password for HTTP authentication
- auth_mode (string) – Authentication mode; default is “basic”. Allowed values are implementation-defined; curl_httpclient supports “basic” and “digest”; simple_httpclient only supports “basic”
- connect_timeout (float) – Timeout for initial connection in seconds
- request_timeout (float) – Timeout for entire request in seconds
- if_modified_since (datetime or float) – Timestamp for If-Modified-Since header
- follow_redirects (bool) – Should redirects be followed automatically or return the 3xx response?
- max_redirects (int) – Limit for follow_redirects
- user_agent (string) – String to send as User-Agent header
- decompress_response (bool) – Request a compressed response from the server and decompress it after downloading. Default is True. New in Tornado 4.0.
- use_gzip (bool) – Deprecated alias for decompress_response since Tornado 4.0.
- network_interface (string) – Network interface to use for request. curl_httpclient only; see note below.
- streaming_callback (callable) – If set, streaming_callback will be run with each chunk of data as it is received, and HTTPResponse.body and HTTPResponse.buffer will be empty in the final response.
- header_callback (callable) – If set, header_callback will be run with each header line as it is received (including the first line, e.g. HTTP/1.0 200 OK\r\n, and a final line containing only \r\n. All lines include the trailing newline characters). HTTPResponse.headers will be empty in the final response. This is most useful in conjunction with streaming_callback, because it’s the only way to get access to header data while the request is in progress.
- prepare_curl_callback (callable) – If set, will be called with a pycurl.Curl object to allow the application to make additional setopt calls.
- proxy_host (string) – HTTP proxy hostname. To use proxies, proxy_host and proxy_port must be set; proxy_username and proxy_password are optional. Proxies are currently only supported with curl_httpclient.
- proxy_port (int) – HTTP proxy port
- proxy_username (string) – HTTP proxy username
- proxy_password (string) – HTTP proxy password
- allow_nonstandard_methods (bool) – Allow unknown values for method argument?
- validate_cert (bool) – For HTTPS requests, validate the server’s certificate?
- ca_certs (string) – filename of CA certificates in PEM format, or None to use defaults. See note below when used with curl_httpclient.
- client_key (string) – Filename for client SSL key, if any. See note below when used with curl_httpclient.
- client_cert (string) – Filename for client SSL certificate, if any. See note below when used with curl_httpclient.
- ssl_options (ssl.SSLContext) – ssl.SSLContext object for use in simple_httpclient (unsupported by curl_httpclient). Overrides validate_cert, ca_certs, client_key, and client_cert.
- allow_ipv6 (bool) – Use IPv6 when available? Default is true.
- expect_100_continue (bool) – If true, send the Expect: 100-continue header and wait for a continue response before sending the request body. Only supported with simple_httpclient.
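The body_producer contract described in the list above (called with a write function, returns something awaitable, waits on each write for flow control) can be sketched as follows. This is an illustration under assumptions: it uses a native coroutine, which presumes a Tornado and Python version that accept one wherever a Future is expected (on older versions a gen.coroutine-decorated function would play the same role), and make_body_producer and build_upload_request are hypothetical helper names.

```python
def make_body_producer(chunks):
    """Build a body_producer that writes the given byte chunks in order."""

    async def producer(write):
        for chunk in chunks:
            # Waiting on each write before sending the next chunk is the
            # flow-control hook mentioned in the parameter description.
            await write(chunk)

    return producer


def build_upload_request(url, chunks):
    # Imported lazily so make_body_producer() can be tried without Tornado.
    from tornado.httpclient import HTTPRequest

    chunks = list(chunks)
    # Supplying Content-Length avoids chunked encoding, as recommended above.
    headers = {"Content-Length": str(sum(len(c) for c in chunks))}
    return HTTPRequest(url, method="POST", headers=headers,
                       body_producer=make_body_producer(chunks))
```

The request would then be passed to fetch() on a simple_httpclient-backed client, since body_producer is not supported on curl_httpclient.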

Note

When using curl_httpclient certain options may be inherited by subsequent fetches because pycurl does not allow them to be cleanly reset. This applies to the ca_certs, client_key, client_cert, and network_interface arguments. If you use these options, you should pass them on every request (you don’t have to always use the same values, but it’s not possible to mix requests that specify these options with ones that use the defaults).

New in version 3.1: The auth_mode argument.

New in version 4.0: The body_producer and expect_100_continue arguments.

New in version 4.2: The ssl_options argument.
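When streaming_callback and header_callback are both set, the final response carries neither body nor headers, so the application has to keep whatever it needs. A minimal sketch of such a collector follows. StreamCollector and build_streaming_request are hypothetical names; accepting header lines as either bytes or str is a defensive assumption (the text above does not state the type), and repeated or folded headers are not handled.

```python
class StreamCollector:
    """Collects header lines and body chunks delivered by the two callbacks."""

    def __init__(self):
        self.status_line = None
        self.headers = {}
        self.chunks = []

    def on_header(self, line):
        # Assumption: tolerate both bytes and str header lines.
        if isinstance(line, bytes):
            line = line.decode("latin-1")
        line = line.rstrip("\r\n")
        if not line:
            return  # the final line containing only \r\n
        if line.startswith("HTTP/"):
            # A status line starts a new header block.
            self.status_line = line
            self.headers = {}
            return
        name, _, value = line.partition(":")
        self.headers[name.strip().lower()] = value.strip()

    def on_chunk(self, chunk):
        self.chunks.append(chunk)

    def body(self):
        return b"".join(self.chunks)


def build_streaming_request(url, collector):
    # Imported lazily so StreamCollector can be tried without Tornado installed.
    from tornado.httpclient import HTTPRequest

    return HTTPRequest(url, header_callback=collector.on_header,
                       streaming_callback=collector.on_chunk)
```

After the fetch completes, collector.headers and collector.body() hold what HTTPResponse.headers and HTTPResponse.body would otherwise have contained.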

Response objects

class tornado.httpclient.HTTPResponse(request, code, headers=None, buffer=None, effective_url=None, error=None, request_time=None, time_info=None, reason=None)

HTTP Response object.

Attributes:

- request: HTTPRequest object
- code: numeric HTTP status code, e.g. 200 or 404
- reason: human-readable reason phrase describing the status code
- headers: tornado.httputil.HTTPHeaders object
- effective_url: final location of the resource after following any redirects
- buffer: cStringIO object for response body
- body: response body as string (created on demand from self.buffer)
- error: Exception object, if any
- request_time: seconds from request start to finish
- time_info: dictionary of diagnostic timing information from the request. Available data are subject to change, but currently uses timings available from http://curl.haxx.se/libcurl/c/curl_easy_getinfo.html, plus queue, which is the delay (if any) introduced by waiting for a slot under AsyncHTTPClient's max_clients setting.

rethrow(): If there was an error on the request, raise an HTTPError.

Exceptions

exception tornado.httpclient.HTTPError(code, message=None, response=None): Exception thrown for an unsuccessful HTTP request.

Attributes:

- code - HTTP error integer error code, e.g. 404. Error code 599 is used when no HTTP response was received, e.g. for a timeout.
- response - HTTPResponse object, if any.

Note that if follow_redirects is False, redirects become HTTPErrors, and you can look at error.response.headers['Location'] to see the destination of the redirect.
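The redirect behaviour described in the note can be sketched as a small helper that extracts the destination from the exception. The names redirect_target and first_hop are hypothetical, and the coroutine assumes a Tornado and Python version in which async def can await the Future returned by fetch().

```python
def redirect_target(error):
    """Return the Location header carried by a redirect HTTPError, else None."""
    response = getattr(error, "response", None)
    # Code 599 (no response received) and non-3xx errors are not redirects.
    if response is None or not 300 <= error.code < 400:
        return None
    return response.headers.get("Location")


async def first_hop(url):
    # Imported lazily so redirect_target() can be used without Tornado installed.
    from tornado.httpclient import AsyncHTTPClient, HTTPError

    try:
        await AsyncHTTPClient().fetch(url, follow_redirects=False)
    except HTTPError as e:
        target = redirect_target(e)
        if target is None:
            raise  # a genuine error rather than a redirect
        return target
    return None  # the URL answered directly, without redirecting
```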

Command-line interface

This module provides a simple command-line interface to fetch a URL using Tornado’s HTTP client. Example usage:

# Fetch the url and print its body
python -m tornado.httpclient http://www.google.com
 
# Just print the headers
python -m tornado.httpclient --print_headers --print_body=false http://www.google.com

Implementations

class tornado.simple_httpclient.SimpleAsyncHTTPClient

Non-blocking HTTP client with no external dependencies.

This class implements an HTTP 1.1 client on top of Tornado’s IOStreams. Some features found in the curl-based AsyncHTTPClient are not yet supported. In particular, proxies are not supported, connections are not reused, and callers cannot select the network interface to be used.

initialize(io_loop, max_clients=10, hostname_mapping=None, max_buffer_size=104857600, resolver=None, defaults=None, max_header_size=None, max_body_size=None): Creates an AsyncHTTPClient.

Only a single AsyncHTTPClient instance exists per IOLoop in order to provide limitations on the number of pending connections. force_instance=True may be used to suppress this behavior.

Note that because of this implicit reuse, unless force_instance is used, only the first call to the constructor actually uses its arguments. It is recommended to use the configure method instead of the constructor to ensure that arguments take effect.

max_clients is the number of concurrent requests that can be in progress; when this limit is reached additional requests will be queued. Note that time spent waiting in this queue still counts against the request_timeout.

hostname_mapping is a dictionary mapping hostnames to IP addresses. It can be used to make local DNS changes when modifying system-wide settings like /etc/hosts is not possible or desirable (e.g. in unittests).

max_buffer_size (default 100MB) is the number of bytes that can be read into memory at once. max_body_size (defaults to max_buffer_size) is the largest response body that the client will accept. Without a streaming_callback, the smaller of these two limits applies; with a streaming_callback only max_body_size does.

Changed in version 4.2: Added the max_body_size argument.
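The interaction between max_buffer_size and max_body_size can be written down as a small function. This is only a model of the rule as stated above, not Tornado's own code; effective_body_limit and configure_simple_client are hypothetical names. The second helper shows how such arguments might be passed through configure so that they take effect regardless of instance reuse.

```python
DEFAULT_MAX_BUFFER_SIZE = 104857600  # 100MB, the default in the signature above


def effective_body_limit(max_buffer_size=DEFAULT_MAX_BUFFER_SIZE,
                         max_body_size=None, streaming=False):
    """Largest accepted response body under the documented rule."""
    if max_body_size is None:
        max_body_size = max_buffer_size  # documented default
    if streaming:
        # With a streaming_callback only max_body_size applies.
        return max_body_size
    # Otherwise the smaller of the two limits applies.
    return min(max_buffer_size, max_body_size)


def configure_simple_client(max_clients=10, hostname_mapping=None,
                            max_body_size=None):
    # Imported lazily so effective_body_limit() works without Tornado installed.
    from tornado.httpclient import AsyncHTTPClient

    # None selects the default implementation, SimpleAsyncHTTPClient.
    AsyncHTTPClient.configure(None, max_clients=max_clients,
                              hostname_mapping=hostname_mapping,
                              max_body_size=max_body_size)
```

A test suite might call configure_simple_client(hostname_mapping={"api.test": "127.0.0.1"}), where the hostname is a made-up example, to redirect lookups without editing /etc/hosts.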

class tornado.curl_httpclient.CurlAsyncHTTPClient(io_loop, max_clients=10, defaults=None): libcurl-based HTTP client.

Original text:

https://tornado-zh-cn.readthedocs.io/zh_CN/latest/httpclient.html