urllib3 is a powerful, user-friendly HTTP client for Python. It offers considerably more than urllib2, the HTTP client in the Python 2 standard library (in Python 3, urllib2 was split into urllib.request and urllib.error). Despite the similar name, urllib3 is a separate third-party package, and it provides many features that the standard library's HTTP client lacks. Here are some of the main features of urllib3:
Connection Pooling
urllib3 uses connection pooling to reuse connections to a host, which improves the efficiency of network operations by reducing the number of connections that need to be opened and subsequently closed.
import urllib3
http = urllib3.PoolManager()
r = http.request('GET', 'http://httpbin.org/robots.txt')
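To make connection reuse visible, here is a minimal sketch that talks to a throwaway local server instead of httpbin.org. The handler class, the local_server helper, and count_connections are hypothetical names introduced for illustration; num_connections and num_requests are counters that urllib3's connection pool keeps.

```python
import contextlib
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3


class KeepAliveHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that keeps sockets open between requests."""

    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


@contextlib.contextmanager
def local_server(handler):
    """Run a local HTTP server in a background thread (illustration only)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        yield f"http://127.0.0.1:{server.server_port}"
    finally:
        server.shutdown()
        server.server_close()


def count_connections(n_requests=3):
    """Send several requests and report (connections opened, requests sent)."""
    with local_server(KeepAliveHandler) as base_url:
        # num_pools caps how many per-host pools are cached;
        # maxsize caps how many idle connections each pool keeps.
        http = urllib3.PoolManager(num_pools=10, maxsize=2)
        try:
            for _ in range(n_requests):
                http.request("GET", base_url + "/")
            pool = http.connection_from_url(base_url)
            return pool.num_connections, pool.num_requests
        finally:
            http.clear()  # close pooled connections before the server stops
```

With a server that honors keep-alive, sequential requests should share a single connection, so the first number stays at 1 while the second grows with each request.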
Thread Safety
Connection pools in urllib3 are thread-safe, allowing you to use the same PoolManager or ConnectionPool across threads without any additional locking.
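As a rough illustration of sharing one pool across threads, the sketch below fans requests out through a ThreadPoolExecutor. The local echo server, EchoPathHandler, and fetch_paths are assumptions made for the example, not part of urllib3.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3


class EchoPathHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that replies with the requested path."""

    def do_GET(self):
        body = self.path.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def fetch_paths(paths, workers=4):
    """Fetch every path concurrently through one shared PoolManager."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPathHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = f"http://127.0.0.1:{server.server_port}"
    # One PoolManager for all threads; no extra locking in user code.
    http = urllib3.PoolManager(maxsize=workers)
    try:
        with ThreadPoolExecutor(max_workers=workers) as executor:
            responses = executor.map(
                lambda path: http.request("GET", base_url + path), paths
            )
            return [r.data.decode("utf-8") for r in responses]
    finally:
        http.clear()
        server.shutdown()
        server.server_close()
```

Setting maxsize to the number of worker threads is a design choice rather than a requirement: it lets each thread keep its connection in the pool instead of discarding it after use.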
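Blocking I/O
urllib3 is a synchronous library: each request blocks the calling thread until the response is available. It does not provide a native asyncio interface, so asynchronous applications typically run urllib3 calls in worker threads or use a dedicated asynchronous HTTP client instead.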
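One common way to call urllib3 from an asyncio application is to hand the blocking request to a worker thread. The sketch below assumes Python 3.9+ (for asyncio.to_thread) and uses a hypothetical local server and helper names for demonstration.

```python
import asyncio
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that always answers 'hello'."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


async def fetch(http, url):
    # http.request() blocks, so run it in a worker thread and await the result;
    # the event loop stays free to run other tasks in the meantime.
    response = await asyncio.to_thread(http.request, "GET", url)
    return response.status, response.data


def demo():
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    http = urllib3.PoolManager()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        return asyncio.run(fetch(http, url))
    finally:
        http.clear()
        server.shutdown()
        server.server_close()
```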
SSL/TLS Verification
urllib3 verifies the server's TLS certificate for HTTPS requests, so you know you are talking to the intended host. Recent releases do this by default; you can also specify your own CA certificates.
http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED',
    ca_certs='/path/to/your/certificate_bundle'
)
r = http.request('GET', 'https://example.com/')
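If you need finer control than cert_reqs and ca_certs give you, another option is to build an ssl.SSLContext yourself and pass it in. This is a minimal sketch; make_verified_pool is a hypothetical helper, and the TLS 1.2 floor is just an example policy.

```python
import ssl

import urllib3


def make_verified_pool(ca_bundle=None):
    """Return a PoolManager that verifies certificates via a custom context.

    ca_bundle is an optional path to a CA file; None uses the system defaults.
    """
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context(cafile=ca_bundle)
    # Example policy (an assumption for this sketch): refuse anything below TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return urllib3.PoolManager(ssl_context=context)
```

A failed verification surfaces as an exception from http.request(), so calling code can catch urllib3.exceptions.HTTPError (the base class) around the request.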
Client-Side SSL/TLS Support
urllib3 can also handle client-side SSL/TLS by allowing you to specify your own certificates.
http = urllib3.PoolManager(
    key_file='/path/to/key.pem',
    cert_file='/path/to/cert.pem'
)
Automatic Content Decoding
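urllib3 automatically decodes gzip- and deflate-compressed response bodies when the server marks them with the Content-Encoding header. Other encodings such as Brotli are handled when the corresponding optional packages are installed.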
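The sketch below shows the difference between the default decoded body and the raw bytes on the wire. It uses a hypothetical local server that always gzips its reply; GzipHandler, PAYLOAD, and fetch_both are names made up for the example.

```python
import gzip
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3

PAYLOAD = b"hello, compressed world " * 20


class GzipHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that always sends a gzip-compressed body."""

    def do_GET(self):
        body = gzip.compress(PAYLOAD)
        self.send_response(200)
        self.send_header("Content-Encoding", "gzip")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def fetch_both():
    """Return (decoded body, raw body) for the same resource."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), GzipHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    http = urllib3.PoolManager()
    try:
        # Default: the body is decompressed according to Content-Encoding.
        decoded = http.request("GET", url).data
        # decode_content=False returns the bytes exactly as sent.
        raw = http.request("GET", url, decode_content=False).data
        return decoded, raw
    finally:
        http.clear()
        server.shutdown()
        server.server_close()
```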
Retry Logic
urllib3 can automatically retry idempotent requests after intermittent failures; the behavior is configurable through the Retry class.
retries = urllib3.Retry(connect=5, read=2, redirect=5)
http = urllib3.PoolManager(retries=retries)
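Retry can also react to specific HTTP status codes. The following sketch retries on 503 against a hypothetical local server that fails twice before succeeding; FlakyHandler and fetch_with_retries are illustrative names, and backoff_factor=0 is chosen only so the example runs without pauses.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3


class FlakyHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: two 503 responses, then a 200."""

    calls = 0  # shared counter across requests

    def do_GET(self):
        FlakyHandler.calls += 1
        failing = FlakyHandler.calls <= 2
        body = b"" if failing else b"finally"
        self.send_response(503 if failing else 200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def fetch_with_retries():
    """Return (status, body, number of requests the server saw)."""
    FlakyHandler.calls = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), FlakyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Retry up to 3 times in total, treating 503 as retryable.
    retries = urllib3.Retry(total=3, status_forcelist=[503], backoff_factor=0)
    http = urllib3.PoolManager(retries=retries)
    try:
        r = http.request("GET", f"http://127.0.0.1:{server.server_port}/")
        return r.status, r.data, FlakyHandler.calls
    finally:
        http.clear()
        server.shutdown()
        server.server_close()
```

In real use a non-zero backoff_factor is usually preferable, so that repeated attempts are spaced out rather than sent back to back.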
Redirect Handling
urllib3 can automatically follow redirects, or it can be configured to handle redirects manually.
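To contrast the two modes, this sketch requests the same URL once with the default behavior and once with redirect=False. The local server and its /old and /new paths are assumptions for the example.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: /old redirects to /new."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        body = b"arrived"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def compare_redirect_modes():
    """Return (followed status, manual status, manual Location target)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/old"
    http = urllib3.PoolManager()
    try:
        followed = http.request("GET", url)                 # follows the 302
        manual = http.request("GET", url, redirect=False)   # returns the 302
        return followed.status, manual.status, manual.get_redirect_location()
    finally:
        http.clear()
        server.shutdown()
        server.server_close()
```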
Support for Chunked Requests
urllib3 supports chunked transfer encoding for both requests and responses, allowing for streaming uploads and downloads.
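On the download side, streaming means asking urllib3 not to preload the body and then reading it piece by piece. A minimal sketch, assuming a hypothetical local server that returns a 256 KiB body:

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3

BODY_SIZE = 256 * 1024


class BigBodyHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that returns BODY_SIZE bytes."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(BODY_SIZE))
        self.end_headers()
        self.wfile.write(b"x" * BODY_SIZE)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def stream_download(chunk_size=16 * 1024):
    """Read the response in chunks and return the total number of bytes."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), BigBodyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    http = urllib3.PoolManager()
    try:
        # preload_content=False defers reading the body until we ask for it.
        r = http.request(
            "GET",
            f"http://127.0.0.1:{server.server_port}/",
            preload_content=False,
        )
        total = 0
        for chunk in r.stream(chunk_size):
            total += len(chunk)  # a real program would write the chunk to disk
        r.release_conn()  # hand the connection back to the pool
        return total
    finally:
        http.clear()
        server.shutdown()
        server.server_close()
```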
HTTP and SOCKS Proxy Support
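urllib3 can send requests through HTTP and SOCKS proxies. HTTP proxies are handled by ProxyManager (the proxy_from_url function is a shortcut that returns one), while SOCKS proxies are handled by SOCKSProxyManager in urllib3.contrib.socks, which requires the optional PySocks dependency.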
http = urllib3.ProxyManager('http://localhost:8080/')
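If the proxy requires authentication, one approach is to generate the Proxy-Authorization header with make_headers and pass it as proxy_headers. The helper name and the credentials below are placeholders, and no connection is made until a request is sent.

```python
import urllib3


def make_proxy_manager(proxy_url="http://localhost:8080/", credentials=None):
    """Build a ProxyManager, optionally with basic-auth proxy credentials.

    credentials is a "user:password" string (placeholder values in this sketch).
    """
    proxy_headers = None
    if credentials:
        # Produces a Proxy-Authorization header with Basic credentials.
        proxy_headers = urllib3.make_headers(proxy_basic_auth=credentials)
    return urllib3.ProxyManager(proxy_url, proxy_headers=proxy_headers)


# Usage is the same as with PoolManager, for example:
#   proxy = make_proxy_manager(credentials="user:secret")
#   r = proxy.request("GET", "http://httpbin.org/get")
```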
Headers, Query Parameters, and Form Fields
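urllib3 makes it easy to add HTTP headers and query parameters and to send form data with your requests. For GET, HEAD, and DELETE requests, fields are encoded into the query string; for other methods, they are encoded into the request body.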
r = http.request(
    'GET',
    'http://httpbin.org/get',
    fields={'hello': 'world'},
    headers={'X-Something': 'value'}
)
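To see where fields end up for different methods, the sketch below sends the same fields with GET and with POST to a hypothetical local echo server. EchoHandler and send_fields are illustrative names; encode_multipart=False asks urllib3 for URL-encoded form data instead of multipart.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that reports what it received as JSON."""

    def _reply(self, body_text=""):
        payload = json.dumps({
            "path": self.path,
            "content_type": self.headers.get("Content-Type"),
            "x_something": self.headers.get("X-Something"),
            "body": body_text,
        }).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        self._reply()

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self._reply(self.rfile.read(length).decode("utf-8"))

    def log_message(self, format, *args):
        pass  # silence per-request logging


def send_fields():
    """Return the server's view of a GET and a POST carrying the same fields."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = f"http://127.0.0.1:{server.server_port}"
    http = urllib3.PoolManager()
    try:
        # GET: fields become the query string.
        get = http.request("GET", base_url + "/get",
                           fields={"hello": "world"},
                           headers={"X-Something": "value"})
        # POST: fields become a URL-encoded request body.
        post = http.request("POST", base_url + "/post",
                            fields={"hello": "world"},
                            encode_multipart=False)
        return json.loads(get.data), json.loads(post.data)
    finally:
        http.clear()
        server.shutdown()
        server.server_close()
```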
Streaming and Large File Uploads
urllib3 supports streaming uploads, which is great for large files because they don’t need to be loaded into memory.
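One way to stream an upload is to pass an open file object as the body together with an explicit Content-Length, so the file is sent in blocks rather than read into memory at once. This is a sketch under those assumptions; the counting server, upload_stream, and demo are hypothetical names.

```python
import tempfile
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that replies with the number of bytes received."""

    def do_POST(self):
        remaining = int(self.headers["Content-Length"])
        received = 0
        while remaining > 0:
            chunk = self.rfile.read(min(65536, remaining))
            if not chunk:
                break
            received += len(chunk)
            remaining -= len(chunk)
        body = str(received).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def upload_stream(fileobj, size):
    """POST a binary file object and return the byte count the server saw."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    http = urllib3.PoolManager()
    try:
        r = http.request(
            "POST",
            f"http://127.0.0.1:{server.server_port}/upload",
            body=fileobj,  # read block by block while sending
            headers={"Content-Length": str(size)},
        )
        return int(r.data.decode("ascii"))
    finally:
        http.clear()
        server.shutdown()
        server.server_close()


def demo(size=300_000):
    """Write `size` bytes to a temporary file and upload it."""
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * size)
        f.seek(0)
        return upload_stream(f, size)
```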
JSON Content
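urllib3 1.x has no dedicated JSON support, but it is easy to send JSON requests with the standard json module. (urllib3 2.x additionally accepts a json argument to request() and provides a json() method on responses.)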
import json
http = urllib3.PoolManager()
encoded_data = json.dumps({"attribute": "value"}).encode('utf-8')
r = http.request(
    'POST',
    'http://httpbin.org/post',
    body=encoded_data,
    headers={'Content-Type': 'application/json'}
)
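The response side works the same way in reverse: decode r.data and parse it with json.loads. Here is a round-trip sketch against a hypothetical local server that echoes the JSON it receives; JsonEchoHandler and post_json are illustrative names.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import urllib3


class JsonEchoHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler that sends the request body straight back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def post_json(payload):
    """Send payload as JSON and return the parsed JSON response."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), JsonEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    http = urllib3.PoolManager()
    try:
        r = http.request(
            "POST",
            f"http://127.0.0.1:{server.server_port}/post",
            body=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        # r.data is bytes; decode it before parsing.
        return json.loads(r.data.decode("utf-8"))
    finally:
        http.clear()
        server.shutdown()
        server.server_close()
```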
Extensibility
urllib3 is designed with extensibility in mind, allowing you to implement custom connection types, request/response handling, etc.
These features make urllib3 a comprehensive and flexible solution for HTTP networking in Python. However, it's important to note that urllib3 is a lower-level library that leaves more of the work to you, such as encoding request bodies, decoding response bodies to text, and handling errors. For a higher-level HTTP client that abstracts these tasks, you might consider using requests, which is built on top of urllib3.