Performance¶
Benchmarks¶
Measured against aiohttp and httpx on a local aiohttp loopback server
(no network, no DNS, no TLS), 2 000 requests per (client, body size),
concurrency 64. Python 3.12, macOS arm64, uvloop enabled.
| Body size | Client | RPS | P50 (ms) | P95 (ms) | P99 (ms) |
|---|---|---|---|---|---|
| 200 B | hyperhttp | 3 699 | 11.1 | 12.7 | 31.3 |
| 200 B | aiohttp | 3 904 | 11.5 | 14.7 | 28.5 |
| 200 B | httpx | 199 | 210.4 | 948.9 | 1452.9 |
| 10 KiB | hyperhttp | 3 527 | 11.8 | 13.7 | 32.2 |
| 10 KiB | aiohttp | 3 794 | 11.9 | 15.1 | 29.1 |
| 10 KiB | httpx | 203 | 204.8 | 902.1 | 1413.2 |
| 1 MiB | hyperhttp | 1 303 | 42.7 | 46.5 | 71.2 |
| 1 MiB | aiohttp | 1 317 | 43.0 | 47.6 | 61.5 |
| 1 MiB | httpx | 98 | 345.1 | 1465.9 | 2784.2 |
HyperHTTP runs within ~5% of aiohttp across every body size and is 15–20×
faster than httpx on this workload. Loopback numbers reflect client-side CPU
cost; real networks flatten the differences.
Run it yourself:
Tips for squeezing more out of it¶
1. Install the speed extras¶
This pulls in uvloop, orjson, h11, brotli, and zstandard. Each is
auto-detected at import time.
2. Turn on uvloop¶
This is a no-op if uvloop isn't installed (e.g. on Windows).
3. Reuse a single Client¶
The Client owns the connection pool, DNS cache, and retry handler.
Creating one per request throws all of that away.
# App startup
client = hyperhttp.Client()
# Per-request
response = await client.get(url)
# App shutdown
await client.aclose()
4. Size the pool for your concurrency¶
client = hyperhttp.Client(
max_connections=200,
max_keepalive_connections=32,
keepalive_expiry=120.0,
)
A good default: max_connections >= peak_concurrency,
max_keepalive_connections >= peak_concurrency_per_host.
5. Let HTTP/2 do the multiplexing¶
When a host supports HTTP/2 (most public APIs, CDNs, and load balancers do),
every concurrent request reuses a single TCP connection instead of burning
pool slots. Leave http2=True (the default).
6. Stream large bodies instead of aread()-ing them¶
async with await client.get(url) as response:
async for chunk in response.aiter_bytes():
process(chunk)
aread() is optimised (zero-copy single-chunk fast path; b"".join for
multi-chunk), but streaming avoids holding a full copy of the body in memory,
which matters for large files and high concurrency.
7. Run concurrently¶
With asyncio.gather (or a bounded-concurrency semaphore for larger batches),
HyperHTTP will saturate the pool and multiplex HTTP/2 streams where possible.
8. Use decorrelated-jitter backoff if you retry¶
from hyperhttp.errors.retry import RetryPolicy
from hyperhttp.utils.backoff import DecorrelatedJitterBackoff
retry_policy = RetryPolicy(
max_retries=3,
backoff_strategy=DecorrelatedJitterBackoff(base=0.1, max_backoff=10.0),
)
This spreads retries across clients and avoids thundering-herd pile-ups.
9. Put a circuit breaker in front of flaky dependencies¶
from hyperhttp.errors.circuit_breaker import DomainCircuitBreakerManager
client = hyperhttp.Client(
circuit_breaker_manager=DomainCircuitBreakerManager(
failure_threshold=5,
recovery_timeout=30.0,
),
)
When the breaker trips, calls fail fast with CircuitBreakerOpen instead of
piling more work onto a sick server.
Checklist¶
- [ ]
pip install 'hyperhttp[speed]' - [ ]
hyperhttp.install_uvloop()at startup - [ ] One
Clientfor the app lifetime - [ ] Pool sized for peak concurrency
- [ ]
http2=True(the default) for HTTP/2-capable hosts - [ ]
response.aiter_bytes()for large downloads - [ ]
asyncio.gatheror a semaphore for parallelism - [ ] Retry + circuit breaker for any unreliable upstream