
Benchmarking Options

Fine-tune your benchmark with advanced options

This guide covers advanced options for controlling how burl runs benchmarks.

Rate Limiting

Use -q or --qps to limit requests per second:

# Maximum 100 requests per second
burl https://api.example.com -q 100 -c 10 -d 60s

This is useful for:

  • Testing at production traffic levels
  • Avoiding overwhelming the target server
  • Simulating realistic user behavior
  • Respecting rate limits

The QPS limit is distributed across all connections. With -q 100 -c 10, each connection averages 10 requests/second.
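
For example, these two runs apply the same total cap while spreading it across different connection counts (both flags are documented above):

# 100 req/s over 10 connections: ~10 req/s per connection
burl https://api.example.com -q 100 -c 10 -d 60s

# 100 req/s over 20 connections: ~5 req/s per connection
burl https://api.example.com -q 100 -c 20 -d 60s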

HTTP Version Control

By default, burl uses HTTP/2 when available. You can force a specific version:

Force HTTP/1.1

burl https://api.example.com --http1

Use this when:

  • Testing HTTP/1.1 specific behavior
  • The server doesn't support HTTP/2
  • Comparing protocol performance (see the comparison example below)

Force HTTP/2

burl https://api.example.com --http2
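
To compare protocol performance (the third use case above), one approach is to run the same workload once per version and compare the reports:

# Same workload, HTTP/1.1 vs HTTP/2
burl https://api.example.com --http1 -c 50 -d 60s
burl https://api.example.com --http2 -c 50 -d 60s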

HTTP/3 (Experimental)

burl https://api.example.com --http3

HTTP/3 support is experimental and may not work with all servers.

Latency Correction

Enable coordinated omission correction with --latency-correction:

burl https://api.example.com --latency-correction

What is Coordinated Omission?

When a server slows down, traditional benchmarking tools wait for each response before sending the next request. Fewer samples are collected during exactly the periods when latency is worst, so slow periods are underrepresented in the statistics.

With latency correction enabled, burl accounts for this by including the queuing time in latency calculations, giving a more accurate picture of user-perceived latency.
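
A worked example makes the difference concrete. The numbers below are hypothetical and the arithmetic is plain shell, not burl output:

# A request was scheduled to go out at t=0ms, but the previous response
# stalled, so it was actually sent at t=2000ms and completed at t=2100ms.
scheduled_send=0; actual_send=2000; completion=2100

# Uncorrected: measured from when the request actually left the client
echo "uncorrected: $(( completion - actual_send ))ms"      # 100ms

# Corrected: measured from when the request should have been sent,
# so the 2000ms spent queued behind the stall is included
echo "corrected:   $(( completion - scheduled_send ))ms"   # 2100ms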

When to Use It

  • Testing latency-sensitive applications
  • When you care about tail latencies (P99, P99.9)
  • For accurate SLA compliance testing
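
Correction is measured against an intended send schedule, so it is most informative alongside a fixed request rate. Assuming burl behaves like other correcting tools (e.g. wrk2) in this respect, a typical invocation looks like:

# Fixed 200 req/s schedule; time spent queued behind slow responses
# counts toward latency instead of being silently dropped
burl https://api.example.com -q 200 -d 60s --latency-correction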

Warmup Requests

The -w or --warmup flag sends requests that aren't counted in statistics:

# 50 warmup requests
burl https://api.example.com -w 50 -d 30s

Why Warmup?

  1. Connection Establishment: HTTP connections take time to establish. Warmup ensures connections are ready.
  2. JIT Compilation: Many servers use JIT compilation. Initial requests may be slower while code compiles.
  3. Cache Warming: First requests may hit cold caches on the server.
  4. Realistic Measurements: After warmup, measurements better reflect steady-state performance.

Scenario             Recommended Warmup
Quick test           5-10 requests
Standard benchmark   50-100 requests
Production testing   100-500 requests
Cold start testing   0 (no warmup)
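
For example, to contrast cold-start behavior with steady-state performance, run the same benchmark with and without warmup:

# Cold start: connection setup and cold caches count toward the stats
burl https://api.example.com -w 0 -d 30s

# Steady state: 100 warmup requests are sent first and excluded
burl https://api.example.com -w 100 -d 30s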

Combining Options

For realistic production testing, combine multiple options:

# -c 100: 100 concurrent users
# -d 5m: run for 5 minutes
# -q 500: limit to 500 req/s
# -w 200: 200 warmup requests
# --latency-correction: accurate tail latencies
# --http2: force HTTP/2
burl https://api.example.com/users \
  -c 100 \
  -d 5m \
  -q 500 \
  -w 200 \
  --latency-correction \
  --http2

Understanding Results

After running with these options, pay attention to:

Metric         What to Look For
Requests/sec   Should be close to your QPS limit, if one is set
P99 Latency    Critical for SLA compliance
Error Rate     Should be 0% under normal load
Throughput     Bytes/second transferred

Interpreting Latency Distribution

Latency
  P50:    12.4ms   ← Median - typical response time
  P90:    32.1ms   ← 90th percentile
  P95:    45.2ms   ← 95th percentile
  P99:    89.3ms   ← 99th percentile - tail latency

A large gap between P50 and P99 indicates inconsistent performance:

  • Healthy: P99 is 2-3x P50
  • Concerning: P99 is 5-10x P50
  • Problem: P99 is >10x P50
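
Applying that rule of thumb to the sample distribution above (plain shell arithmetic, not a burl feature):

# P50 = 12.4ms and P99 = 89.3ms, from the sample output above
awk 'BEGIN { printf "P99/P50 ratio: %.1fx\n", 89.3 / 12.4 }'
# Prints 7.2x, which falls in the "concerning" 5-10x range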