Latency Correction

Coordinated omission and accurate latency measurement

Understanding and using latency correction for accurate benchmarking.

The Problem: Coordinated Omission

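Many traditional benchmarking tools share a fundamental measurement flaw called Coordinated Omission: the tool's sending schedule ends up coordinated with the server's slowness, so the slowest periods produce the fewest samples.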

How It Happens

A benchmark tool sends requests at a target rate (e.g., 100/second)
When the server slows down, the tool blocks while it waits for outstanding responses
While it waits, it sends fewer requests than the target rate calls for
Slow periods therefore contribute fewer samples and are underrepresented in the statistics

Example

Imagine a server that:

Responds in 10ms normally
Has a 1-second pause every 100 requests

Without correction:

Tool stalls during each pause instead of sending on schedule
Records one slow sample per pause; every other sample is a 10ms response
Reported P99: ~10ms (wrong!)

With correction:

Accounts for requests that "should have been sent"
Includes queuing time in measurements
Reported P99: ~1000ms (correct!)
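
The example can be reproduced with a small simulation. The sketch below is not burl's implementation. It assumes a single connection, a 20ms target interval (50 requests/second, chosen so the backlog drains between pauses), 1000 requests, and a nearest-rank percentile; all variable and function names are hypothetical.

```bash
#!/bin/bash
# Simulate one connection talking to a server that answers in 10ms
# but takes 1000ms on every 100th request.
# Assumed parameters (not taken from burl): 20ms target interval, 1000 requests.
interval_ms=20
service_ms=10
stall_ms=1000
total=1000

uncorrected=""
corrected=""
prev_end=0
i=0
while [ "$i" -lt "$total" ]; do
  intended=$(( $i * $interval_ms ))   # when the request should start
  start=$intended
  # A blocked sender cannot start before the previous response arrives.
  if [ "$prev_end" -gt "$start" ]; then start=$prev_end; fi
  if [ $(( $i % 100 )) -eq 99 ]; then
    end=$(( $start + $stall_ms ))
  else
    end=$(( $start + $service_ms ))
  fi
  uncorrected="$uncorrected$(( $end - $start )) "   # timer starts at the actual send
  corrected="$corrected$(( $end - $intended )) "    # timer starts at the intended send
  prev_end=$end
  i=$(( $i + 1 ))
done

# Nearest-rank percentile: the ceil(p/100 * n)-th smallest of the given values.
percentile() {
  p=$1; shift
  rank=$(( ($p * $# + 99) / 100 ))
  printf '%s\n' "$@" | sort -n | sed -n "${rank}p"
}

# Word splitting of the space-separated lists is intentional here.
p99_uncorrected=$(percentile 99 $uncorrected)
p99_corrected=$(percentile 99 $corrected)
echo "Uncorrected P99: ${p99_uncorrected}ms"
echo "Corrected P99: ${p99_corrected}ms"
```

Under these assumptions the script prints an uncorrected P99 of 10ms and a corrected P99 of 990ms. Only 10 of the 1000 uncorrected samples are slow, while the corrected series also counts every request that queued behind a pause.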

Enabling Latency Correction

Use the --latency-correction flag:

burl https://api.example.com --latency-correction

How It Works

With latency correction enabled, burl:

Maintains expected request timing based on target rate
Tracks when requests "should" have started
Includes queuing/waiting time in latency calculations
Produces latencies that reflect user-perceived delays
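
The bookkeeping in the steps above can be sketched as follows. This is an illustration under assumptions, not burl's source: the helpers `intended_start_ms` and `corrected_latency_ms` are hypothetical, and times are integer milliseconds measured from the start of the run.

```bash
#!/bin/bash
# Hypothetical helpers; times are integer milliseconds since the run began.

# Intended start of request number $1 (0-based) at a target rate of $2 requests/second.
intended_start_ms() {
  echo $(( $1 * 1000 / $2 ))
}

# Corrected latency: response time measured from the intended start,
# so any delay before the request was actually sent is included.
# $1 = request number, $2 = target rate, $3 = time the response arrived
corrected_latency_ms() {
  echo $(( $3 - $(intended_start_ms "$1" "$2") ))
}

# Request 50 at 100 requests/second should start at 500ms.
# If the sender was blocked and the response arrived at 910ms,
# the corrected latency is 410ms even if the request itself took only 10ms.
intended_start_ms 50 100
corrected_latency_ms 50 100 910
```

The key design choice is that the clock starts from the schedule rather than from the moment of sending, which is what lets queuing time show up in the result.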

When to Use It

Use Latency Correction For:

SLA compliance testing: When you need accurate tail latencies
Capacity planning: Understanding real-world performance
Performance comparisons: Fair comparisons between systems
Production readiness: Validating before go-live

Skip Latency Correction For:

Quick sanity checks: When you just need rough numbers
Debugging: When investigating specific issues
Low-load testing: When there's no queuing

Comparing Results

See the difference correction makes:

#!/bin/bash
echo "Without latency correction:"
burl https://api.example.com -c 100 -d 30s -f json | jq '.latency_ms | {p50, p99}'

echo -e "\nWith latency correction:"
burl https://api.example.com -c 100 -d 30s --latency-correction -f json | jq '.latency_ms | {p50, p99}'

Typical results showing the difference:

Without latency correction:
{
  "p50": 45.2,
  "p99": 156.8
}

With latency correction:
{
  "p50": 48.3,
  "p99": 892.4
}

The corrected P99 is much higher because it includes the time requests spent waiting to be sent. The P50 barely moves because most requests never queue.
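
To put a number on the gap, the ratio of the two P99 values can be computed directly. A minimal sketch using the sample figures above; `p99_inflation` is a hypothetical helper, and `awk` is used because shell arithmetic is integer-only.

```bash
#!/bin/bash
# Hypothetical helper: how many times larger the corrected P99 is.
# $1 = uncorrected P99 in ms, $2 = corrected P99 in ms
p99_inflation() {
  awk -v u="$1" -v c="$2" 'BEGIN { printf "%.1f\n", c / u }'
}

# Sample figures from the run above.
p99_inflation 156.8 892.4
```

For the sample figures this prints 5.7, meaning the uncorrected run understated the tail latency by more than a factor of five.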

Understanding the Output

Without Correction

Measures: Time from request sent to response received
Missing: Time request spent waiting to be sent
Result: Optimistic latencies that don't reflect user experience

With Correction

Measures: Time from when request should have been sent to response received
Includes: Queuing delay when server can't keep up
Result: Realistic latencies matching user experience

Visualization

Timeline without correction:

Request 1: |--10ms--|
Request 2: |--10ms--|
Request 3: |-------------500ms pause-------------|--10ms--|
Request 4: |--10ms--|

Measured latencies: 10, 10, 10, 10 ms (the 500ms pause delays the sending of request 3, so the timer never sees it)
Reported P99: ~10ms


Timeline with correction:

Request 1: |--10ms--|
Request 2: |--10ms--|
Request 3 (queued): |----wait 300ms----|-------------500ms pause-------------|--10ms--|
Request 4 (queued): |--------wait 500ms--------|-------------500ms pause-------------|--10ms--|

Measured latencies: 10, 10, 810, 1010 ms
Reported P99: ~1000ms

Best Practices

1. Always Use for Production Testing

# Production readiness test
burl https://api.production.com \
  -c 100 \
  -d 5m \
  --latency-correction \
  --llm json \
  -o production_test.json

2. Compare With and Without

Record both runs for a complete picture:

# Run both
burl https://api.example.com -c 100 -d 60s -f json -o uncorrected.json
burl https://api.example.com -c 100 -d 60s --latency-correction -f json -o corrected.json

# Compare
echo "Uncorrected P99: $(jq '.latency_ms.p99' uncorrected.json)ms"
echo "Corrected P99: $(jq '.latency_ms.p99' corrected.json)ms"
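
For SLA checks it can help to turn the corrected P99 into a pass/fail result. A sketch, assuming a hypothetical helper `check_p99` and a 500ms limit chosen only for illustration; in practice the measured value would come from `jq '.latency_ms.p99' corrected.json` as in the snippet above.

```bash
#!/bin/bash
# Hypothetical helper: succeed if the P99 is within the limit.
# $1 = measured P99 in ms (may be fractional), $2 = limit in ms
check_p99() {
  awk -v v="$1" -v limit="$2" 'BEGIN { exit !(v <= limit) }'
}

# Example values; the 500ms limit is an assumption for illustration.
if check_p99 892.4 500; then
  echo "P99 within limit"
else
  echo "P99 exceeds limit"
fi
```

Because the helper reports through its exit status, it can gate a CI step directly without parsing any output.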

3. Use with Rate Limiting

Latency correction is most meaningful with a target rate, because the expected send times are derived from it:

# Test at specific QPS with accurate latencies
burl https://api.example.com \
  -q 1000 \
  -c 100 \
  -d 60s \
  --latency-correction