Idempotency and retries

This guide explains when kneo-client injects idempotency keys, how 409 mismatches surface, what the retry policy is, and how to customize either piece for non-default deployments.

What "idempotency and retries" mean in kneo-client

The Kneo Agent Platform's operational endpoints are designed for safe retry. The platform short-circuits a duplicate POST with the same Idempotency-Key header and identical payload — the second request replays the original response rather than re-executing the side effect.

kneo-client makes that safety automatic by:

  1. Auto-injecting a fresh UUID4 Idempotency-Key on every POST (unless the caller supplies their own).
  2. Retrying transient transport errors and the fixed set {429, 502, 503, 504} within a configurable RetryPolicy.
  3. Honoring Retry-After on 429 responses (server's hint overrides the policy's computed delay).
  4. Surfacing the platform's payload-mismatch behavior as KneoIdempotencyMismatchError so retry-with-wrong-payload bugs are loud, not silent.

Both pieces work together: idempotency keys make it safe to retry a POST; the retry policy decides when to retry. The default settings work for most callers; you can override either independently.

Idempotency keys, by default

On every POST the transport sends, it adds:

  • Idempotency-Key: <fresh UUID4> — unless the caller passes idempotency_key=<string> on the method call.

The platform's contract:

  • Same key + identical payload → server short-circuits and returns the original response. The retry is effectively a replay.
  • Same key + different payload → server returns HTTP 409. The client surfaces this as KneoIdempotencyMismatchError.
  • Different key → server treats this as a new request and executes the side effect.
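The three cases above can be mimicked with a toy in-memory server. This is only a sketch of the contract as described here: FakeIdempotentServer, the 200 status for success, and the response shape are all assumptions made for illustration, not the platform's actual implementation.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FakeIdempotentServer:
    """Toy stand-in for the platform's dedupe behavior (hypothetical, not the real server)."""

    _seen: dict = field(default_factory=dict)  # key -> (payload, response)
    executions: int = 0  # how many times the side effect actually ran

    def post(self, key: str, payload: dict) -> tuple[int, dict]:
        if key in self._seen:
            stored_payload, stored_response = self._seen[key]
            if stored_payload == payload:
                # Same key + identical payload: replay the original response.
                return 200, stored_response
            # Same key + different payload: conflict.
            return 409, {"error": "idempotency key reused with a different payload"}
        # Unseen key: execute the side effect and remember the outcome.
        self.executions += 1
        response = {"run_id": f"run-{self.executions}"}
        self._seen[key] = (dict(payload), response)
        return 200, response


server = FakeIdempotentServer()
key = str(uuid.uuid4())

first = server.post(key, {"spec_id": "my-spec"})
replay = server.post(key, {"spec_id": "my-spec"})        # replayed, no new execution
mismatch = server.post(key, {"spec_id": "other-spec"})   # rejected with 409
fresh = server.post(str(uuid.uuid4()), {"spec_id": "my-spec"})  # new key, new execution
```

After these four calls the side effect has run twice: once for the first key and once for the fresh key.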

The auto-generated key is a fresh UUID4 per request, so collisions are not a practical concern (UUID4 has 122 bits of randomness). The flip side is that every call gets a different key by default: a retry issued outside the transport's own loop carries a new key and won't be deduplicated by the server.

When to supply your own key

The auto-generated key is fine for one-shot calls. Supply your own when:

  • You're retrying outside the transport's loop. The transport retries within RetryPolicy.max_attempts for transient failures, but if your application catches an error and retries (e.g. a job runner re-invoking after a process restart), pass the same key on both attempts so the platform dedupes. The transport's retries already share the key.
  • You want cross-process correlation. Two services submitting the same logical request can dedupe by agreeing on the key (e.g. derive it from a hash of the request payload + a request ID from your application).
  • You're testing. A stable key makes test fixtures deterministic.

For example, reusing one key across an application-level retry:

from kneo_client.core.idempotency import new_idempotency_key

key = new_idempotency_key()
body = {"spec_id": "my-spec"}

# First attempt — succeeds normally
run = await client.platform.runs.create(body, idempotency_key=key)

# … later, retrying after a transient outage outside the transport's loop:
run = await client.platform.runs.create(body, idempotency_key=key)
# → returns the first run (same response, no new side effect)
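For the cross-process case, one possible way to agree on a key is to hash a canonical form of the payload together with an application-level request ID. The helper below is a hypothetical sketch (derive_idempotency_key and the "run-" prefix are not part of kneo-client); it uses only the standard library.

```python
import hashlib
import json


def derive_idempotency_key(request_id: str, payload: dict) -> str:
    """Hypothetical helper: build a stable key from a request ID and the payload."""
    # Canonical JSON (sorted keys, fixed separators) so that dict ordering
    # does not change the digest.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{request_id}:{canonical}".encode("utf-8")).hexdigest()
    # 4-character prefix + 64 hex characters, well under the 256-character limit.
    return f"run-{digest}"


key_a = derive_idempotency_key("req-123", {"spec_id": "my-spec", "priority": 1})
key_b = derive_idempotency_key("req-123", {"priority": 1, "spec_id": "my-spec"})
key_c = derive_idempotency_key("req-123", {"spec_id": "other-spec", "priority": 1})
```

One consequence of including the payload in the hash: a changed payload produces a different key, so the platform treats it as a new request instead of returning 409. If you would rather have payload drift rejected, derive the key from the request ID alone.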

Constraints on caller-supplied keys (validated by the client before sending):

  • Non-empty — an empty string raises ValueError.
  • At most 256 characters (MAX_KEY_LENGTH). The platform enforces this; the client validates locally to fail faster.
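If you build keys yourself, the same constraints can be checked before the key ever reaches the client. This is a minimal sketch: validate_idempotency_key is a hypothetical name, and raising ValueError for over-length keys is an assumption (the guide only states the exception type for the empty-string case).

```python
MAX_KEY_LENGTH = 256  # limit stated in this guide


def validate_idempotency_key(key: str) -> str:
    """Hypothetical local check mirroring the documented constraints."""
    if not key:
        raise ValueError("idempotency key must be non-empty")
    if len(key) > MAX_KEY_LENGTH:
        # Assumption: the over-length case is also reported as ValueError.
        raise ValueError(f"idempotency key exceeds {MAX_KEY_LENGTH} characters")
    return key
```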

Catching 409 mismatch

A 409 with an idempotency key set means the same key was reused with a different payload — almost always a caller bug. Surface it loudly:

from kneo_client.core.errors import KneoIdempotencyMismatchError

try:
    await client.platform.runs.create(payload, idempotency_key=key)
except KneoIdempotencyMismatchError as exc:
    raise RuntimeError(
        f"Idempotency-Key {exc.idempotency_key!r} was reused with a different payload. "
        f"Either generate a new key for the new request or fix the payload drift."
    ) from exc

The exception carries the key as sent (exc.idempotency_key), the platform's error body (exc.body), the server-assigned request ID for log correlation (exc.request_id), and the HTTP status (exc.status == 409).

Note that KneoIdempotencyMismatchError is a subclass of KneoConflictError, so a generic except KneoConflictError will also catch it. To handle other conflicts (resource state collisions, optimistic-lock failures, etc.) separately from mismatches, put the except KneoIdempotencyMismatchError clause before the except KneoConflictError clause.
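The ordering matters because Python picks the first matching except clause. The sketch below uses stand-in classes with the same names and the same parent/child relationship so that it runs without the library; classify is a hypothetical helper.

```python
class KneoConflictError(Exception):
    """Stand-in for the real class, defined here only to keep the sketch runnable."""


class KneoIdempotencyMismatchError(KneoConflictError):
    """Stand-in subclass, mirroring the hierarchy described above."""


def classify(exc: Exception) -> str:
    try:
        raise exc
    except KneoIdempotencyMismatchError:
        # Must come first: the parent clause below would otherwise swallow it.
        return "payload-mismatch"
    except KneoConflictError:
        return "other-conflict"


mismatch_kind = classify(KneoIdempotencyMismatchError())
conflict_kind = classify(KneoConflictError())
```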

The retry policy

RetryPolicy is a frozen dataclass describing when and how long to wait between attempts. The transport applies it; the policy itself does no I/O.

The default policy:

RetryPolicy(
    max_attempts=3,    # up to 3 total attempts
    base_delay=0.2,    # ~0.2s before attempt 2
    max_delay=30.0,    # cap on the computed delay
    jitter=0.1,        # 10% jitter applied to the computed delay
)

Delay sequence (without jitter): attempt 1 → no delay, attempt 2 → base_delay, attempt 3 → 2 × base_delay, …, capped at max_delay. With jitter=0.1, a 1-second delay becomes uniformly distributed in [0.9, 1.1].

Retry-After from a 429 response overrides the computed delay verbatim. The server's hint is authoritative; no jitter is applied on top.
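The delay rules above can be written out as a small function. This is a sketch, not the library's code: compute_delay is a hypothetical name, the doubling factor is inferred from the sequence given above, and applying jitter after the max_delay cap is an assumption.

```python
import random
from typing import Optional


def compute_delay(
    attempt: int,
    base_delay: float = 0.2,
    max_delay: float = 30.0,
    jitter: float = 0.1,
    retry_after: Optional[float] = None,
) -> float:
    """Seconds to wait before `attempt` (1-based). Illustrative only."""
    if attempt <= 1:
        return 0.0  # the first attempt is sent immediately
    if retry_after is not None:
        return retry_after  # server hint wins verbatim, no jitter on top
    # Assumed doubling: base_delay before attempt 2, 2 x base_delay before attempt 3, ...
    delay = min(max_delay, base_delay * 2 ** (attempt - 2))
    # Symmetric jitter: uniform in [delay * (1 - jitter), delay * (1 + jitter)].
    return delay * (1 + random.uniform(-jitter, jitter))
```

With jitter set to 0 and the default parameters, the delays before attempts 2 and 3 come out as 0.2 and 0.4 seconds, and a late attempt such as attempt 20 is capped at 30 seconds.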

When retries fire

The transport retries on:

  • Transport-level errors from httpx: DNS resolution failures, connect failures, TLS handshake failures, read timeouts.
  • HTTP 429 (rate limited), 502 (bad gateway), 503 (service unavailable), 504 (gateway timeout). Other 4xx and 5xx status codes do not trigger a retry — those typically indicate caller errors or non-transient server problems that retrying won't fix.

The retryable status set is RETRYABLE_STATUS_CODES = frozenset({429, 502, 503, 504}) — a module constant, intentionally not configurable. If your platform deployment legitimately returns transient 500s, fix the deployment; we'd rather diagnose the root cause than open the gate.

Retries fire only for:

  • Idempotent verbs: GET, HEAD, OPTIONS, PUT, DELETE.
  • POST with an Idempotency-Key — i.e. always, since the transport auto-injects one on every POST.

This is why auto-injection matters: it's what makes POST retries safe.
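Taken together, the retry decision depends on the verb, the presence of a key, the outcome, and the attempt count. The function below is a rough sketch of that decision under the rules listed above; should_retry and IDEMPOTENT_METHODS are hypothetical names, and only RETRYABLE_STATUS_CODES comes from this guide.

```python
from typing import Optional

RETRYABLE_STATUS_CODES = frozenset({429, 502, 503, 504})
# Assumed constant name; the verb list itself is taken from the guide.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "PUT", "DELETE"})


def should_retry(
    method: str,
    status: Optional[int],
    has_idempotency_key: bool,
    attempt: int,
    max_attempts: int,
) -> bool:
    """Decide whether another attempt is allowed.

    `status` is None for a transport-level failure (no HTTP response received).
    `attempt` is the 1-based number of the attempt that just failed.
    """
    if attempt >= max_attempts:
        return False  # attempt budget exhausted
    method = method.upper()
    safe = method in IDEMPOTENT_METHODS or (method == "POST" and has_idempotency_key)
    if not safe:
        return False
    return status is None or status in RETRYABLE_STATUS_CODES
```

Under these rules a 500 or a 404 is never retried, and a POST is retried only because it carries a key.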

Customizing the policy

Pass a custom policy when constructing the client:

from kneo_client import KneoClient
from kneo_client.core.profiles import load_profile
from kneo_client.core.retries import RetryPolicy

profile = load_profile()

# Aggressive retries for a flaky network
policy = RetryPolicy(max_attempts=8, base_delay=0.5, max_delay=60.0)
client = KneoClient(profile, retry_policy=policy)

# Flat delay (no exponential growth)
policy = RetryPolicy(max_attempts=3, base_delay=2.0, max_delay=2.0, jitter=0)
client = KneoClient(profile, retry_policy=policy)

To disable retries entirely (useful in tests that want to see the first failure surface immediately):

client = KneoClient(profile, retry_policy=RetryPolicy(max_attempts=1))

Constraints on RetryPolicy parameters (validated at construction):

  • max_attempts ≥ 1
  • base_delay ≥ 0
  • max_delay ≥ base_delay
  • 0 ≤ jitter ≤ 1

Invalid values raise ValueError immediately — you find out at startup, not on the first retry.
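To make the constraints concrete, here is a stand-in frozen dataclass that enforces them at construction. RetryPolicySketch is a hypothetical class written for this example; the real RetryPolicy may validate differently and use different error messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicySketch:
    """Stand-in illustrating the documented constraints; not the library's class."""

    max_attempts: int = 3
    base_delay: float = 0.2
    max_delay: float = 30.0
    jitter: float = 0.1

    def __post_init__(self) -> None:
        # Each check mirrors one bullet in the constraint list above.
        if self.max_attempts < 1:
            raise ValueError("max_attempts must be >= 1")
        if self.base_delay < 0:
            raise ValueError("base_delay must be >= 0")
        if self.max_delay < self.base_delay:
            raise ValueError("max_delay must be >= base_delay")
        if not 0 <= self.jitter <= 1:
            raise ValueError("jitter must be between 0 and 1")


default_policy = RetryPolicySketch()
flat_policy = RetryPolicySketch(max_attempts=3, base_delay=2.0, max_delay=2.0, jitter=0)
```

Because validation runs in __post_init__, a bad value fails where the policy is built, which is usually at application startup.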

What surfaces when retries exhaust

If all retries fail, the client raises the appropriate typed exception based on the final attempt's outcome:

Final attempt failed because…              Exception raised
HTTP 429                                   KneoRateLimited (carries .retry_after)
HTTP 502 / 503 / 504                       KneoServerError
Transport error (DNS, connect, TLS, read)  KneoNetworkError

Intermediate retry attempts are logged at INFO level under the kneo_client.transport logger:

INFO:kneo_client.transport:status 503 on attempt 1; sleeping 0.20s
INFO:kneo_client.transport:status 503 on attempt 2; sleeping 0.40s

If you want to see the retry behavior in your application logs, set the logger level:

import logging
logging.basicConfig(level=logging.INFO)
logging.getLogger("kneo_client.transport").setLevel(logging.INFO)
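In tests it can be handy to capture these records in memory and assert on them. The sketch below attaches a small list-backed handler to the same logger name; ListHandler is a hypothetical helper, and the record is emitted by hand here to stand in for what the transport would log on a retry.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted log messages in memory."""

    def __init__(self) -> None:
        super().__init__(level=logging.INFO)
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


transport_logger = logging.getLogger("kneo_client.transport")
transport_logger.setLevel(logging.INFO)
handler = ListHandler()
transport_logger.addHandler(handler)

# Simulated record in the format shown above; in real use the transport emits it.
transport_logger.info("status %d on attempt %d; sleeping %.2fs", 503, 1, 0.2)

# Detach the handler once the test is done so later tests start clean.
transport_logger.removeHandler(handler)
```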

Bypassing the auto-injection

The transport auto-injects an Idempotency-Key on every POST that does not already carry one. If for some reason you need to bypass the generated key (very unusual; the platform's contract assumes the header is present), drop to the transport directly and supply explicit headers. By Python convention the leading underscore marks _transport as private, so treat this path as unsupported:

# Standard call: transport adds Idempotency-Key automatically
await client.platform.runs.create(body)

# Direct transport call with explicit headers (you take responsibility for deduplication)
await client._transport.request(
    "POST",
    "/v1/runs",
    json=body,
    headers={"Idempotency-Key": "your-deterministic-key"},
)

You almost never want this. Auto-injection is the right default; this is documented mostly so you know the escape hatch exists.

Putting it together — a robust pattern

import asyncio

from kneo_client import KneoClient
from kneo_client.core.errors import (
    KneoIdempotencyMismatchError, KneoNetworkError, KneoServerError,
)
from kneo_client.core.idempotency import new_idempotency_key

async def create_run_robust(client: KneoClient, body: dict) -> str:
    key = new_idempotency_key()  # one key, used across application-level retries

    for attempt in range(1, 4):
        try:
            run = await client.platform.runs.create(body, idempotency_key=key)
            return run.run_id

        except KneoIdempotencyMismatchError as exc:
            # Caller bug — body changed between attempts. Don't retry.
            raise RuntimeError(f"payload drift on key {exc.idempotency_key!r}") from exc

        except (KneoNetworkError, KneoServerError) as exc:
            # The transport already retried within its policy; we add an outer
            # guard for application-level recovery (e.g. across process restarts).
            if attempt == 3:
                raise
            await asyncio.sleep(2 ** attempt)

    raise AssertionError("unreachable")

In practice the transport's built-in retries are sufficient for most use cases; the outer loop above is for the rare case where you want application-level retry behavior (e.g. survive a process restart while the platform is temporarily unreachable).