Idempotency and retries¶

This guide explains when kneo-client injects idempotency keys, how 409 mismatches surface, what the retry policy is, and how to customize either piece for non-default deployments.

What "idempotency and retries" mean in kneo-client¶

The Kneo Agent Platform's write endpoints are designed for safe retry where the spec documents the Idempotency-Key header: run create and continue, specs/run, and human-task resume (continue is contractual on kneo_serv >= 0.9.0; older servers treated its key as best-effort). The platform short-circuits a duplicate POST with the same Idempotency-Key header and identical payload — the second request replays the original response rather than re-executing the side effect. One exception worth knowing: runs.cancel has no documented key (even on kneo_serv 0.9.0) — a retried cancel may be observed twice, which is harmless because cancel is naturally idempotent in effect.

kneo-client makes that safety automatic by:

Auto-injecting a fresh UUID4 Idempotency-Key on every POST (unless the caller supplies their own).
Retrying transient transport errors and the fixed set {429, 502, 503, 504} within a configurable RetryPolicy.
Honoring Retry-After on 429 and 503 responses (the server's hint overrides the policy's computed delay; a hint above RetryPolicy.max_delay fails fast instead — see below). 503 is kneo_serv's run-queue backpressure signal on POST /v1/runs; because POSTs carry an auto-injected idempotency key, those are retry-safe and retried automatically.
Surfacing the platform's payload-mismatch behavior as KneoIdempotencyMismatchError so retry-with-wrong-payload bugs are loud, not silent.

Both pieces work together: idempotency keys make it safe to retry a POST; the retry policy decides when to retry. The default settings work for most callers; you can override either independently.

Idempotency keys, by default¶

On every POST the transport sends, it adds:

Idempotency-Key: <fresh UUID4> — unless the caller passes idempotency_key=<string> on the method call.

The platform's contract:

Same key + identical payload → server short-circuits and returns the original response. The retry is effectively a replay.
Same key + original attempt still executing → server returns HTTP 409 with code idempotency_key_in_progress and a Retry-After hint. The transport auto-retries this case per the hint; it surfaces as a KneoConflictError only once retries exhaust.
Same key + different payload → server returns HTTP 409 with code idempotency_key_conflict. The client surfaces this as KneoIdempotencyMismatchError.
Different key → server treats this as a new request and executes the side effect.
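
The contract above can be pictured with a toy in-memory model. This is only an illustrative sketch: ToyIdempotencyStore, its post method, the run-N identifiers, and the 200 status on success are all invented here and say nothing about how kneo_serv actually stores keys. The in-progress case is left out because it needs concurrency to demonstrate.

```python
import json
import uuid


class ToyIdempotencyStore:
    """Hypothetical in-memory stand-in for the platform's dedupe behavior."""

    def __init__(self):
        # key -> (canonical payload, stored response)
        self._seen = {}

    def post(self, key, payload):
        # Canonicalize so that dict ordering does not count as a payload change.
        canonical = json.dumps(payload, sort_keys=True)
        if key in self._seen:
            stored_payload, stored_response = self._seen[key]
            if stored_payload == canonical:
                # Same key + identical payload: replay, no new side effect.
                return 200, stored_response
            # Same key + different payload: conflict.
            return 409, {"code": "idempotency_key_conflict"}
        # Different key: treated as a new request, side effect executes.
        response = {"run_id": f"run-{len(self._seen) + 1}"}
        self._seen[key] = (canonical, response)
        return 200, response


store = ToyIdempotencyStore()
key = str(uuid.uuid4())
body = {"input": "Summarize the latest activity."}

first = store.post(key, body)
replay = store.post(key, body)
conflict = store.post(key, {"input": "Something else."})
fresh = store.post(str(uuid.uuid4()), body)
```

Here first and replay are the same response, conflict carries the idempotency_key_conflict code, and fresh creates a second run because its key is new.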

The auto-generated key is a fresh UUID4 per request, and collisions are not a real concern (UUID4 has 122 bits of randomness). The flip side is that each call is independent by default: a retry issued outside the transport's own loop gets a new key, so the server won't deduplicate it unless you pass the original key again.

When to supply your own key¶

The auto-generated key is fine for one-shot calls. Supply your own when:

You're retrying outside the transport's loop. The transport retries within RetryPolicy.max_attempts for transient failures, but if your application catches an error and retries (e.g. a job runner re-invoking after a process restart), pass the same key on both attempts so the platform dedupes. The transport's retries already share the key.
You want cross-process correlation. Two services submitting the same logical request can dedupe by agreeing on the key (e.g. derive it from a hash of the request payload + a request ID from your application).
You're testing. A stable key makes test fixtures deterministic.

from kneo_client.core import new_idempotency_key

key = new_idempotency_key()
body = {"input": "Summarize the latest activity.", "spec_path": "my-spec.yaml"}

# First attempt — succeeds normally
run = await client.platform.runs.create(body, idempotency_key=key)

# … later, retrying after a transient outage outside the transport's loop:
run = await client.platform.runs.create(body, idempotency_key=key)
# → returns the first run (same response, no new side effect)
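
For the cross-process case, one possible way to agree on a key is to derive it from the application's request ID and a canonical form of the payload. The helper below is a sketch, not part of kneo-client: derive_idempotency_key and its arguments are hypothetical names, and SHA-256 over sorted JSON is just one reasonable choice.

```python
import hashlib
import json


def derive_idempotency_key(request_id: str, payload: dict) -> str:
    """Hypothetical helper: build a deterministic key from request ID + payload."""
    # Sorted keys and fixed separators make the JSON stable across processes.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    material = f"{request_id}\n{canonical}".encode("utf-8")
    # A SHA-256 hex digest is 64 characters, well under the 256-character limit.
    return hashlib.sha256(material).hexdigest()


key_a = derive_idempotency_key("req-123", {"input": "hi", "spec_path": "my-spec.yaml"})
key_b = derive_idempotency_key("req-123", {"spec_path": "my-spec.yaml", "input": "hi"})
key_c = derive_idempotency_key("req-124", {"input": "hi", "spec_path": "my-spec.yaml"})
```

One trade-off to keep in mind: because the payload is part of the hash, a changed payload produces a different key and is treated as a new request, so the server's mismatch detection never fires. Deriving the key from the request ID alone keeps that safety net.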

Constraints on caller-supplied keys (validated + normalized by the client before sending):

Non-empty, non-whitespace — an empty or whitespace-only key raises ValueError. The platform strips surrounding whitespace and treats a whitespace-only key as no key — which would silently disable idempotency (a retry re-runs the side effect), so the client rejects it up front.
Stripped before send — surrounding whitespace is removed, so a padded key " abc " is transmitted as "abc". This matches the platform's stored key, so the value you see echoed back (including KneoError.idempotency_key) is the stripped form.
At most 256 characters (MAX_KEY_LENGTH), measured on the stripped value. The platform enforces this (rejecting an over-length key with 400 invalid_idempotency_key); the client validates locally to fail faster.
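
If you want the same checks in your own code (for example, to validate keys before they reach a queue), the three constraints can be mirrored locally. This is a minimal sketch of the rules as described above; normalize_key is a hypothetical name and the client's real validation may differ in details such as error messages.

```python
MAX_KEY_LENGTH = 256  # limit stated in the constraints above


def normalize_key(key: str) -> str:
    """Hypothetical mirror of the documented key rules: strip, then validate."""
    stripped = key.strip()
    if not stripped:
        # An empty or whitespace-only key would silently disable idempotency.
        raise ValueError("idempotency key must not be empty or whitespace-only")
    if len(stripped) > MAX_KEY_LENGTH:
        # Length is measured on the stripped value.
        raise ValueError(f"idempotency key exceeds {MAX_KEY_LENGTH} characters")
    return stripped
```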

Catching 409 mismatch¶

The platform's 409s carry a stable envelope code (KneoError.code), and the client classifies on it. Only idempotency_key_conflict — the same key reused with a different payload — raises KneoIdempotencyMismatchError (plus, against pre-0.6.0 servers whose bodies carry no code, any 409 on a keyed request). Other codes raise the KneoConflictError base: run_state_conflict (a lifecycle fence, e.g. continuing a run that isn't paused for human review — cancelling an already-terminal run is instead a 200 no-op) and resource_locked (held by another operation), both carrying .retry_after when the server hints one. idempotency_key_in_progress is auto-retried by the transport per its Retry-After and only surfaces as KneoConflictError once retries exhaust.
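
The classification rules in the paragraph above can be summarized as a small decision function. This is a sketch of the described behavior only: classify_409 is a hypothetical name, and it returns labels rather than raising the real exception types.

```python
from typing import Optional


def classify_409(code: Optional[str], keyed: bool) -> str:
    """Hypothetical summary of how a 409 is handled, per the rules above."""
    if code == "idempotency_key_in_progress":
        # Auto-retried per Retry-After; only surfaces once retries exhaust.
        return "retry"
    if code == "idempotency_key_conflict":
        return "KneoIdempotencyMismatchError"
    if code is None and keyed:
        # Pre-0.6.0 servers send no code; any 409 on a keyed request is a mismatch.
        return "KneoIdempotencyMismatchError"
    # run_state_conflict, resource_locked, and anything else.
    return "KneoConflictError"
```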

A mismatch is almost always a caller bug. Surface it loudly:

from kneo_client.core import KneoIdempotencyMismatchError

try:
    await client.platform.runs.create(payload, idempotency_key=key)
except KneoIdempotencyMismatchError as exc:
    raise RuntimeError(
        f"Idempotency-Key {exc.idempotency_key!r} was reused with a different payload. "
        f"Either generate a new key for the new request or fix the payload drift."
    ) from exc

The exception carries the key as sent (exc.idempotency_key), the platform's error body (exc.body), the server-assigned request ID for log correlation (exc.request_id), and the HTTP status (exc.status == 409).

Note that KneoIdempotencyMismatchError is a subclass of KneoConflictError, so a generic except KneoConflictError will also catch it. If you only care about non-mismatch conflicts (run_state_conflict, resource_locked), catch KneoIdempotencyMismatchError first — or branch on exc.code.
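
The ordering point can be seen with plain Python stand-ins. ConflictError and MismatchError below are hypothetical classes that only imitate the subclass relationship described above; they are not the kneo-client types.

```python
class ConflictError(Exception):
    """Stand-in for KneoConflictError (hypothetical, for illustration)."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


class MismatchError(ConflictError):
    """Stand-in for KneoIdempotencyMismatchError (a ConflictError subclass)."""


def handle(exc):
    try:
        raise exc
    except MismatchError:
        # Must come first: the base-class clause below would also match it.
        return "caller bug"
    except ConflictError as err:
        return f"conflict: {err.code}"


mismatch_result = handle(MismatchError("idempotency_key_conflict"))
locked_result = handle(ConflictError("resource_locked"))
```

If the two except clauses were swapped, the mismatch would be handled as an ordinary conflict, because Python picks the first clause whose class matches.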

The retry policy¶

RetryPolicy is a frozen dataclass describing when and how long to wait between attempts. The transport applies it; the policy itself does no I/O.

The default policy:

RetryPolicy(
    max_attempts=3,    # up to 3 total attempts
    base_delay=0.2,    # ~0.2s before attempt 2
    max_delay=30.0,    # cap on the computed delay
    jitter=0.1,        # 10% jitter applied to the computed delay
)

Delay sequence (without jitter): attempt 1 → no delay, attempt 2 → base_delay, attempt 3 → 2 × base_delay, …, capped at max_delay. With jitter=0.1, a 1-second delay becomes uniformly distributed in [0.9, 1.1].
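
The delay sequence can be written out as a short function. This is a sketch of the schedule as described, not the library's code: computed_delay and jittered are hypothetical names, and applying jitter as a uniform multiplier after the cap is an assumption.

```python
import random


def computed_delay(attempt: int, base_delay: float = 0.2, max_delay: float = 30.0) -> float:
    """Delay before the given attempt (1-based), without jitter."""
    if attempt <= 1:
        return 0.0  # the first attempt fires immediately
    # Attempt 2 waits base_delay, attempt 3 waits 2 x base_delay, and so on.
    return min(base_delay * 2 ** (attempt - 2), max_delay)


def jittered(delay: float, jitter: float = 0.1) -> float:
    """Spread the delay uniformly across [delay * (1 - jitter), delay * (1 + jitter)]."""
    return delay * random.uniform(1 - jitter, 1 + jitter)


schedule = [computed_delay(n) for n in range(1, 6)]
# schedule is [0.0, 0.2, 0.4, 0.8, 1.6] with the default policy values
```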

Retry-After from a 429 or 503 response overrides the computed delay — no jitter is applied on top — when the hint is within max_delay. A hint above max_delay is not slept on at all: retrying earlier than the server asked is likely doomed, so the transport stops retrying immediately and raises the typed error with the verbatim hint on .retry_after — an intermediary advertising an hour-long Retry-After neither blocks the caller nor burns attempts on guaranteed-fail retries. Apply your own longer back-off from .retry_after when you want to wait it out.

A Retry-After: 0 (or a negative / already-past HTTP-date hint, which the parser floors to 0) is honored verbatim: the next attempt fires immediately with no back-off and no jitter. This is deliberate — 0 is the server's explicit "retry now" and is RFC-compliant, and the blast radius is bounded by max_attempts (the default 3 allows only two retries). The transport-error path (no server hint) always keeps the jittered exponential back-off; only a server-sent hint can produce a zero delay.
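
The Retry-After rules in the last two paragraphs can be sketched as a parser plus a decision step. Both functions are hypothetical illustrations built on the standard library; the transport's real parser may accept or reject inputs differently.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional, Tuple


def parse_retry_after(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return the hint in seconds, floored at 0, or None if it cannot be parsed."""
    value = value.strip()
    try:
        return max(0.0, float(value))  # delay-seconds form; negatives floor to 0
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())  # past dates floor to 0


def next_step(hint: Optional[float], computed: float, max_delay: float = 30.0) -> Tuple[str, float]:
    """Decide what to do before the next attempt."""
    if hint is None:
        return ("sleep", computed)  # no server hint: use the policy's own delay
    if hint > max_delay:
        return ("fail_fast", hint)  # stop retrying; expose the hint to the caller
    return ("sleep", hint)  # honored verbatim, including 0, with no jitter
```

When the result is fail_fast, the caller sees the typed error with the hint on .retry_after and can schedule its own longer back-off.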

When retries fire¶

The transport retries on:

Transport-level errors from httpx: DNS resolution failures, connect failures, TLS handshake failures, read timeouts.
HTTP 429 (rate limited), 502 (bad gateway), 503 (service unavailable), 504 (gateway timeout). Other 4xx and 5xx status codes do not trigger a retry — those typically indicate caller errors or non-transient server problems that retrying won't fix.
HTTP 409 with code idempotency_key_in_progress — the server saying the key's original attempt is still executing; retried per its Retry-After. Every other 409 is a real conflict and surfaces immediately.

The retryable status set is RETRYABLE_STATUS_CODES = frozenset({429, 502, 503, 504}) — a module constant, intentionally not configurable. If your platform deployment legitimately returns transient 500s, fix the deployment; we'd rather diagnose the root cause than open the gate.

Status-code retries fire only for:

Idempotent verbs: GET, HEAD, OPTIONS, PUT, DELETE.
POST with an Idempotency-Key — i.e. always, since the transport auto-injects one on every POST.

Transport-level errors follow the same rule with one extension: they're retried for idempotent verbs and keyed POSTs, and also, for any method, on connect-phase errors (httpx.ConnectError and httpx.ConnectTimeout: connection refused or the TCP/TLS handshake timing out), which provably never reached the server and are therefore always safe to re-send. Mid-flight decode failures (httpx.DecodingError, e.g. a proxy emitting corrupt gzip) and redirect loops (httpx.TooManyRedirects) also count as transient, but only on replay-safe methods, since the request may have executed server-side.

This is why auto-injection matters: it's what makes POST retries safe.
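
The eligibility rules in this section can be condensed into two predicates. This is a sketch under the rules as stated: the function names and the connect_phase flag are hypothetical, and the separate 409 idempotency_key_in_progress path is not modeled.

```python
RETRYABLE_STATUS_CODES = frozenset({429, 502, 503, 504})
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "PUT", "DELETE"})


def replay_safe(method: str, has_idempotency_key: bool) -> bool:
    """True when re-sending the request cannot duplicate a side effect."""
    method = method.upper()
    return method in IDEMPOTENT_METHODS or (method == "POST" and has_idempotency_key)


def should_retry_status(method: str, status: int, has_idempotency_key: bool) -> bool:
    """Status-code retries: retryable status AND a replay-safe request."""
    return status in RETRYABLE_STATUS_CODES and replay_safe(method, has_idempotency_key)


def should_retry_transport_error(method: str, has_idempotency_key: bool, connect_phase: bool) -> bool:
    """Transport-error retries: connect-phase errors are safe for any method."""
    return connect_phase or replay_safe(method, has_idempotency_key)
```

Since the transport injects a key on every POST, has_idempotency_key is always true for POSTs in practice; the flag mainly matters for methods such as PATCH, which are only retried on connect-phase errors.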

Customizing the policy¶

Pass a custom policy when constructing the client:

from kneo_client import KneoClient
from kneo_client.core import RetryPolicy, load_profile

profile = load_profile()

# Aggressive retries for a flaky network
policy = RetryPolicy(max_attempts=8, base_delay=0.5, max_delay=60.0)
client = KneoClient(profile, retry_policy=policy)

# Flat delay (no exponential growth)
policy = RetryPolicy(max_attempts=3, base_delay=2.0, max_delay=2.0, jitter=0)
client = KneoClient(profile, retry_policy=policy)

To disable retries entirely (useful in tests that want to see the first failure surface immediately):

client = KneoClient(profile, retry_policy=RetryPolicy(max_attempts=1))

Constraints on RetryPolicy parameters (validated at construction):

max_attempts ≥ 1
base_delay ≥ 0
max_delay ≥ base_delay
0 ≤ jitter ≤ 1

Invalid values raise ValueError immediately — you find out at startup, not on the first retry.
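
Construction-time validation of this kind is typically done in a frozen dataclass's __post_init__. The class below is a hypothetical stand-in (SketchRetryPolicy) that mirrors the four documented constraints; it is not the library's RetryPolicy and its error messages are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchRetryPolicy:
    """Hypothetical mirror of the documented RetryPolicy constraints."""

    max_attempts: int = 3
    base_delay: float = 0.2
    max_delay: float = 30.0
    jitter: float = 0.1

    def __post_init__(self):
        # Reading fields is fine on a frozen dataclass; only assignment is blocked.
        if self.max_attempts < 1:
            raise ValueError("max_attempts must be >= 1")
        if self.base_delay < 0:
            raise ValueError("base_delay must be >= 0")
        if self.max_delay < self.base_delay:
            raise ValueError("max_delay must be >= base_delay")
        if not 0 <= self.jitter <= 1:
            raise ValueError("jitter must be between 0 and 1")


default_policy = SketchRetryPolicy()
flat_policy = SketchRetryPolicy(max_attempts=3, base_delay=2.0, max_delay=2.0, jitter=0)
```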

What surfaces when retries exhaust¶

If all retries fail, the client raises the appropriate typed exception based on the final attempt's outcome:

Final attempt failed because…	Exception raised
HTTP 429	KneoRateLimitedError (carries .retry_after)
HTTP 503	KneoServiceUnavailableError (a KneoServerError subclass; carries .retry_after)
HTTP 502 / 504	KneoServerError
Transport error (DNS, connect, TLS, read)	KneoNetworkError

Intermediate retry attempts are logged at INFO level under the kneo_client.transport logger:

INFO:kneo_client.transport:status 503 on attempt 1; sleeping 0.20s
INFO:kneo_client.transport:status 503 on attempt 2; sleeping 0.40s

If you want to see the retry behavior in your application logs, set the logger level:

import logging
logging.basicConfig(level=logging.INFO)
logging.getLogger("kneo_client.transport").setLevel(logging.INFO)

Controlling the key (there is no opt-out)¶

The transport auto-injects an Idempotency-Key on every POST unconditionally — and it overwrites any Idempotency-Key you place in a headers= mapping on Transport.request, so passing explicit headers is not an escape hatch. The supported way to control the key is the idempotency_key= kwarg on the wrapper methods (or on Transport.request directly):

# Standard call — transport adds a fresh UUID4 automatically
await client.platform.runs.create(body)

# Deterministic key — full control over what gets sent
await client.platform.runs.create(body, idempotency_key="your-deterministic-key")

There is no supported way to send a POST without a key. The platform's contract assumes the header is present, and the auto-injection is exactly what makes POST retries safe — an opt-out would quietly disable that.

Putting it together — a robust pattern¶

import asyncio

from kneo_client import KneoClient
from kneo_client.core import (
    KneoIdempotencyMismatchError,
    KneoNetworkError,
    KneoServerError,
    new_idempotency_key,
)

async def create_run_robust(client: KneoClient, body: dict) -> str:
    key = new_idempotency_key()  # one key, used across application-level retries

    for attempt in range(1, 4):
        try:
            run = await client.platform.runs.create(body, idempotency_key=key)
            return run.run_id

        except KneoIdempotencyMismatchError as exc:
            # Caller bug — body changed between attempts. Don't retry.
            raise RuntimeError(f"payload drift on key {exc.idempotency_key!r}") from exc

        except (KneoNetworkError, KneoServerError) as exc:
            # The transport already retried within its policy; we add an outer
            # guard for application-level recovery (e.g. across process restarts).
            if attempt == 3:
                raise
            await asyncio.sleep(2 ** attempt)

    raise AssertionError("unreachable")

In practice the transport's built-in retries are sufficient for most use cases; the outer loop above is for the rare case where you want application-level retry behavior while the platform is temporarily unreachable. To survive a process restart as well, persist the key (or derive it deterministically) so the restarted process reuses it instead of generating a new one.