Platform service API¶
The HTTP contract exposed by kneo service serve. The generated OpenAPI
schema is committed at openapi.json; refresh it with
python docs/script/generate_reference_docs.py.
This page covers versioning, auth, redaction, governance, and the request /
response shapes for each route group. The
Worked examples section near the bottom has
copy-pasteable curl invocations for the most common endpoints.
Versioning¶
The stable public HTTP API is exposed under /v1. Existing unversioned routes
remain available for local development and backwards compatibility, but new
service clients should prefer /v1.
Examples:
Authentication¶
Authentication is disabled by default for local development. Enable it by setting API keys before starting the service:
export KNEO_SERV_AUTH_ENABLED=true
export KNEO_SERV_API_KEYS='operator:operator-token:operator;reviewer:reviewer-token:reviewer'
export KNEO_SERV_ADMIN_API_KEY='admin-token'
Clients authenticate with either header:
Built-in roles:
- admin: all scopes
- operator: runs:read, runs:write, specs:read, human:read, audit:read, credentials:read, policies:read, policies:write
- reviewer: runs:read, human:read, human:write, audit:read
- service: runs:read, runs:write, specs:read, human:read, human:write, audit:read, audit:write, credentials:read, policies:read
- viewer: runs:read, specs:read, human:read, audit:read
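The role list above can be read as a lookup table. The following is a minimal sketch of a scope check built from that list; the `role_allows` helper and the `"*"` sentinel for "all scopes" are illustrative names, not part of the service API.

```python
# Role-to-scope table copied from the built-in roles listed above.
ALL_SCOPES = "*"  # sentinel used only by this sketch for "all scopes"

ROLE_SCOPES = {
    "admin": {ALL_SCOPES},
    "operator": {
        "runs:read", "runs:write", "specs:read", "human:read",
        "audit:read", "credentials:read", "policies:read", "policies:write",
    },
    "reviewer": {"runs:read", "human:read", "human:write", "audit:read"},
    "service": {
        "runs:read", "runs:write", "specs:read", "human:read", "human:write",
        "audit:read", "audit:write", "credentials:read", "policies:read",
    },
    "viewer": {"runs:read", "specs:read", "human:read", "audit:read"},
}


def role_allows(role: str, scope: str) -> bool:
    """Return True when the built-in role grants the requested scope."""
    granted = ROLE_SCOPES.get(role, set())  # unknown roles grant nothing
    return ALL_SCOPES in granted or scope in granted
```

For example, `role_allows("reviewer", "human:write")` is true while `role_allows("viewer", "runs:write")` is false, which matches the table.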
KNEO_SERV_API_KEYS accepts semicolon-separated entries:
Example with explicit scopes:
Redaction¶
Service responses, traces, checkpoints, and CLI JSON output are redacted before they are returned or persisted as checkpoints. Redaction covers common secret keys and inline values such as passwords, tokens, API keys, authorization headers, emails, and SSNs.
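To illustrate the two kinds of redaction described above (secret keys and inline values), here is a rough sketch. The key names, regular expressions, and mask string are assumptions chosen for the example; the service's actual rules may be broader or different.

```python
import re
from typing import Any

# Hypothetical key names and patterns, chosen for illustration only.
SECRET_KEYS = {"password", "token", "api_key", "authorization"}
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MASK = "[REDACTED]"


def redact(value: Any) -> Any:
    """Return a copy of `value` with secret keys and inline values masked."""
    if isinstance(value, dict):
        # Mask the whole value when the key itself names a secret.
        return {
            key: MASK if str(key).lower() in SECRET_KEYS else redact(child)
            for key, child in value.items()
        }
    if isinstance(value, list):
        return [redact(child) for child in value]
    if isinstance(value, str):
        # Mask inline emails first, then SSN-shaped numbers.
        return SSN_RE.sub(MASK, EMAIL_RE.sub(MASK, value))
    return value
```

The function returns a new structure rather than mutating its input, so the unredacted payload can still be used in-process while only the redacted copy is returned or persisted.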
Spec governance diagnostics¶
Spec validation includes static governance diagnostics before deployment:
- Unsafe tool or function implementation imports such as direct os, subprocess, shutil, socket, importlib, or builtins primitives are reported as errors.
- Shorthand tool selection or missing tool permission policies are reported as warnings.
- Network tools without allowed_domains, shell-capable tools, and filesystem write access are reported as warnings.
- Specs that expose privileged tools or unsafe imports without a human workflow approval step receive a W_HUMAN_APPROVAL_MISSING warning.
These diagnostics are returned by kneo spec validate,
POST /specs/validate, and strict compiler flows.
POST /specs/policy-report returns a structured policy report covering memory
configuration, tool permissions, declared MCP imports, guardrail stages, human
reviewers, and human approval requirements. Use it in deployment gates when a
spec needs a machine-readable policy summary before signing or promotion.
GET /runs/{run_id}/policy-report returns the same shape but operates on the
spec the run was started with. The service reads it out of the run's stored
metadata, so operators auditing a deployed run don't need to ship the bundle
to the service themselves. It requires the same specs:read scope as the
spec-bundle route.
curl -H "Authorization: Bearer $KNEO_API_KEY" \
https://kneo.example.com/v1/runs/run-7c2f.../policy-report
Returns 404 if the run id is unknown, 400 if the run carries no spec
metadata (older runs from a pre-0.3.0 store), and 200 with
{"valid": <bool>, "report": {...}} otherwise. Each call records a
spec.policy_reported audit event scoped to the run id with
metadata.source = "run", so spec-bundle calls and run-keyed calls are
distinguishable in the audit log.
Project-based CLI flows can enforce different gates per environment through
environments.<name>.policy_enforcement. Enforcement runs after overlays and
defaults are applied, so dev, staging, and prod can require progressively
stricter tool permissions, human review, guardrails, or blocked diagnostic
codes.
Redaction is a safety layer, not a replacement for secret management. Provider keys and credentials should still be supplied through deployment secret stores or environment variables rather than embedded in specs or request payloads.
Workflow specs¶
YAML specs can target SDK-backed workflow families while preserving service validation, tracing, cancellation, and run-result metadata:
- sequential: ordered steps.
- graph: keyed nodes, conditional edges, and a start node.
- concurrent: fan-out participants executed by the SDK concurrent workflow.
- handoff: participants plus a selector; sequence and round_robin selectors are supported.
- group-chat or group_chat: participants repeated for rounds.
Orchestration workflow participants use the same step shape as sequential workflow steps:
workflow:
  type: handoff
  name: review-handoff
  participants:
    - id: researcher
      kind: agent
      ref: research_agent
    - id: reviewer
      kind: agent
      ref: review_agent
  selector:
    type: sequence
    sequence: [researcher, reviewer]
Participant ids must be unique, participant refs must resolve to declared
components, handoff selector entries must reference participant ids, and
group-chat rounds must be at least 1.
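The four rules above can be sketched as a small checker over an already-parsed workflow mapping. This is an illustration only: `validate_participants`, its error strings, and the default of `1` for a missing `rounds` value are assumptions, not the service's validator.

```python
def validate_participants(workflow: dict, declared_refs: set) -> list:
    """Collect rule violations for an orchestration workflow mapping."""
    errors = []
    seen_ids = set()
    for participant in workflow.get("participants", []):
        pid = participant["id"]
        if pid in seen_ids:
            errors.append(f"duplicate participant id: {pid}")
        seen_ids.add(pid)
        if participant["ref"] not in declared_refs:
            errors.append(f"unresolved ref: {participant['ref']}")
    if workflow.get("type") == "handoff":
        # Selector entries must name participants declared above.
        for entry in workflow.get("selector", {}).get("sequence", []):
            if entry not in seen_ids:
                errors.append(f"selector entry is not a participant id: {entry}")
    if workflow.get("type") in ("group-chat", "group_chat"):
        # Assumption: a missing rounds value is treated as 1.
        if workflow.get("rounds", 1) < 1:
            errors.append("rounds must be at least 1")
    return errors


handoff = {
    "type": "handoff",
    "participants": [
        {"id": "researcher", "kind": "agent", "ref": "research_agent"},
        {"id": "reviewer", "kind": "agent", "ref": "review_agent"},
    ],
    "selector": {"type": "sequence", "sequence": ["researcher", "reviewer"]},
}
print(validate_participants(handoff, {"research_agent", "review_agent"}))  # prints []
```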
Secret management¶
kneo_serv resolves provider keys, MCP credentials, service tokens, and
runtime-specific values through named environment-variable references. Project
config stores only env-var names, never raw secret values:
The default provider mappings include openai/openai-agents,
anthropic, google, and google-adk. The CLI can show a redacted inventory
for deployment checks:
Native provider startup can fail fast when a required provider secret is missing:
Service API keys remain in KNEO_SERV_API_KEY, KNEO_SERV_API_KEYS, and
KNEO_SERV_ADMIN_API_KEY; the secret inventory reports whether they are
present without exposing values.
The service exposes the same redacted inventory for operators:
This endpoint requires credentials:read. The response reports configured
provider, extra, and service-token references with present flags and
redacted values:
{
"inventory": {
"providers": {
"openai": {
"name": "provider:openai",
"env_var": "OPENAI_API_KEY",
"present": true,
"value": "[REDACTED]"
}
},
"extra": {}
}
}
Every successful credential inventory request records a
credential.inventory_accessed audit event. Audit metadata includes counts and
which reference names were present; raw secret values are never included.
Environment policy management¶
Environment policy enforcement can be managed through the service when a deployment needs operator-controlled gates outside checked-in project config:
Reads require policies:read; writes require policies:write. A policy
update stores validated EnvironmentPolicyEnforcement settings in the run
state store's project_metadata table/key-value area:
{
"enabled": true,
"fail_on_warnings": false,
"blocked_diagnostic_codes": [],
"require_human_review": true,
"require_tool_permissions": true,
"deny_unrestricted_tools": true,
"require_guardrails": false
}
The response includes the current policy and, for updates, the previous policy
when one existed. Each successful update records a policy.changed audit event
with the policy surface, environment, previous/current redacted policy payloads,
and changed field names.
Request limits¶
The service rejects oversized request bodies before route handling and applies
strict request-model validation for inline payloads. Unknown request fields are
rejected with 422, and bodies above the configured transport limit return
413.
Default limits:
- KNEO_SERV_MAX_BODY_BYTES: 1048576
- KNEO_SERV_MAX_INPUT_CHARS: 20000
- KNEO_SERV_MAX_HUMAN_CONTENT_CHARS: 20000
- KNEO_SERV_MAX_INLINE_SPEC_BYTES: 262144
- KNEO_SERV_MAX_OVERRIDES_BYTES: 65536
- KNEO_SERV_MAX_METADATA_BYTES: 32768
- KNEO_SERV_MAX_LIST_ITEMS: 100
- KNEO_SERV_MAX_PATH_CHARS: 4096
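The ordering of the two checks (size first, then strict field validation) can be sketched as follows. The field set is an assumption borrowed from the create-run example elsewhere on this page; the real request model is defined by the OpenAPI schema and may accept more fields.

```python
import json

MAX_BODY_BYTES = 1048576  # default KNEO_SERV_MAX_BODY_BYTES

# Assumption: fields seen in the create-run example, not the full model.
ALLOWED_FIELDS = {"input", "spec_path", "target", "environment", "async_mode"}


def check_request(raw_body: bytes) -> int:
    """Return the HTTP status this sketch would assign to a request body."""
    if len(raw_body) > MAX_BODY_BYTES:
        return 413  # rejected before the body is parsed or routed
    payload = json.loads(raw_body)
    if set(payload) - ALLOWED_FIELDS:
        return 422  # unknown request fields are rejected
    return 200
```

Checking the size before parsing means an oversized body never reaches the JSON decoder.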
Structured logging¶
API requests emit redacted JSON log records on the kneo_serv.service logger.
Each request record includes event=http_request, request_id, method, path,
status code, duration, client IP when available, and route-supplied run,
continuation, or trace IDs when known.
Clients can send X-Request-ID; otherwise the service generates one. The
response always includes the effective X-Request-ID.
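A minimal sketch of how an effective request ID could be chosen, assuming header names are matched case-insensitively and that a generated ID is a random UUID in hex form. Both are assumptions; the service's generated format is not specified here.

```python
import uuid


def effective_request_id(headers: dict) -> str:
    """Prefer a client-supplied X-Request-ID, otherwise generate one."""
    for name, value in headers.items():
        if name.lower() == "x-request-id" and value.strip():
            return value.strip()
    # Assumption: any unique string would do; uuid4 hex is used here.
    return uuid.uuid4().hex
```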
Configuration:
- KNEO_SERV_REQUEST_LOGS: defaults to true
- KNEO_SERV_LOG_LEVEL: defaults to INFO
SDK OpenTelemetry tracing¶
When SDK telemetry support is installed, set KNEO_SERV_OTEL_ENABLED=true to
attach kneo_agent.observability.OpenTelemetryMiddleware to SDK-backed agents.
The middleware uses the OpenTelemetry global tracer provider, so exporters and
resources can be configured with standard OTEL_* environment variables in the
deployment environment.
Service defaults keep potentially sensitive span attributes disabled:
- KNEO_SERV_OTEL_RECORD_ARGUMENTS: defaults to false
- KNEO_SERV_OTEL_RECORD_RESULTS: defaults to false
Enable those only for trusted deployments where tool arguments and results are safe to emit to telemetry backends.
Idempotency¶
POST /runs, POST /specs/run, and
POST /human-tasks/{continuation_id}/resume support the Idempotency-Key
header. When the same key is reused with the same request payload, the service
returns the original response without creating a duplicate run or submitting a
second human decision.
Reusing a key with a different payload returns 409 with
idempotency_key_conflict.
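The replay and conflict rules can be sketched with an in-memory store. This is an illustration under stated assumptions: the `IdempotencyStore` class, the SHA-256 payload fingerprint, and the `(status, body)` tuple shape are hypothetical, and the service persists its records in the run state store rather than in a dict.

```python
import hashlib
import json


class IdempotencyStore:
    """In-memory sketch: key -> (payload fingerprint, stored response)."""

    def __init__(self):
        self._records = {}

    @staticmethod
    def _fingerprint(payload: dict) -> str:
        # Canonical JSON so key order does not change the fingerprint.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def execute(self, key: str, payload: dict, handler):
        """Run `handler` once per key; replay or reject on reuse."""
        fingerprint = self._fingerprint(payload)
        if key in self._records:
            stored_fingerprint, response = self._records[key]
            if stored_fingerprint != fingerprint:
                return 409, {"error": "idempotency_key_conflict"}
            return response  # same key, same payload: replay
        response = handler(payload)  # handler returns (status, body)
        self._records[key] = (fingerprint, response)
        return response
```

A second call with the same key and payload returns the stored response without invoking the handler again, so no duplicate run is created.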
The CLI service client can send a key per call in code, or read one from:
Human-task resume also takes a store-backed continuation lock. If another
process is already resuming the same continuation, the service returns 409
with resource_locked.
Run cancellation¶
POST /runs/{run_id}/cancel marks a pending or running run as cancelled.
Background execution receives a cooperative cancellation token through the
SDK run config extra payload, so service workflows, agents, runtimes, and
wrapped workflow steps check cancellation before and after unit-of-work
boundaries. A cancelled run is not overwritten as completed if execution
returns after cancellation was requested.
Provider calls that do not expose an interrupt primitive can only stop at the next cooperative boundary after the provider returns.
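Cooperative cancellation as described above can be sketched with a token that is checked before and after each unit of work. `CancellationToken` and `run_steps` are hypothetical names for this example; the SDK's own token type is not shown on this page.

```python
import threading


class CancellationToken:
    """Thread-safe flag that a cancel request sets and workers poll."""

    def __init__(self):
        self._event = threading.Event()

    def cancel(self) -> None:
        self._event.set()

    @property
    def cancelled(self) -> bool:
        return self._event.is_set()


def run_steps(steps, token: CancellationToken):
    """Run callables in order, stopping at the next boundary once cancelled."""
    completed = []
    for step in steps:
        if token.cancelled:  # boundary check before the unit of work
            return "cancelled", completed
        completed.append(step())  # a running step is never interrupted
        if token.cancelled:  # boundary check after the unit of work
            return "cancelled", completed
    return "completed", completed
```

Because the check after the final step runs before the function reports `completed`, a cancellation requested during the last step is still honored.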
Retry, timeout, and backoff¶
Service-client retries are configured with KNEO_SERV_CLIENT_* variables.
Provider/runtime and MCP calls use the same conservative policy shape:
export KNEO_SERV_PROVIDER_RETRIES=2
export KNEO_SERV_PROVIDER_RETRY_BACKOFF_SECONDS=0.25
export KNEO_SERV_PROVIDER_TIMEOUT_SECONDS=120
export KNEO_SERV_MCP_RETRIES=2
export KNEO_SERV_MCP_RETRY_BACKOFF_SECONDS=0.25
export KNEO_SERV_MCP_TIMEOUT_SECONDS=30
Workflow steps can also set on_error: retry, max_retries, and
timeout_seconds in YAML specs. Cancellation is never retried.
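A rough sketch of the policy shape: a bounded number of retries with a pause between attempts, and cancellation passed straight through. The `Cancelled` exception is hypothetical, the fixed delay is an assumption (the page does not say whether the backoff grows between attempts), and timeouts are left out for brevity.

```python
import time


class Cancelled(Exception):
    """Hypothetical exception raised when a run has been cancelled."""


def call_with_retries(fn, retries=2, backoff_seconds=0.25, sleep=time.sleep):
    """Call `fn`, retrying up to `retries` times after the first failure."""
    attempt = 0
    while True:
        try:
            return fn()
        except Cancelled:
            raise  # cancellation is never retried
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            sleep(backoff_seconds)  # assumption: fixed delay per attempt
```

The `sleep` parameter is injectable so the behavior can be exercised in tests without waiting.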
Health checks¶
This section is the API contract. For an on-call triage tree mapping each
/readyz check to recovery actions, see
incident_response.md.
- GET /healthz: lightweight API health.
- GET /livez: process liveness.
- GET /readyz: readiness for API wiring, run state store, continuation store, durable run queue, runtime registry, tool registry, and configured provider or MCP secret dependencies.
Provider and MCP dependency checks are opt-in so local development does not fail when no real upstream credentials are configured:
If a configured readiness dependency is missing or unhealthy, /readyz
returns 503 with a structured not_ready detail payload.
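One way to picture the readiness decision is as an aggregation over named checks. The sketch below is an assumption about shape only: the check names, the `failing` field, and the exact payload layout are hypothetical, and the authoritative schema is in openapi.json.

```python
def readiness(checks: dict):
    """Run named zero-argument checks and return (status, payload)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a check that raises counts as unhealthy
    failing = sorted(name for name, ok in results.items() if not ok)
    if failing:
        return 503, {"error": "not_ready", "checks": results, "failing": failing}
    return 200, {"ok": True, "checks": results}
```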
Background worker queue¶
Async run creation enqueues run IDs into the configured run state store before worker execution. SQLite and file stores persist queue records with status, attempt count, lease owner, lease expiry, and error details; in-memory stores keep the same contract for tests and local ephemeral use.
Workers claim queued or expired leased records, execute the run through the
same PlatformManager.execute_run path, and then mark the queue record
completed or failed. On service startup the default manager starts a worker
so previously queued records can be resumed.
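The claim step ("queued or expired leased records") can be sketched over an in-memory list. The record fields mirror the ones named above, but the `QueueRecord` class, the status strings, and the 60-second lease are assumptions for illustration; a real store would perform the claim atomically.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueRecord:
    run_id: str
    status: str = "queued"  # hypothetical values: queued, leased, completed, failed
    attempts: int = 0
    lease_owner: Optional[str] = None
    lease_expires_at: float = 0.0
    error: Optional[str] = None


def claim_next(records, worker_id: str, lease_seconds: float = 60.0, now=None):
    """Lease the first record that is queued or whose lease has expired."""
    now = time.time() if now is None else now
    for record in records:
        expired = record.status == "leased" and record.lease_expires_at <= now
        if record.status == "queued" or expired:
            record.status = "leased"
            record.attempts += 1
            record.lease_owner = worker_id
            record.lease_expires_at = now + lease_seconds
            return record
    return None  # nothing claimable right now
```

Reclaiming expired leases is what lets a restarted service pick up runs whose previous worker stopped without marking them completed or failed.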
Recovery and continuation¶
Workflow execution stores live execution context on run state and persists step/node completion and failure checkpoints. For interrupted non-human sequential workflows, the service can report the completed steps, failed step, resume input, and next step index:
When replay_context.can_continue is true, the run can continue from the last
completed step boundary:
Graph workflows expose replay context from node checkpoints, but automatic continuation is limited to sequential workflows until graph edge state is persisted at each routing decision.
Replay and checkpoint diff¶
Operators can inspect a compact replay timeline without reading full checkpoint payloads:
The response includes checkpoint sequence, type, step/node IDs, status,
current execution position, pending human request ID, error summary, and the
same replay context used by /runs/{run_id}/recovery.
Checkpoint diffs compare checkpoint state and metadata. By default the latest two checkpoints are compared:
GET /runs/{run_id}/checkpoints/diff
GET /runs/{run_id}/checkpoints/diff?from_sequence=1&to_sequence=3
The diff response reports added, removed, and changed flattened paths. Values are redacted before returning.
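A diff over flattened paths can be sketched as below. The dotted and bracketed path notation is an assumption made for the example, and empty dicts or lists produce no path in this simplified version; the service's actual path format may differ.

```python
def flatten(value, prefix=""):
    """Map nested dicts and lists to {path: leaf value}."""
    items = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            items.update(flatten(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            items.update(flatten(child, f"{prefix}[{index}]"))
    else:
        items[prefix] = value
    return items


def diff_checkpoints(before, after):
    """Report added, removed, and changed flattened paths."""
    old, new = flatten(before), flatten(after)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

Comparing a checkpoint with `{"state": {"step": 1}}` against one with `{"state": {"step": 2}}` reports `state.step` under `changed`.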
Audit events¶
The service records redacted audit events in the configured run state store for successful spec operations, run creation, run cancellation, run continuation, spec-run execution, and human-in-the-loop decisions.
The audit list endpoint requires audit:read and returns events newest first.
Each event includes event_type, actor, optional run_id and
continuation_id, redacted metadata, and created_at.
SQLite migrations¶
SQLite state stores apply versioned migrations on startup. The migration table
is schema_migrations, and the current schema covers run state, checkpoints,
idempotency records, locks, durable run queue records, continuation records,
audit event records, and project metadata records.
Existing unversioned SQLite databases are upgraded in place with
CREATE TABLE IF NOT EXISTS and CREATE INDEX IF NOT EXISTS statements, so
existing run payloads remain readable after migration.
Project metadata is used by service-managed environment policies. Upgrade coverage verifies that existing SQLite databases can create, persist, and reload policy metadata after migrations have applied.
Retention and pruning¶
RetentionManager provides an operator-callable pruning job for run state,
checkpoints, completed or failed queue records, file-backed continuations,
artifacts, and logs. It can be configured directly or through environment
variables:
export KNEO_SERV_RETENTION_RUNS_DAYS=30
export KNEO_SERV_RETENTION_CHECKPOINTS_DAYS=30
export KNEO_SERV_RETENTION_QUEUE_DAYS=14
export KNEO_SERV_RETENTION_CONTINUATIONS_DAYS=30
export KNEO_SERV_RETENTION_ARTIFACTS_DAYS=30
export KNEO_SERV_RETENTION_LOGS_DAYS=30
The platform manager exposes prune_retention() for embedded operators and
future scheduled jobs.
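How a retention window turns into a pruning decision can be sketched as a cutoff calculation. This is not the RetentionManager API: `retention_cutoff`, `expired`, the `updated_at` field, and the rule that an unset variable disables pruning are all assumptions for the example.

```python
import os
from datetime import datetime, timedelta, timezone


def retention_cutoff(kind: str, now=None, environ=None):
    """Return the oldest timestamp to keep for `kind`, or None when unset."""
    environ = os.environ if environ is None else environ
    raw = environ.get(f"KNEO_SERV_RETENTION_{kind.upper()}_DAYS")
    if raw is None:
        return None  # assumption: no variable means nothing is pruned
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=int(raw))


def expired(records, cutoff):
    """Select records whose hypothetical `updated_at` is older than the cutoff."""
    if cutoff is None:
        return []
    return [record for record in records if record["updated_at"] < cutoff]
```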
Checkpoint payload limits¶
SQLite and file stores transparently compress large checkpoint payloads before writing them. If a checkpoint remains above the hard cap after compression, the store persists a bounded checkpoint preview that keeps run ID, checkpoint type, step/node IDs, timestamps, limited trace previews, and metadata describing the size reduction.
Defaults:
- KNEO_SERV_CHECKPOINT_COMPRESS_BYTES: 65536
- KNEO_SERV_CHECKPOINT_MAX_BYTES: 1048576
- KNEO_SERV_CHECKPOINT_PREVIEW_CHARS: 1200
- KNEO_SERV_CHECKPOINT_MAX_LIST_ITEMS: 20
- KNEO_SERV_CHECKPOINT_MAX_DICT_ITEMS: 50
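The compress-then-cap decision can be sketched with three of the defaults above. This is a simplified illustration: zlib is an assumed codec, the envelope and preview fields are hypothetical, and the list and dict item limits are not modeled.

```python
import json
import zlib


def encode_checkpoint(checkpoint: dict, compress_bytes=65536,
                      max_bytes=1048576, preview_chars=1200) -> dict:
    """Store small payloads as JSON, large ones compressed, huge ones as a preview."""
    text = json.dumps(checkpoint, sort_keys=True)
    raw = text.encode("utf-8")
    if len(raw) <= compress_bytes:
        return {"encoding": "json", "data": raw}
    packed = zlib.compress(raw)  # assumption: zlib stands in for the real codec
    if len(packed) <= max_bytes:
        return {"encoding": "zlib", "data": packed}
    # Still above the hard cap: keep identifying fields and a bounded preview.
    preview = {
        "run_id": checkpoint.get("run_id"),
        "type": checkpoint.get("type"),
        "preview": text[:preview_chars],
        "metadata": {"truncated": True, "original_bytes": len(raw)},
    }
    return {"encoding": "preview", "data": json.dumps(preview).encode("utf-8")}
```

Passing smaller limits, for example `max_bytes=10`, is a convenient way to exercise the preview branch in tests.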
Backup and restore¶
This section documents the Python backup API. For the operator-facing
production procedure (PostgreSQL pg_dump, off-site rotation, restore
verification, DR checklist), see
backup_and_recovery.md.
The default SQLite store can be backed up online with SQLite's backup API:
from kneo_serv.maintenance import backup_sqlite_database, restore_sqlite_database
backup_sqlite_database(".kneo/kneo_runs.sqlite", ".kneo/backups/kneo_runs.sqlite")
restore_sqlite_database(".kneo/backups/kneo_runs.sqlite", ".kneo/kneo_runs.restored.sqlite")
The smoke test covers run state and checkpoint restore from the copied database. File-backed continuations, artifacts, and logs should be included in deployment-level filesystem backups when those paths are used.
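For readers unfamiliar with SQLite's online backup API, the sketch below shows the standard-library call it refers to, `sqlite3.Connection.backup`. It is not the implementation of `backup_sqlite_database`; the `backup_sqlite` helper name is hypothetical.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(source_path: str, dest_path: str) -> None:
    """Copy a live SQLite database using the online backup API."""
    Path(dest_path).parent.mkdir(parents=True, exist_ok=True)
    source = sqlite3.connect(source_path)
    dest = sqlite3.connect(dest_path)
    try:
        source.backup(dest)  # consistent copy even if the source is in use
    finally:
        dest.close()
        source.close()
```

After a backup, opening the copy and running a simple query against a known table is a cheap way to confirm it is readable before relying on it.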
Runs¶
- POST /v1/runs
- GET /v1/runs
- GET /v1/runs/{run_id}
- POST /v1/runs/{run_id}/cancel
- GET /v1/runs/{run_id}/policy-report
- GET /v1/runs/{run_id}/recovery
- GET /v1/runs/{run_id}/replay
- POST /v1/runs/{run_id}/continue
- GET /v1/runs/{run_id}/checkpoints
- GET /v1/runs/{run_id}/checkpoints/diff
- GET /v1/runs/{run_id}/trace
Legacy aliases:
- GET /runs
- POST /runs
- GET /runs/{run_id}
- POST /runs/{run_id}/cancel
- GET /runs/{run_id}/recovery
- GET /runs/{run_id}/replay
- POST /runs/{run_id}/continue
- GET /runs/{run_id}/checkpoints
- GET /runs/{run_id}/checkpoints/diff
- GET /runs/{run_id}/trace
Human tasks¶
- GET /v1/human-tasks
- GET /v1/human-tasks/{continuation_id}
- POST /v1/human-tasks/{continuation_id}/resume
Legacy aliases:
- GET /human-tasks
- GET /human-tasks/{continuation_id}
- POST /human-tasks/{continuation_id}/resume
Specs¶
- POST /v1/specs/validate
- POST /v1/specs/compile
- POST /v1/specs/explain
- POST /v1/specs/run
Legacy aliases:
- POST /specs/validate
- POST /specs/compile
- POST /specs/explain
- POST /specs/run
Audit¶
- GET /v1/audit-events
Legacy alias:
- GET /audit-events
Security and policies¶
- GET /v1/security/credentials
- GET /v1/policies/environment
- GET /v1/policies/environment/{environment}
- PUT /v1/policies/environment/{environment}
Legacy aliases:
- GET /security/credentials
- GET /policies/environment
- GET /policies/environment/{environment}
- PUT /policies/environment/{environment}
Worked examples¶
Concrete curl invocations and abbreviated response shapes for the most
common endpoints. The full schema is in
openapi.json; these are illustrative.
All examples assume:
Health¶
curl -sf "$BASE/livez" # {"ok": true, "metadata": {}}
curl -sf "$BASE/readyz" # 200 with checks: {} or 503 with not_ready details
/livez and /readyz are intentionally unauthenticated. See
troubleshooting.md § 1.2 for
the failure shape.
Create a run¶
Required scope: runs:write.
curl -sf -X POST "$BASE/v1/runs" \
-H "Authorization: Bearer $KEY" \
-H 'Content-Type: application/json' \
-d '{
"input": "Summarize Nvidia AI strategy",
"spec_path": "examples/research_agent.yaml",
"target": "workflow",
"environment": "prod",
"async_mode": false
}' | jq
Synchronous response (run finished within the request):
{
"run_id": "run_2026-05-10T12:34:56_a1b2c3d4",
"status": "succeeded",
"output_text": "Nvidia's AI strategy hinges on …",
"human_intervention_required": false,
"continuation_id": null,
"metadata": {"workflow_kind": "sequential", "trace_event_count": 7}
}
If the workflow pauses on a human step:
{
"run_id": "run_…",
"status": "paused",
"output_text": null,
"human_intervention_required": true,
"continuation_id": "cont_…",
"metadata": {"pending_human_request": {"request_id": "req_…", "prompt": "Approve the draft?"}}
}
For retry-safe submissions, send an Idempotency-Key header. Reusing
the same key with the same body replays the original response;
mismatched bodies return 409 idempotency_key_conflict.
Get run state¶
Required scope: runs:read.
{
"run_id": "run_…",
"status": "running",
"agent_name": "research-copilot",
"workflow_name": "research-pipeline",
"workflow_kind": "sequential",
"current_step_index": 1,
"current_node_id": "analyze",
"visited_steps": ["retrieve"],
"visited_nodes": ["retrieve"],
"trace_event_count": 4,
"metadata": {"environment": "prod"}
}
For terminal status:
{
"run_id": "run_…",
"status": "succeeded",
"output_text": "…",
"visited_steps": ["retrieve", "analyze", "summarize"],
"trace_event_count": 11
}
List runs (paginated)¶
curl -sf "$BASE/v1/runs?status=running&limit=20&sort_by=created_at&sort_order=desc" \
-H "Authorization: Bearer $KEY" | jq
{
"runs": [
{"run_id": "run_…", "status": "running", "workflow_name": "research-pipeline", "created_at": "2026-05-10T12:30:00Z"},
{"run_id": "run_…", "status": "running", "workflow_name": "approval-workflow", "created_at": "2026-05-10T12:28:11Z"}
],
"count": 2,
"total": 2,
"limit": 20,
"offset": 0,
"sort_by": "created_at",
"sort_order": "desc"
}
Cancel a run¶
The run transitions to cancelled; cancellation is cooperative —
in-flight steps stop at unit-of-work boundaries. See
troubleshooting.md § 5.2.
Validate a spec¶
Required scope: specs:read.
curl -sf -X POST "$BASE/v1/specs/validate" \
-H "Authorization: Bearer $KEY" \
-H 'Content-Type: application/json' \
-d '{"spec_path": "examples/research_agent.yaml", "environment": "prod"}' | jq
{
"valid": true,
"diagnostics": [],
"report": {
"agent_name": "research-copilot",
"workflow_name": "research-pipeline"
}
}
For an invalid spec, valid is false and diagnostics is populated:
{
"valid": false,
"diagnostics": [
{
"severity": "error",
"code": "E_UNKNOWN_TOOL",
"message": "Tool 'web_search' is not registered.",
"path": "agent.tools[0]"
}
]
}
List human tasks¶
Required scope: human:read.
{
"tasks": [
{
"continuation_id": "cont_…",
"run_id": "run_…",
"request": {
"request_id": "req_…",
"prompt": "Approve the draft?",
"deadline_epoch": 1715432400
}
}
],
"count": 1,
"total": 1,
"limit": 100,
"offset": 0
}
Resume a human task¶
Required scope: human:write. Pair with Idempotency-Key for safe
retries.
curl -sf -X POST "$BASE/v1/human-tasks/cont_…/resume" \
-H "Authorization: Bearer $KEY" \
-H "Idempotency-Key: $(uuidgen)" \
-H 'Content-Type: application/json' \
-d '{
"request_id": "req_…",
"decision": "approved",
"content": "Looks good. Ship it."
}' | jq
{
"run_id": "run_…",
"status": "succeeded",
"output_text": "Published. https://…",
"human_intervention_required": false,
"continuation_id": null,
"metadata": {}
}
decision is one of approved, rejected, edited, selected,
provided. See human_in_the_loop.md.
List audit events¶
Required scope: audit:read. Audit payloads are redacted; secret and
PII patterns never appear.
{
"events": [
{
"id": "evt_…",
"event_type": "human.decision",
"actor": "reviewer",
"occurred_at": "2026-05-10T12:35:01Z",
"payload": {
"request_id": "req_…",
"decision": "approved",
"selected_option": null,
"result_status": "succeeded",
"content_present": true
}
}
],
"count": 1
}
Inspect credential references¶
Required scope: credentials:read. Returns presence metadata only;
secret values never appear.
{
"inventory": {
"providers": [
{"name": "openai", "env_var": "OPENAI_API_KEY", "present": true, "value": "***REDACTED***"},
{"name": "anthropic", "env_var": "ANTHROPIC_API_KEY", "present": false, "value": null}
],
"mcp": [],
"service": [
{"name": "operator", "env_var": "KNEO_SERV_API_KEYS", "present": true, "value": "***REDACTED***"}
]
}
}
Each access records a credential.inventory_accessed audit event.
Read or update environment policy¶
Read requires policies:read; write requires policies:write.
{
"environment": "prod",
"policy": {
"enabled": true,
"fail_on_warnings": false,
"blocked_diagnostic_codes": ["E_UNSAFE_TOOL_IMPORT"],
"require_human_review": false,
"require_tool_permissions": true,
"deny_unrestricted_tools": true,
"require_guardrails": false
}
}
curl -sf -X PUT "$BASE/v1/policies/environment/prod" \
-H "Authorization: Bearer $KEY" \
-H 'Content-Type: application/json' \
-d '{
"enabled": true,
"require_tool_permissions": true,
"deny_unrestricted_tools": true,
"blocked_diagnostic_codes": ["E_UNSAFE_TOOL_IMPORT", "E_UNSAFE_FUNCTION_IMPORT"]
}' | jq
The response includes previous_policy so you can audit what changed.
Each write records a policy.changed audit event.
Error response shape¶
All error paths use the same envelope:
{
"error": "forbidden",
"message": "Missing required scope: runs:write",
"required_scope": "runs:write"
}
Common error codes: unauthorized (401), forbidden (403),
invalid_request (400), not_found (404), idempotency_key_conflict
(409), payload_too_large (413), not_ready (503). Errors map through
service/errors.py.
Pagination, filtering, and sorting¶
List-style endpoints return the original collection field plus pagination metadata:
{
"count": 25,
"total": 91,
"limit": 25,
"offset": 50,
"sort_by": "updated_at",
"sort_order": "desc"
}
Supported query parameters:
- GET /v1/runs: status, limit, offset, sort_by, sort_order
- GET /v1/runs/{run_id}/checkpoints: type, limit, offset, sort_by, sort_order
- GET /v1/runs/{run_id}/trace: event_type, limit, offset, sort_by, sort_order
- GET /v1/human-tasks: run_id, workflow_kind, limit, offset, sort_by, sort_order
sort_order is asc or desc; limit is capped at 1000.
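The relationship between `count`, `total`, `limit`, and `offset` can be sketched as follows. The `paginate` helper, the `items` field name, the default limit of 100, and raising `ValueError` for a bad sort order are assumptions; each real endpoint returns its own collection field (for example `runs` or `tasks`).

```python
def paginate(items, limit=100, offset=0, sort_by="created_at", sort_order="desc"):
    """Sort, slice, and wrap a list with pagination metadata."""
    if sort_order not in ("asc", "desc"):
        raise ValueError("sort_order must be 'asc' or 'desc'")
    limit = max(0, min(limit, 1000))  # limit is capped at 1000
    ordered = sorted(items, key=lambda item: item[sort_by],
                     reverse=(sort_order == "desc"))
    page = ordered[offset:offset + limit]
    return {
        "items": page,          # hypothetical collection field name
        "count": len(page),     # rows in this page
        "total": len(ordered),  # rows matching the query overall
        "limit": limit,
        "offset": offset,
        "sort_by": sort_by,
        "sort_order": sort_order,
    }
```

With 91 matching rows, `limit=25` and `offset=50` yield `count` 25 and `total` 91, the same relationship as in the metadata example above.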