Self-hosted observability
kneo-agent ships an OpenTelemetryMiddleware that emits GenAI
semantic-convention spans. The SDK does not bundle an exporter —
that's a deployment concern your application owns. This guide
shows how to wire the middleware to common self-hostable backends:
OTLP collector, Jaeger, Grafana Tempo, and SigNoz.
The kneo-agent side is identical for all four — the only thing that changes is which container you point the exporter at.
1. Install the telemetry extra
The [telemetry] extra brings in opentelemetry-api only, on
purpose — it's the part the SDK uses. The exporter (-otlp,
-jaeger, -zipkin, …) is a deployment choice, so apps install it
explicitly.
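Before wiring anything up, it can help to confirm that both halves are present: the API pulled in by the extra and the exporter you installed yourself. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is hypothetical, and the module list assumes the OTLP/HTTP exporter used later in this guide.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises when a parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Modules this guide imports; swap the last entry if you chose a different exporter.
REQUIRED = [
    "opentelemetry.trace",
    "opentelemetry.sdk.trace",
    "opentelemetry.exporter.otlp.proto.http.trace_exporter",
]

if __name__ == "__main__":
    print(missing_modules(REQUIRED) or "all telemetry modules importable")
```

An empty result means the imports in steps 2 and 3 should resolve; any name that is listed points at a package that still needs installing.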
2. Attach the middleware
Wire it into your agent the same way as any other middleware:
from kneo_agent import build_agent
from kneo_agent.observability import OpenTelemetryMiddleware

agent = build_agent(
    "openai",
    base_url="http://localhost:11434/v1",
    model="llama3.1",
    middlewares=[
        OpenTelemetryMiddleware(
            record_arguments=True,  # set False to suppress tool-arg attrs
            record_results=False,   # set False to suppress final-result attrs
        ),
    ],
)
Pair it with RedactionMiddleware if your inputs may contain
secrets — see api_stability.md and the
kneo_agent.middleware package docstring for the recommended
ordering (Redaction outer, OTel inner; record_results=False keeps
un-redacted output out of spans).
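The reasoning behind that ordering can be shown with a toy model. This sketch does not use the kneo-agent middleware API: `with_redaction`, `with_tracing`, and `run_agent` are hypothetical stand-ins, and it assumes the outer layer redacts both the prompt on the way in and the reply on the way out. The point is only that an inner tracing layer sees already-redacted input but still sees raw output.

```python
import re

SECRET = re.compile(r"sk-[A-Za-z0-9]+")


def redact(text):
    """Replace anything that looks like a secret token."""
    return SECRET.sub("[REDACTED]", text)


def run_agent(prompt):
    # Stand-in for the model call; its answer leaks a secret of its own.
    return f"echo: {prompt} (token sk-fromTool456)"


def with_tracing(inner, spans, record_results):
    """Inner layer: records what it sees into `spans`."""
    def wrapped(prompt):
        span = {"input": prompt}
        result = inner(prompt)
        if record_results:
            span["result"] = result  # raw output, the outer layer has not run yet
        spans.append(span)
        return result
    return wrapped


def with_redaction(inner):
    """Outer layer: scrubs the prompt going in and the reply coming out."""
    def wrapped(prompt):
        return redact(inner(redact(prompt)))
    return wrapped


spans = []
agent = with_redaction(with_tracing(run_agent, spans, record_results=False))
reply = agent("use key sk-abc123")
```

With `record_results=False` the span holds only the redacted prompt. Flipping it to `True` in this toy would store the raw reply, secret included, even though the caller receives a scrubbed one.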
3. Configure the SDK exporter
This is standard OpenTelemetry setup; kneo-agent plays no part in it. Configure the exporter once at application startup, before any agent runs.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(endpoint="http://localhost:4318/v1/traces")
    )
)
trace.set_tracer_provider(provider)
Switch to gRPC by importing from
opentelemetry.exporter.otlp.proto.grpc.trace_exporter and using
the :4317 port.
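The two transports differ in port and in whether the URL carries a path, which is easy to get wrong when switching. A small sketch of that mapping follows; the helper name `otlp_traces_endpoint` is hypothetical, and it assumes a plaintext (non-TLS) collector on the conventional ports used throughout this guide.

```python
def otlp_traces_endpoint(host="localhost", protocol="http"):
    """Build the endpoint string for the chosen OTLP transport.

    HTTP exporters post to the /v1/traces path on port 4318;
    gRPC exporters take a bare host and port on 4317.
    """
    if protocol == "http":
        return f"http://{host}:4318/v1/traces"
    if protocol == "grpc":
        return f"http://{host}:4317"
    raise ValueError(f"unknown OTLP protocol: {protocol!r}")


if __name__ == "__main__":
    print(otlp_traces_endpoint())                # HTTP, local collector
    print(otlp_traces_endpoint("tempo", "grpc"))  # gRPC, compose service name
```

Pass the result as the `endpoint=` argument of whichever `OTLPSpanExporter` you imported, so that the transport and the URL change together.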
4. Backend recipes
Run any of these alongside your application; pick one and point the OTLP exporter at it. None of them are an SDK artifact — they're infrastructure your application's deployment pulls in.
OTLP collector (vendor-neutral)
# docker-compose.yml
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otelcol/config.yaml"]
    volumes:
      - ./otel-config.yaml:/etc/otelcol/config.yaml
    ports:
      - "4317:4317" # gRPC
      - "4318:4318" # HTTP

# otel-config.yaml: minimal, receive OTLP and log to stdout
receivers:
  otlp:
    protocols:
      grpc:
        # Recent collector releases bind to localhost by default; listen on
        # all interfaces so the published container ports are reachable.
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
exporters:
  debug:
    verbosity: detailed
Jaeger (all-in-one)
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - "16686:16686" # UI
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
Open http://localhost:16686 for the Jaeger UI. The kneo-agent
spans appear under service name kneo_agent (the default span
attribute set by OpenTelemetryMiddleware).
Grafana Tempo + Grafana
services:
  tempo:
    image: grafana/tempo:latest
    command: ["-config.file=/etc/tempo.yaml"]
    volumes:
      - ./tempo.yaml:/etc/tempo.yaml
    ports:
      - "4317:4317"
      - "4318:4318"
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
Point Grafana's data source at http://tempo:3200 and query the
GenAI semantic-convention attributes:
- gen_ai.system
- gen_ai.operation.name
- gen_ai.agent.name
- gen_ai.tool.name
- gen_ai.usage.input_tokens / gen_ai.usage.output_tokens
SigNoz
SigNoz exposes an OTLP receiver on :4317 (gRPC) and :4318
(HTTP). Open http://localhost:3301 for the SigNoz UI.
5. Verify end-to-end
A minimal smoke check that needs no hosted LLM, only a local OpenAI-compatible server:
import asyncio

from kneo_agent import build_agent
from kneo_agent.observability import OpenTelemetryMiddleware

# Configure the exporter as in step 3 first.


async def main():
    agent = build_agent(
        "openai",
        base_url="http://localhost:11434/v1",  # your local server
        model="llama3.1",
        middlewares=[OpenTelemetryMiddleware()],
    )
    print(await agent.chat("ping"))


asyncio.run(main())
Within ~30 seconds, the run should appear in your chosen backend
under the kneo_agent service. The middleware emits one span per
agent run plus child spans for tool calls and model invocations
when those flow through the Bridge runtime.
6. Troubleshooting
- No spans appear. Check that the exporter is configured before
  any agent runs (the TracerProvider is read at first use). Run with
  OTEL_LOG_LEVEL=debug set to surface exporter errors on stderr.
- Spans missing usage attributes. Only the OpenAI Agents native
  runtime populates RunResult.metadata["usage"] in 1.2.0; on other
  runtimes the OTel middleware has nothing to copy. Populate it via a
  custom middleware or wait for the per-runtime rollout in a
  follow-up release.
- TLS errors against an internal collector. The exporter is a
  standard OpenTelemetry component; configure its CA bundle the
  same way you would for any other Python HTTP client (env vars
  like REQUESTS_CA_BUNDLE / SSL_CERT_FILE, or
  OTLPSpanExporter(certificate_file=...)).
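When no spans appear, one quick thing to rule out is that nothing is listening on the OTLP ports at all. The sketch below is a plain TCP reachability probe using only the standard library. The helper name `port_open` is hypothetical, and a successful connection only shows that the port accepts connections, not that the receiver behind it is healthy.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # 4317 is OTLP gRPC and 4318 is OTLP HTTP in the recipes above.
    for port in (4317, 4318):
        state = "reachable" if port_open("localhost", port) else "unreachable"
        print(f"localhost:{port} is {state}")
```

If both ports are unreachable, check the container's port mappings and receiver bind address before looking at the exporter configuration.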