Self-hosted observability

kneo-agent ships an OpenTelemetryMiddleware that emits GenAI semantic-convention spans. The SDK does not bundle an exporter — that's a deployment concern your application owns. This guide shows how to wire the middleware to common self-hostable backends: OTLP collector, Jaeger, Grafana Tempo, and SigNoz.

The kneo-agent side is identical for all four — the only thing that changes is which container you point the exporter at.

1. Install the telemetry extra

pip install "kneo-agent[telemetry]"
pip install opentelemetry-sdk opentelemetry-exporter-otlp

The [telemetry] extra brings in opentelemetry-api only, on purpose — it's the part the SDK uses. The exporter (-otlp, -jaeger, -zipkin, …) is a deployment choice, so apps install it explicitly.
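Because the exporter is installed separately from the extra, a quick preflight check can confirm that everything this guide imports is actually present. The sketch below uses only the standard library; the REQUIRED mapping and the missing_packages helper are illustrative names, not part of kneo-agent.

```python
from importlib.util import find_spec

# Modules imported in this guide, mapped to the pip package that provides them.
# (Illustrative mapping; adjust it if you use a different exporter.)
REQUIRED = {
    "opentelemetry.trace": "opentelemetry-api",
    "opentelemetry.sdk.trace": "opentelemetry-sdk",
    "opentelemetry.exporter.otlp.proto.http.trace_exporter": "opentelemetry-exporter-otlp",
}


def missing_packages(required=REQUIRED):
    """Return the pip package names whose modules cannot be found."""
    missing = []
    for module, package in required.items():
        try:
            found = find_spec(module) is not None
        except ModuleNotFoundError:
            # Raised when a parent package of a dotted module name is absent.
            found = False
        if not found:
            missing.append(package)
    return missing
```

Calling missing_packages() at startup and failing fast on a non-empty result tends to be easier to debug than an ImportError raised later inside the exporter setup.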

2. Attach the middleware

Wire it into your agent the same way as any other middleware:

from kneo_agent import build_agent
from kneo_agent.observability import OpenTelemetryMiddleware

agent = build_agent(
    "openai",
    base_url="http://localhost:11434/v1",
    model="llama3.1",
    middlewares=[
        OpenTelemetryMiddleware(
            record_arguments=True,    # set False to suppress tool-arg attrs
            record_results=False,     # default False; set True to capture final-result attrs
        ),
    ],
)

Pair it with RedactionMiddleware if your inputs may contain secrets — see api_stability.md and the kneo_agent.middleware package docstring for the recommended ordering (Redaction outer, OTel inner; record_results=False keeps un-redacted output out of spans).

3. Configure the OpenTelemetry SDK exporter

This is standard OpenTelemetry configuration; kneo-agent is not involved. Configure it once at application startup, before any agent runs.

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# service.name comes from the Resource you set here; kneo-agent does not set it.
# Without it, OTel reports spans under `unknown_service`.
provider = TracerProvider(resource=Resource.create({"service.name": "kneo_agent"}))
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(endpoint="http://localhost:4318/v1/traces")
    )
)
trace.set_tracer_provider(provider)

To switch to gRPC, import OTLPSpanExporter from opentelemetry.exporter.otlp.proto.grpc.trace_exporter and target port 4317. The gRPC endpoint takes no /v1/traces path (for example, http://localhost:4317).
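The HTTP and gRPC variants differ only in the import path and the endpoint shape, so the choice can be isolated in one small factory. This is a sketch under assumptions: otlp_endpoint and make_exporter are hypothetical helper names, the ports are the OTLP defaults used throughout this guide, and the exporter imports are deferred so that only the selected protocol's package needs to be importable.

```python
def otlp_endpoint(protocol="http", host="localhost"):
    """Return the endpoint string for the chosen OTLP transport."""
    if protocol == "http":
        # OTLP/HTTP posts traces to a fixed path on port 4318.
        return f"http://{host}:4318/v1/traces"
    if protocol == "grpc":
        # OTLP/gRPC uses port 4317 and no URL path.
        return f"http://{host}:4317"
    raise ValueError(f"unsupported OTLP protocol: {protocol!r}")


def make_exporter(protocol="http", host="localhost"):
    """Build the matching OTLPSpanExporter (imports are resolved lazily)."""
    endpoint = otlp_endpoint(protocol, host)
    if protocol == "grpc":
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
            OTLPSpanExporter,
        )
        # insecure=True means plaintext; appropriate only for a local collector.
        return OTLPSpanExporter(endpoint=endpoint, insecure=True)
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
        OTLPSpanExporter,
    )
    return OTLPSpanExporter(endpoint=endpoint)
```

The result of make_exporter("grpc") can be passed to BatchSpanProcessor exactly as the HTTP exporter is in the snippet above.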

4. Backend recipes

Run any of these alongside your application; pick one and point the OTLP exporter at it. None of them are an SDK artifact — they're infrastructure your application's deployment pulls in.

OTLP collector (vendor-neutral)

# docker-compose.yml
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otelcol/config.yaml"]
    volumes:
      - ./otel-config.yaml:/etc/otelcol/config.yaml
    ports:
      - "4317:4317"   # gRPC
      - "4318:4318"   # HTTP

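# otel-config.yaml: minimal config that receives OTLP and logs spans to stdout
receivers:
  otlp:
    protocols:
      grpc:
        # Recent Collector releases bind to localhost by default; listen on
        # all interfaces so the published container ports are reachable.
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318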
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
exporters:
  debug:
    verbosity: detailed

Jaeger (all-in-one)

services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - "16686:16686"  # UI
      - "4317:4317"    # OTLP gRPC
      - "4318:4318"    # OTLP HTTP

Open http://localhost:16686 for the Jaeger UI. The kneo-agent spans appear under service name kneo_agent — set via the Resource on your TracerProvider in step 3, not by the SDK. (The OpenTelemetryMiddleware only names the instrumentation scope kneo_agent; without the Resource, OTel reports unknown_service.)

Grafana Tempo + Grafana

services:
  tempo:
    image: grafana/tempo:latest
    command: ["-config.file=/etc/tempo.yaml"]
    volumes:
      - ./tempo.yaml:/etc/tempo.yaml
    ports:
      - "4317:4317"
      - "4318:4318"
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"

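The mounted tempo.yaml is not shown here; it needs to enable Tempo's OTLP receivers and serve HTTP on port 3200. Point Grafana's Tempo data source at http://tempo:3200 and query the GenAI semantic-convention attributes: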

gen_ai.system
gen_ai.operation.name
gen_ai.agent.name
gen_ai.tool.name
gen_ai.usage.input_tokens / gen_ai.usage.output_tokens

SigNoz

git clone https://github.com/SigNoz/signoz.git
cd signoz/deploy
./install.sh

SigNoz exposes an OTLP receiver on :4317 (gRPC) and :4318 (HTTP). Open http://localhost:3301 for the SigNoz UI.
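Whichever backend you run, it helps to confirm that its OTLP ports accept connections before sending real traces. The following is a minimal sketch using only the standard library; port_open is a hypothetical helper name, and a successful TCP connect only shows that something is listening on the port, not that it speaks OTLP.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


def check_otlp_ports(host="localhost"):
    """Probe the default OTLP gRPC (4317) and HTTP (4318) ports."""
    return {port: port_open(host, port) for port in (4317, 4318)}
```

For example, check_otlp_ports() returning False for 4318 while the container is up usually points at a missing port mapping or a receiver bound to the wrong interface.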

5. Verify end-to-end

A minimal smoke check that needs no hosted LLM, only a local OpenAI-compatible server (the base_url below is Ollama's default address):

import asyncio
from kneo_agent import build_agent
from kneo_agent.observability import OpenTelemetryMiddleware

# Configure the exporter as in step 3 first.

async def main():
    agent = build_agent(
        "openai",
        base_url="http://localhost:11434/v1",   # your local server
        model="llama3.1",
        middlewares=[OpenTelemetryMiddleware()],
    )
    print(await agent.chat("ping"))

asyncio.run(main())

Within ~30 seconds, the run should appear in your chosen backend under the kneo_agent service. The middleware emits one span per agent run plus child spans for tool calls and model invocations when those flow through the Bridge runtime.

6. Troubleshooting

No spans appear. Check the exporter is configured before any agent runs (the TracerProvider is read at first use). Run with OTEL_LOG_LEVEL=debug set to surface exporter errors on stderr.
Spans missing usage attributes. All built-in runtimes — OpenAI Agents native, the three Bridge executors, and all three Adapter paths (Google ADK + LangChain as of 1.4.0, OpenAI Agents thereafter) — populate RunResult.metadata["usage"] when the provider reports it, and the OTel middleware records it on the span. Missing usage usually means the provider returned none for that call.
TLS errors against an internal collector. The exporter is a standard OpenTelemetry component, so configure its CA bundle as you would for any other Python HTTP client: env vars like REQUESTS_CA_BUNDLE / SSL_CERT_FILE, the OpenTelemetry variable OTEL_EXPORTER_OTLP_CERTIFICATE, or OTLPSpanExporter(certificate_file=...).
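For the "No spans appear" case, exporter failures are reported through Python's standard logging module under the opentelemetry logger namespace. If the environment variable has no visible effect in your setup, raising that logger's level in code is a possible alternative. The sketch below is one way to do that; enable_otel_debug_logging is a hypothetical helper name, not a kneo-agent or OpenTelemetry API.

```python
import logging


def enable_otel_debug_logging(level=logging.DEBUG):
    """Send OpenTelemetry's own log records to stderr at the given level."""
    logger = logging.getLogger("opentelemetry")
    logger.setLevel(level)
    if not logger.handlers:
        # StreamHandler writes to stderr by default.
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Call it once before configuring the TracerProvider so that export errors such as connection refusals or TLS failures show up with the emitting module's name attached.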