Self-hosted observability

kneo-agent ships an OpenTelemetryMiddleware that emits GenAI semantic-convention spans. The SDK does not bundle an exporter — that's a deployment concern your application owns. This guide shows how to wire the middleware to common self-hostable backends: OTLP collector, Jaeger, Grafana Tempo, and SigNoz.

The kneo-agent side is identical for all four — the only thing that changes is which container you point the exporter at.

1. Install the telemetry extra

pip install "kneo-agent[telemetry]"
pip install opentelemetry-sdk opentelemetry-exporter-otlp

The [telemetry] extra deliberately pulls in only opentelemetry-api, the part kneo-agent itself uses. The exporter (-otlp, -jaeger, -zipkin, …) is a deployment choice, so applications install it explicitly.
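A quick way to confirm the install step worked is to check that the three modules used later in this guide can be located. This is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of kneo-agent or OpenTelemetry.

```python
from importlib import util

# Module -> pip package that provides it (names taken from the install step).
REQUIRED = {
    "opentelemetry.trace": 'opentelemetry-api (via "kneo-agent[telemetry]")',
    "opentelemetry.sdk.trace": "opentelemetry-sdk",
    "opentelemetry.exporter.otlp.proto.http.trace_exporter": "opentelemetry-exporter-otlp",
}


def missing_packages(required=REQUIRED):
    """Return the pip packages whose modules cannot be found."""
    missing = []
    for module, package in required.items():
        try:
            found = util.find_spec(module) is not None
        except ImportError:
            # Raised when a parent package (e.g. "opentelemetry") is absent.
            found = False
        if not found:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = missing_packages()
    print("all telemetry packages found" if not missing else f"missing: {missing}")
```

Running it inside the same virtual environment as your application rules out the common case where the exporter was installed into a different interpreter.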

2. Attach the middleware

Wire it into your agent the same way as any other middleware:

from kneo_agent import build_agent
from kneo_agent.observability import OpenTelemetryMiddleware

agent = build_agent(
    "openai",
    base_url="http://localhost:11434/v1",
    model="llama3.1",
    middlewares=[
        OpenTelemetryMiddleware(
            record_arguments=True,    # set False to suppress tool-arg attrs
            record_results=False,     # False keeps final-result attrs out of spans
        ),
    ],
)

Pair it with RedactionMiddleware if your inputs may contain secrets — see api_stability.md and the kneo_agent.middleware package docstring for the recommended ordering (Redaction outer, OTel inner; record_results=False keeps un-redacted output out of spans).

3. Configure the SDK exporter

This is standard OpenTelemetry setup; kneo-agent doesn't get involved. Configure it once at app startup, before any agent runs.

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(endpoint="http://localhost:4318/v1/traces")
    )
)
trace.set_tracer_provider(provider)

Switch to gRPC by importing from opentelemetry.exporter.otlp.proto.grpc.trace_exporter and using the :4317 port.
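The HTTP and gRPC variants differ only in the import path and the endpoint, so the choice can be isolated in one small factory. The sketch below assumes a local, plaintext collector on the conventional ports; `otlp_endpoint` and `make_exporter` are hypothetical helper names, while both `OTLPSpanExporter` classes and the `insecure` argument of the gRPC exporter come from the opentelemetry-exporter-otlp package.

```python
def otlp_endpoint(protocol, host="localhost"):
    """Map a protocol name to the conventional OTLP endpoint on `host`."""
    if protocol == "grpc":
        return f"http://{host}:4317"
    if protocol == "http":
        return f"http://{host}:4318/v1/traces"
    raise ValueError(f"unknown OTLP protocol: {protocol!r}")


def make_exporter(protocol="http", host="localhost"):
    """Build the matching OTLPSpanExporter.

    Imports are done lazily so only the exporter you pick has to be installed.
    """
    endpoint = otlp_endpoint(protocol, host)
    if protocol == "grpc":
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
            OTLPSpanExporter,
        )
        # insecure=True means plaintext gRPC: fine for a local collector only.
        return OTLPSpanExporter(endpoint=endpoint, insecure=True)
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
        OTLPSpanExporter,
    )
    return OTLPSpanExporter(endpoint=endpoint)
```

The result drops into the step 3 snippet unchanged, for example `BatchSpanProcessor(make_exporter("grpc"))`.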

4. Backend recipes

Run any one of these alongside your application and point the OTLP exporter at it. None of them ships with kneo-agent; they're infrastructure that your application's deployment pulls in.

OTLP collector (vendor-neutral)

# docker-compose.yml
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otelcol/config.yaml"]
    volumes:
      - ./otel-config.yaml:/etc/otelcol/config.yaml
    ports:
      - "4317:4317"   # gRPC
      - "4318:4318"   # HTTP

# otel-config.yaml (minimal: receive OTLP, log to stdout)
receivers:
  otlp:
    protocols:
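      # Bind to all interfaces: recent collector releases default to localhost,
      # which Docker's published ports cannot reach.
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318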
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
exporters:
  debug:
    verbosity: detailed

Jaeger (all-in-one)

services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - "16686:16686"  # UI
      - "4317:4317"    # OTLP gRPC
      - "4318:4318"    # OTLP HTTP

Open http://localhost:16686 for the Jaeger UI. The kneo-agent spans appear under service name kneo_agent (the default span attribute set by OpenTelemetryMiddleware).
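In OpenTelemetry, the service name a backend groups traces by comes from the `service.name` resource attribute on the TracerProvider, and a provider created without one reports `unknown_service`. If that is what shows up in Jaeger, setting the resource explicitly is the standard fix. The sketch below uses only OpenTelemetry SDK calls; `build_provider` is a hypothetical helper name, and whether OpenTelemetryMiddleware sets a service name on its own is not something this sketch relies on.

```python
def build_provider(service_name="kneo_agent"):
    """Create a TracerProvider whose spans are reported under `service_name`."""
    # Imported here so the helper can live in a module that loads without the SDK.
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider

    resource = Resource.create({"service.name": service_name})
    return TracerProvider(resource=resource)
```

Use it in place of the bare `TracerProvider()` from step 3. Setting the `OTEL_SERVICE_NAME` environment variable before the provider is created has the same effect without a code change.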

Grafana Tempo + Grafana

services:
  tempo:
    image: grafana/tempo:latest
    command: ["-config.file=/etc/tempo.yaml"]
    volumes:
      - ./tempo.yaml:/etc/tempo.yaml
    ports:
      - "4317:4317"
      - "4318:4318"
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"

Point Grafana's data source at http://tempo:3200 and query the GenAI semantic-convention attributes:

  • gen_ai.system
  • gen_ai.operation.name
  • gen_ai.agent.name
  • gen_ai.tool.name
  • gen_ai.usage.input_tokens / gen_ai.usage.output_tokens
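In Grafana's Tempo data source these attributes are queried with TraceQL, where span attributes take a `span.` prefix. The sketch below builds such a filter and the matching URL for Tempo's HTTP search endpoint (`/api/search`). The helper names are hypothetical, the attribute values `openai` and `search` are made-up examples, and `http://localhost:3200` assumes you have also published Tempo's port 3200, which the compose file above does not do.

```python
from urllib.parse import urlencode


def traceql_filter(attributes):
    """Build a TraceQL span filter that requires every attribute to match.

    Values are inserted verbatim, so they must not contain double quotes.
    """
    conditions = " && ".join(
        f'span.{key} = "{value}"' for key, value in attributes.items()
    )
    return "{ " + conditions + " }"


def tempo_search_url(query, base_url="http://localhost:3200", limit=20):
    """Return the Tempo search URL for a TraceQL query (no request is made)."""
    return f"{base_url}/api/search?" + urlencode({"q": query, "limit": limit})


if __name__ == "__main__":
    query = traceql_filter({"gen_ai.system": "openai", "gen_ai.tool.name": "search"})
    print(query)
    print(tempo_search_url(query))
```

The same filter string can be pasted directly into the TraceQL box in Grafana's Explore view.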

SigNoz

git clone https://github.com/SigNoz/signoz.git
cd signoz/deploy
./install.sh

SigNoz exposes an OTLP receiver on :4317 (gRPC) and :4318 (HTTP). Open http://localhost:3301 for the SigNoz UI.

5. Verify end-to-end

A minimal smoke check that needs no hosted LLM, only a local OpenAI-compatible server (the example assumes one listening on port 11434):

import asyncio
from kneo_agent import build_agent
from kneo_agent.observability import OpenTelemetryMiddleware

# Configure the exporter as in step 3 first.

async def main():
    agent = build_agent(
        "openai",
        base_url="http://localhost:11434/v1",   # your local server
        model="llama3.1",
        middlewares=[OpenTelemetryMiddleware()],
    )
    print(await agent.chat("ping"))

asyncio.run(main())

Within ~30 seconds, the run should appear in your chosen backend under the kneo_agent service. The middleware emits one span per agent run plus child spans for tool calls and model invocations when those flow through the Bridge runtime.
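To separate "the middleware emits nothing" from "the backend receives nothing", it can help to capture spans in memory first and leave the network out of the picture. The sketch below uses the OpenTelemetry SDK's `InMemorySpanExporter` and `SimpleSpanProcessor`; `capture_spans` and `demo` are hypothetical helper names, and the hand-made span is only a stand-in for an agent run.

```python
def capture_spans():
    """Return (provider, exporter) that keep finished spans in memory."""
    # Imported here so the helper can live in a module that loads without the SDK.
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import SimpleSpanProcessor
    from opentelemetry.sdk.trace.export.in_memory_span_exporter import (
        InMemorySpanExporter,
    )

    exporter = InMemorySpanExporter()
    provider = TracerProvider()
    # SimpleSpanProcessor exports each span as it ends, so nothing is batched.
    provider.add_span_processor(SimpleSpanProcessor(exporter))
    return provider, exporter


def demo():
    """Record one hand-made span and return (name, attributes) pairs."""
    provider, exporter = capture_spans()
    tracer = provider.get_tracer("smoke-check")
    with tracer.start_as_current_span("demo-run") as span:
        span.set_attribute("gen_ai.operation.name", "chat")
    return [(s.name, dict(s.attributes)) for s in exporter.get_finished_spans()]
```

To check the middleware itself, pass the provider to `trace.set_tracer_provider(...)` before building the agent, run the smoke check above, and inspect `exporter.get_finished_spans()`. This assumes the middleware picks up the global tracer provider, as the troubleshooting notes suggest.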

6. Troubleshooting

  • No spans appear. Check that the exporter is configured before any agent runs (the TracerProvider is read at first use). Set OTEL_LOG_LEVEL=debug to surface exporter errors on stderr.
  • Spans missing usage attributes. Only the OpenAI Agents native runtime populates RunResult.metadata["usage"] in 1.2.0; on other runtimes the OTel middleware has nothing to copy. Populate via custom middleware or wait for the per-runtime rollout in a follow-up release.
  • TLS errors against an internal collector. The exporter is a standard OpenTelemetry component — configure its CA bundle the same way you would for any other Python HTTP client (env vars like REQUESTS_CA_BUNDLE / SSL_CERT_FILE, or OTLPSpanExporter(certificate_file=...)).
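When no spans appear, a useful first step is to confirm that something is listening on the OTLP ports at all. The sketch below is a plain TCP probe using only the standard library; `port_open` is a hypothetical helper name, and `localhost` with ports 4317/4318 matches the recipes above. A successful connect only shows that a listener exists, not that it speaks OTLP.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    for name, port in (("OTLP gRPC", 4317), ("OTLP HTTP", 4318)):
        state = "reachable" if port_open("localhost", port) else "NOT reachable"
        print(f"{name} on localhost:{port}: {state}")
```

If the ports are unreachable from the host but the container is running, check the published ports in the compose file and the address the receiver binds to inside the container.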