Upgrading to 1.2.0
A short, practical "should I upgrade?" walkthrough for users on
1.1.x. The full list of changes is in
CHANGELOG.md; this page covers only the parts
that affect existing code.
TL;DR
- No breaking changes. Every public name from 1.1.x still works
the same way. New features are additive. You can run
pip install --upgrade kneo-agent and keep your existing code unchanged.
- Worth upgrading for: local-LLM support without pre-building an
AsyncOpenAI client; the new middleware bundle (retry / rate-limit /
token-budget / redaction); a documented RunResult.metadata["usage"]
schema; MCP TLS / mTLS support; and a SecretProvider Protocol with
three reference implementations.
- Read this page first if you: pin to a specific subpackage size,
connect to MCP servers behind a corporate CA, or care about the
RunResult.metadata shape.
What's new
1. Local LLMs via base_url=
Before 1.2.0 you had to pre-build an AsyncOpenAI client to point
at Ollama / vLLM / llama.cpp / LocalAI. Now base_url= and
api_key= are first-class on every OpenAI entry point:
from kneo_agent import build_sync_agent

agent = build_sync_agent(
    "openai",
    base_url="http://localhost:11434/v1",
    model="llama3.1",
    system_prompt="You are a concise assistant.",
)
The same parameters land on BridgeAgentFactory.for_openai,
NativeRuntimeFactory.for_openai, and OpenAIAgentsImpl. Passing
both openai_client= and base_url= raises ValueError.
When base_url is set without api_key, the SDK uses a placeholder
string so AsyncOpenAI doesn't error on a missing key (most local
servers ignore it). Set api_key= explicitly when your local server
checks it.
See examples/cookbook/local_ollama.py
for a runnable demo.
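The argument rules described above can be sketched as a small standalone helper. This is an illustration under stated assumptions, not the SDK's code: the function name resolve_openai_kwargs and the placeholder value "not-needed" are hypothetical, and the SDK's actual placeholder string may differ.

```python
from typing import Any, Dict, Optional

# Hypothetical placeholder; the SDK's actual placeholder string may differ.
PLACEHOLDER_KEY = "not-needed"


def resolve_openai_kwargs(
    openai_client: Optional[Any] = None,
    base_url: Optional[str] = None,
    api_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Mirror the documented rules for openai_client / base_url / api_key."""
    if openai_client is not None and base_url is not None:
        # A pre-built client already carries its own endpoint.
        raise ValueError("pass either openai_client or base_url, not both")
    if openai_client is not None:
        return {"client": openai_client}
    kwargs: Dict[str, Any] = {}
    if base_url is not None:
        kwargs["base_url"] = base_url
        # Most local servers ignore the key, but the client requires one.
        kwargs["api_key"] = api_key if api_key is not None else PLACEHOLDER_KEY
    elif api_key is not None:
        kwargs["api_key"] = api_key
    return kwargs
```

An explicit api_key always wins over the placeholder, which matches the advice to set it when the local server checks credentials.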
2. Middleware bundle (kneo_agent.middleware)
Four new middlewares ship under a new subpackage:
from kneo_agent.middleware import (
    RetryMiddleware,
    RateLimitMiddleware,
    TokenBudgetMiddleware,
    RedactionMiddleware,
    COMMON_PATTERNS,
)
- RetryMiddleware: retries transient model / tool failures with exponential backoff; never retries asyncio.CancelledError.
- RateLimitMiddleware: token-bucket throttle on whole runs or individual model calls.
- TokenBudgetMiddleware: per-run / cumulative caps from RunResult.metadata["usage"]. Raises TokenBudgetExceeded.
- RedactionMiddleware: regex scrubbing of secrets in inputs, tool args, tool results, the final result, and streamed chunks. Ships a COMMON_PATTERNS starter pack.
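To make the retry behaviour concrete, here is a minimal standalone sketch of exponential backoff that never swallows cancellation. It is not RetryMiddleware itself: the function name retry_with_backoff and its parameters are hypothetical, and the real middleware's options may differ.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    base_delay: float = 0.1,
    retry_on: Tuple[Type[Exception], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Retry `call` on transient errors, doubling the delay after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except asyncio.CancelledError:
            # Cancellation is a control signal, not a failure: always propagate.
            raise
        except retry_on:
            if attempt == max_attempts:
                raise
            # Delays grow as base_delay * 1, 2, 4, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable: the loop returns or raises")
```

Only the exception types listed in retry_on are retried; anything else surfaces on the first attempt.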
The framework types (AgentMiddleware, BaseAgentMiddleware, the
context dataclasses) are re-exported from kneo_agent.middleware,
so you can import everything from one place.
Ordering rule (covered in detail in the package docstring):
- Static middleware (registered via AgentBuilder.add_middleware) wraps every run, in registration order.
- Per-run middleware (passed via RunConfig.middlewares) is appended inside the static list.
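The ordering rule can be pictured as nested wrappers. The sketch below uses plain functions rather than the SDK's middleware types; tag and compose are hypothetical helpers, and the reading that the first-registered middleware is the outermost layer is taken from the redaction guidance on this page.

```python
from typing import Callable, List

Handler = Callable[[str], str]
Middleware = Callable[[Handler], Handler]


def tag(name: str, log: List[str]) -> Middleware:
    """Build a middleware that records when it is entered and exited."""
    def middleware(next_handler: Handler) -> Handler:
        def wrapped(prompt: str) -> str:
            log.append(f"enter {name}")
            result = next_handler(prompt)
            log.append(f"exit {name}")
            return result
        return wrapped
    return middleware


def compose(static: List[Middleware], per_run: List[Middleware], core: Handler) -> Handler:
    """First static middleware is outermost; per-run ones sit inside the static ones."""
    handler = core
    for mw in reversed(static + per_run):
        handler = mw(handler)
    return handler


log: List[str] = []
run = compose(
    [tag("static-1", log), tag("static-2", log)],
    [tag("per-run", log)],
    str.upper,  # stands in for the actual agent run
)
run("hello")
print(log)
```

The log shows static-1 entered first and exited last, with the per-run layer closest to the core call.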
For redaction + observability specifically: register RedactionMiddleware
as the outer layer (first, via add_middleware) and
OpenTelemetryMiddleware(record_results=False) as the inner layer.
The redaction layer scrubs inputs before the OTel layer
records them, and record_results=False keeps un-redacted output
out of the spans.
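A rough standalone illustration of why the redaction layer goes outside the recording layer. The two patterns below are examples only and are not the contents of COMMON_PATTERNS; the function names and the recorded list (standing in for span attributes) are hypothetical.

```python
import re
from typing import List

# Illustrative patterns only; not the contents of COMMON_PATTERNS.
PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{16,}"),            # API-key-like tokens
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._-]+"),   # bearer credentials
]


def redact(text: str) -> str:
    """Replace every match of every pattern with a fixed marker."""
    for pattern in PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


recorded: List[str] = []  # stands in for span attributes


def record_then_run(prompt: str) -> str:
    """Inner layer: records what it receives, then calls a model stub."""
    recorded.append(prompt)
    return f"echo: {prompt}"


def redacting_run(prompt: str) -> str:
    """Outer layer: scrubs the input before the inner layer sees it,
    and scrubs the result on the way back out."""
    return redact(record_then_run(redact(prompt)))
```

Because the inner layer only ever receives scrubbed input, nothing sensitive reaches the recorded list even though the recorder itself knows nothing about redaction.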
3. Documented RunResult.metadata["usage"] schema
The usage key on RunResult.metadata now has a formal shape:
prompt_tokens / completion_tokens are accepted as aliases for
provider-native naming.
Population today: the OpenAI Agents native runtime (for_openai
in both NativeRuntimeFactory and BridgeAgentFactory) populates
metadata["usage"] automatically by summing across
raw_responses[*].usage. LangChain, Google ADK, and Bridge runtimes
do not yet populate it; if you need usage tracking on those paths,
populate metadata["usage"] from a custom middleware. The full
docstring on RunResult enumerates which runtimes do and don't
populate.
OpenTelemetryMiddleware reads the documented keys and emits the
GenAI semantic-convention attributes
gen_ai.usage.input_tokens / gen_ai.usage.output_tokens.
TokenBudgetMiddleware reads the same keys to enforce caps.
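For runtimes that do not populate usage yet, a custom middleware would need to do roughly the aggregation sketched here. The canonical key names input_tokens / output_tokens / total_tokens are an assumption inferred from the GenAI attribute names above; the RunResult docstring is the authoritative schema. The helper sum_usage is hypothetical.

```python
from typing import Dict, Iterable, Mapping

# Assumed canonical names, inferred from the gen_ai.usage.* attributes;
# check the RunResult docstring for the authoritative schema.
ALIASES = {"prompt_tokens": "input_tokens", "completion_tokens": "output_tokens"}


def sum_usage(responses: Iterable[Mapping[str, int]]) -> Dict[str, int]:
    """Sum per-response usage dicts, folding alias keys into canonical ones."""
    totals = {"input_tokens": 0, "output_tokens": 0}
    for usage in responses:
        for key, value in usage.items():
            canonical = ALIASES.get(key, key)
            if canonical in totals:
                totals[canonical] += int(value)
    totals["total_tokens"] = totals["input_tokens"] + totals["output_tokens"]
    return totals
```

Unknown keys are ignored rather than rejected, so provider-specific extras do not break the sum.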
4. MCP TLS / mTLS / custom CA
MCPServerConfig.http() and .sse() now accept TLS knobs:
from kneo_agent import MCPServerConfig

config = MCPServerConfig.http(
    name="internal",
    url="https://mcp.internal.corp/v1",
    ca_bundle="/etc/ssl/corp/ca.pem",
    client_cert="/etc/ssl/corp/client.crt",
    client_key="/etc/ssl/corp/client.key",
    headers={"X-Internal-Auth": "..."},
)
MCPServerConfig.build_ssl_context() returns the corresponding
ssl.SSLContext; both transports thread it into urlopen(...,
context=...). No code change is needed if you were already on the
public CA path.
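As a rough picture of what such a context involves, the sketch below builds one with the standard library only. It is not the body of MCPServerConfig.build_ssl_context(); the standalone function and its validation rule are assumptions made for illustration.

```python
import ssl
from typing import Optional


def build_ssl_context(
    ca_bundle: Optional[str] = None,
    client_cert: Optional[str] = None,
    client_key: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a verifying client context with an optional custom CA and mTLS identity."""
    if client_key is not None and client_cert is None:
        raise ValueError("client_key requires client_cert")
    # With cafile=None the system trust store is used (the public CA path).
    context = ssl.create_default_context(cafile=ca_bundle)
    if client_cert is not None:
        # Presents a client certificate to the server (mutual TLS).
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context
```

The returned object can be passed as urllib.request.urlopen(url, context=context). Certificate verification and hostname checking stay enabled in every branch.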
5. SecretProvider Protocol
kneo_agent.utils now exports a SecretProvider Protocol plus
three reference implementations:
from kneo_agent.utils import (
    EnvSecretProvider,      # env vars with optional prefix
    FileSecretProvider,     # K8s/Docker secret-mount file pattern
    MappingSecretProvider,  # in-memory dict for tests
    SecretProvider,
    SecretNotFound,
)
The SDK does not auto-resolve secrets — it documents one well-known
shape so applications plumb credentials through their tool handlers
and MCP transports without embedding raw strings in prompts or tool
args. The Protocol is runtime_checkable, so isinstance(my_vault_provider,
SecretProvider) works against your own implementation.
Pair with RedactionMiddleware for defense in depth.
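The runtime_checkable behaviour can be tried without the SDK. In this sketch SecretSource, its get_secret method, and EnvSecrets are hypothetical stand-ins; the real SecretProvider Protocol may declare a different method name or signature.

```python
import os
from typing import Mapping, Protocol, runtime_checkable


@runtime_checkable
class SecretSource(Protocol):
    """Hypothetical stand-in for a secret-provider Protocol."""

    def get_secret(self, name: str) -> str: ...


class EnvSecrets:
    """Reads secrets from environment variables with an optional prefix.

    Note that it does not inherit from SecretSource: matching is structural.
    """

    def __init__(self, prefix: str = "", environ: Mapping[str, str] = os.environ) -> None:
        self._prefix = prefix
        self._environ = environ

    def get_secret(self, name: str) -> str:
        key = self._prefix + name
        try:
            return self._environ[key]
        except KeyError:
            raise LookupError(f"secret {key!r} is not set") from None
```

isinstance against a runtime_checkable Protocol only checks that the method exists, not its signature, so it is a cheap sanity check rather than full validation.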
6. Workflow retry-on-failure
kneo_agent.workflows.RetryStep wraps any WorkflowComponent (an
Agent, FunctionStep, or nested Workflow) with retry semantics.
Mirrors the middleware version; preserves the inner component's
name so existing graph edges still match.
from kneo_agent.workflows import RetryStep, WorkflowBuilder

# prepare and call_provider are existing WorkflowComponents defined elsewhere.
builder = WorkflowBuilder(RetryStep(prepare, max_attempts=3))
builder.add_executor(
    RetryStep(call_provider, max_attempts=5, retry_on=(ConnectionError,))
)
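The name-preserving wrapper idea can be sketched with a plain callable. RetryingStep below is a hypothetical illustration, not RetryStep: it assumes the wrapped step is a function taking one payload, which may not match the real WorkflowComponent interface.

```python
from typing import Any, Callable, Tuple, Type


class RetryingStep:
    """Hypothetical wrapper: retries a callable step and keeps its name."""

    def __init__(
        self,
        inner: Callable[[Any], Any],
        max_attempts: int = 3,
        retry_on: Tuple[Type[Exception], ...] = (Exception,),
    ) -> None:
        if max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        self.inner = inner
        # Reuse the inner name so anything keyed on it (such as graph edges) still matches.
        self.name = getattr(inner, "name", getattr(inner, "__name__", "step"))
        self.max_attempts = max_attempts
        self.retry_on = retry_on

    def __call__(self, payload: Any) -> Any:
        for attempt in range(1, self.max_attempts + 1):
            try:
                return self.inner(payload)
            except self.retry_on:
                # Re-raise once the attempt budget is spent.
                if attempt == self.max_attempts:
                    raise
```

Narrowing retry_on, as in the ConnectionError example above, keeps programming errors from being retried.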
WorkflowBuilder.add_edge(..., condition=...) and WorkflowEdge.condition
were already available in 1.1.x — see
examples/cookbook/workflow_branching_and_retry.py
for a full graph example. There is no separate
WorkflowGraphBuilder class — WorkflowBuilder is the graph
builder.
7. Sync-API failure-mode callout
The build_sync_agent / SyncAgent failure mode (raises
RuntimeError inside an already-running event loop — Jupyter,
FastAPI handler, other async code) is now documented in three
places: README, docs/README.md, and the API reference manual. No
code change.
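The failure mode is easy to reproduce with asyncio alone. The sketch below is not SyncAgent's implementation; run_sync and answer are hypothetical names showing how a synchronous wrapper can detect a running loop and fail with a clear message.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def answer() -> str:
    """Stands in for an async agent call."""
    return "ok"


def run_sync(make_coro: Callable[[], Awaitable[T]]) -> T:
    """Run a coroutine to completion, refusing to nest inside a running loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        loop_running = False  # no loop in this thread: safe to start one
    else:
        loop_running = True
    if loop_running:
        raise RuntimeError(
            "run_sync() called from a running event loop; await the async API instead"
        )
    return asyncio.run(make_coro())
```

From a script, run_sync(answer) returns normally. From a Jupyter cell or a FastAPI handler a loop is already running, so the call raises and the async API should be awaited directly.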
8. Python 3.13 classifier
Programming Language :: Python :: 3.13 is now in pyproject.toml
classifiers. CI was already exercising 3.13 in 1.1.x.
What didn't change
- The top-level kneo_agent.__all__ set (75 names) is unchanged. A new snapshot test (tests/unit/test_public_api_snapshot.py) pins this surface so future drift fails CI.
- The AgentMiddleware Protocol and its four hooks (wrap_run, wrap_stream, wrap_model_call, wrap_tool_call) are unchanged.
- The provider-extra version floors are unchanged.
What's still pending
The following items from the v1.2.0 roadmap landed in design / docs but did not land in the SDK code:
- Usage population for LangChain / Google ADK / Bridge runtimes.
Documented in the RunResult docstring as not-yet-populated; apps that need this should populate via custom middleware.
- Enterprise data-source tool recipes (SQL / object store / REST / search). The cookbook structure is in place; the recipes themselves are tracked for a follow-up release.
CHANGELOG and policy
Full diff: CHANGELOG.md. Public API stability
guarantees and the deprecation policy: api_stability.md.