Changelog

LM Deluge iterates quickly. This page calls out the highlights from the most recent releases, starting with v0.0.62. For a blow-by-blow history you can always inspect git log, but the notes below should help you catch up quickly.

  • output_schema now accepts raw JSON Schemas or Pydantic BaseModel subclasses. lm_deluge.util.schema.prepare_output_schema() handles the conversion to strict JSON Schema (adds additionalProperties: false, expands $defs, keeps optional fields nullable, etc.) and feeds both Anthropic and OpenAI builders, with coverage in tests/core/test_schema_transformations.py and tests/core/test_pydantic_structured_outputs.py.
  • Anthropic/OpenAI structured output requests now share the same normalization path so provider quirks stay isolated—unsupported Anthropic constraints move into descriptions while OpenAI keeps the tight grammar untouched. Regression suites for the chat and Responses APIs plus new real-run harnesses (tests/one_off/test_anthropic_structured_outputs_real.py, tests/one_off/test_openai_structured_outputs_real.py) make sure the wiring keeps working.
  • Shipped examples/pydantic_structured_outputs_example.py and refreshed the structured outputs docs so teams can drop a Pydantic model into LLMClient.process_prompts_*() without hand-rolling schemas or worrying about mutation; a short sketch follows this list.
  • Structured outputs landed across Anthropic and OpenAI: LLMClient(..., output_schema=...) now pushes the JSON Schema to Claude (complete with the structured-outputs-2025-11-13 beta and strict-tool gating) and to both OpenAI chat and Responses API requests, with schema precedence over json_mode everywhere.
  • Tightened tool serialization so strict schemas only turn on when providers actually support it (Bedrock always forces non-strict) and made MCP-backed OpenAI Responses runs share the same strict/non-strict behavior; covered by fresh suites in tests/core/test_openai_structured_outputs.py and tests/core/test_bedrock_requests.py.
  • process_prompts_sync() forwards output_schema, and the new regression test (tests/core/test_process_prompts_sync.py) ensures future changes keep the sync/async surfaces aligned.
  • Added one-off real API coverage for OpenAI structured outputs plus a battery of deterministic unit tests so regressions in schema handling or strict tooling are caught automatically.
  • Added the GPT-5.1 family (standard, Codex, Codex Mini) with pricing metadata and marked them as reasoning models so they Just Work with LLMClient.
  • Extended reasoning suffix parsing to accept -minimal and -none, enforced that Codex variants must run against the Responses API, and added guard rails that convert unsupported efforts to the closest valid value with clear warnings.
  • Updated the OpenAI request builders plus the warning system so GPT-5.1 downgrades from minimal to none transparently while older models downgrade to low, and added coverage for the new models (tests/models/test_gpt_5_1.py).
  • Background requests now honor request_timeout precisely: polling uses a monotonic clock, the client cancels the remote response before erroring, and it surfaces a structured timeout APIResponse instead of hanging jobs.
  • Cancellation failures are logged on a best-effort basis so you can trace leaked jobs during debugging.
  • Conversation.from_openai_chat() now filters out whitespace-only text blocks and skips empty messages so bad payloads from upstream providers no longer crash tool execution.
  • MockAsyncOpenAI does a real conversion from OpenAI tool definitions into lm-deluge Tool objects, wires them through LLMClient.start(), and carries the active CachePattern, so you can run copilot-style tools under tests without custom glue.
  • Added a focused test suite for the mock client (tests/test_mock_openai.py) that exercises the OpenAI-compatible surface area.
  • Packaging now re-exports AsyncOpenAI-style exception classes (APIError, APITimeoutError, BadRequestError, RateLimitError) so verifier harnesses can catch them directly from lm_deluge.
  • MockAsyncOpenAI gained full parity with the official AsyncOpenAI signature: you can pass api_key, organization, project, custom base URLs, and call the legacy .completions.create() path in addition to chat completions (a usage sketch follows this list).
  • Added an async close() no-op for compatibility, together with extensive tests to ensure verifier integrations behave as expected.
  • Introduced the optional lm-deluge[openai] extra and shipped the first cut of MockAsyncOpenAI, giving you an on-device OpenAI-compatible client backed by LLMClient.
  • Registered the first Moonshot/Kimi models (kimi-k2, kimi-k2-turbo, kimi-k2-thinking, kimi-k2-thinking-turbo) and the first MiniMax model (minimax-m2), so you can swap between those providers without custom API wrappers.
  • Added regression tests for the new models (tests/models/test_kimi_and_minimax.py) to make sure they stay callable.
  • Hardened OpenAIResponsesRequest.handle_response() so truncated/incomplete streaming payloads now produce actionable error messages (with the provider’s incomplete_details) instead of JSON parsing failures, and fixed a dangling await in the OpenAI client path.
  • Added dedicated coverage in tests/core/test_incomplete_response.py for both the incomplete and the successful response paths.
  • When you pass MCP server dictionaries (with a url key) through tools for Anthropic models, the client now automatically moves them into the mcp_servers array and sets the right beta header, so Anthropic’s MCP integration works without any manual request massaging (a sketch follows this list).
  • Tightened the strict-mode JSON Schema generator for tools: when strict=True, nested object schemas (including those inside $defs) have additionalProperties: false, defaults are stripped, and every property is marked required, matching OpenAI’s schema contract.
  • Backed the change with new tests in tests/core/test_tool_defs.py to ensure tools with and without $defs serialize correctly.
  • Added first-class $defs/definitions support to Tool plus the MCP loader so complex tool schemas with references survive serialization.
  • Tool.for_openai_completions() now automatically includes $defs, rejects schemas that can’t run in strict mode, and sets additionalProperties: false so OpenAI’s strict JSON schema validation passes out of the box.
  • SamplingParams and LLMClient accept reasoning_effort="minimal" (and "none") so you can target the more efficient reasoning tiers exposed by OpenAI without hand-editing objects.
  • Added regression coverage in tests/core/test_reasoning_effort_minimal.py.
  • Message.with_file() / add_file() now accept existing File objects, letting you build up prompts from pre-signed files without duplicating them.
  • Added Message.with_remote_file() to turn local bytes/paths into provider-hosted files asynchronously (with provider guard rails), making it easy to keep Anthropic/OpenAI file references in sync when constructing conversations; a sketch follows this list.
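
The structured-outputs and reasoning-effort entries above compose. Below is a minimal sketch, assuming LLMClient takes a model id as its first argument and accepts output_schema and reasoning_effort as keyword arguments, as the notes describe; the model id, prompt, and exact constructor shape are illustrative and may differ in your installed version.

```python
# Minimal sketch: structured outputs via a Pydantic model plus the new
# lower-effort reasoning tier. Constructor details are assumptions beyond
# what the notes above state.
from pydantic import BaseModel

from lm_deluge import LLMClient


class Person(BaseModel):
    name: str
    age: int | None = None  # optional fields stay nullable in the strict schema


client = LLMClient(
    "gpt-5.1",                   # illustrative model id (GPT-5.1 family, above)
    output_schema=Person,        # converted via prepare_output_schema()
    reasoning_effort="minimal",  # one of the newly accepted effort suffixes
)

# Per the notes above, output_schema takes precedence over json_mode on both
# the chat and Responses API paths.
results = client.process_prompts_sync(["Extract the person: Ada Lovelace, 36."])
```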
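
For the MockAsyncOpenAI entries, here is a sketch of the drop-in test pattern. Assumptions: MockAsyncOpenAI and the re-exported exception classes are importable from the package root, the mock accepts the signature-parity keywords named above, and configuration of its backing LLMClient is left out; treat the constructor call as illustrative.

```python
# Sketch: MockAsyncOpenAI as a drop-in for the official AsyncOpenAI client
# in a test harness. Import paths and constructor defaults are assumptions;
# only the parameters named in the notes above are shown.
import asyncio

from lm_deluge import MockAsyncOpenAI, RateLimitError


async def main() -> None:
    client = MockAsyncOpenAI(
        api_key="test-key",           # accepted for signature parity
        organization="test-org",
        base_url="http://localhost",  # custom base URLs are accepted too
    )
    try:
        response = await client.chat.completions.create(
            model="gpt-4.1",  # illustrative model id
            messages=[{"role": "user", "content": "ping"}],
        )
        print(response.choices[0].message.content)
    except RateLimitError:
        pass  # re-exported exceptions can be caught directly from lm_deluge
    finally:
        await client.close()  # compatibility no-op


asyncio.run(main())
```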
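
The MCP entry above means Anthropic-bound MCP servers need no special handling. A sketch under stated assumptions: the tools= keyword on process_prompts_sync() and the extra name field in the dict are assumptions; only the url key is named above.

```python
# Sketch: passing an MCP server dict straight through `tools` for a Claude
# model. Per the notes above, the client relocates it into mcp_servers and
# sets the required beta header automatically.
from lm_deluge import LLMClient

client = LLMClient("claude-sonnet-4")  # illustrative model id

results = client.process_prompts_sync(
    ["Look up the deployment docs and summarize them."],
    tools=[{"url": "https://example.com/mcp", "name": "docs"}],
)
```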
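
Finally, a sketch of the file helpers. Only with_file(), add_file(), and with_remote_file() are named above; Message.user(), the Conversation constructor, and awaiting with_remote_file() on a message instance are assumptions made for illustration.

```python
# Sketch: attaching files while building a conversation. Helper names come
# from the notes above; everything else here is an assumption.
import asyncio

from lm_deluge import Conversation, Message


async def build() -> Conversation:
    # Local path: attached directly.
    ask = Message.user("Summarize the attached report.").with_file("report.pdf")

    # An existing File object would be reused without duplication, e.g.:
    # follow_up = Message.user("And this one?").with_file(existing_file)

    # Local bytes/path uploaded to provider-hosted storage (async).
    hosted = await Message.user("Compare against this.").with_remote_file("big.pdf")

    return Conversation([ask, hosted])


conversation = asyncio.run(build())
```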

Looking for something older? Run git log --oneline or inspect the GitHub release feed—this page will continue to backfill as new releases ship.