Anonymous View
Skip to content

fix(budget): refund the reservation on streaming pre-commit provider errors#129

Open
njbrake wants to merge 2 commits into
mainfrom
fix/streaming-precommit-reservation-leak
Open

fix(budget): refund the reservation on streaming pre-commit provider errors#129
njbrake wants to merge 2 commits into
mainfrom
fix/streaming-precommit-reservation-leak

Conversation

@njbrake

@njbrake njbrake commented Jun 9, 2026

Copy link
Copy Markdown
Member

Note: this PR description was drafted by Claude via back-and-forth with @njbrake. The reasoning and decisions are his; the prose is Claude's.

Description

Fixes #128. In standalone mode, a streaming request to /v1/messages or /v1/responses whose upstream provider call fails before the first chunk ("pre-commit") logged the error but did not refund the budget pre-debit hold. /v1/chat/completions and all non-streaming paths already refund here. Because a budget-period reset zeroes spend but not reserved, the leaked hold was never released, so repeated failures accumulated reserved until the user hit a spurious "budget exceeded" 403 with low actual spend.

This adds the missing refund_reservation to the messages/responses streaming pre-commit except blocks, making all three endpoints consistent.

  • Standalone only (platform mode has no local reservation).
  • No wire change: the HTTP status and body are unchanged; only users.reserved is released instead of leaked.
  • The stale "Matches the non-streaming handler" comments are corrected (the non-streaming handler does refund).

Relationship to #127

This is the standalone bug fix, split out of the strategy-seam refactor (#127) so it can land independently. #127 carries the same fix in its refactored form; once this merges, #127 will be rebased and reconciled.

PR Type

  • New Feature
  • Bug Fix
  • Refactor
  • Documentation
  • Infrastructure / CI

Relevant issues

Fixes #128

Checklist

  • I understand the code I am submitting.
  • I have added or updated tests that cover my change (tests/unit, tests/integration).
  • I ran the Definition of Done checks locally (make lint, make typecheck, make test).
  • Documentation was updated where necessary.
  • If the API contract changed, I regenerated the OpenAPI spec (uv run python scripts/generate_openapi.py).

Test plan

  • New tests/integration/test_streaming_precommit_refund.py: asserts users.reserved returns to 0 after a pre-commit streaming failure on /v1/chat/completions, /v1/messages, and /v1/responses (with a budgeted user + priced model so the hold is genuinely non-zero).
  • Verified the messages/responses tests fail without the fix (the leaked hold leaves reserved > 0) and pass with it.
  • make lint (ruff) and make typecheck (mypy strict) pass.

AI Usage

  • No AI was used.
  • AI was used for drafting/refactoring.
  • This is fully AI-generated.

AI Model/Tool used: Claude Code

Any additional AI details you'd like to share:

Claude drafted the fix and tests via back-and-forth with @njbrake; the reasoning and decisions are his.

NOTE:
When responding to reviewer questions, please respond yourself rather than copy/pasting reviewer comments into an AI and pasting back its answer. We want to discuss with you, not your AI :)

  • I am an AI Agent filling out this form (check box if true)

@njbrake njbrake temporarily deployed to integration-tests June 9, 2026 16:58 — with GitHub Actions Inactive
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown

Review Change Stack

Walkthrough

This PR fixes a budget reservation leak in streaming endpoints when upstream provider errors occur before streaming begins. It adds missing refund calls to /v1/messages and /v1/responses exception handlers and validates the fix with comprehensive integration tests covering all three streaming endpoints.

Changes

Streaming pre-commit reservation refund fixes

Layer / File(s) Summary
Messages streaming pre-commit refund fix
src/gateway/api/routes/messages.py
Comment clarifies behavior on pre-stream provider errors, and a conditional refund_reservation(...) call is added to the exception handler to release the pre-debit hold when reservation is present, matching non-streaming behavior.
Responses streaming pre-commit refund fix
src/gateway/api/routes/responses.py
Comment updated to document that pre-debit holds are now refunded on pre-stream provider errors to prevent leaks and maintain parity with other handlers. An async refund_reservation(...) call is added conditionally in the exception path.
Integration tests for streaming pre-commit refund
tests/integration/test_streaming_precommit_refund.py
New test module with helper functions to seed budgeted users, configure pricing, query user budget state, and poll usage logs. Three regression tests verify that /v1/chat/completions, /v1/messages, and /v1/responses all properly refund the pre-debit reservation and log an error when the upstream provider fails before the first chunk.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title follows the Conventional Commit format with 'fix:' prefix and uses imperative mood; it clearly describes the main change (refunding budget reservation on streaming pre-commit errors).
Linked Issues check ✅ Passed All coding requirements from #128 are met: refund_reservation calls added to pre-commit except blocks in messages/responses streaming handlers, inconsistencies corrected, and comprehensive regression tests verify reserved budget returns to 0.
Out of Scope Changes check ✅ Passed All changes directly address the budget reservation leak bug: two handler files add refunds, and the test file provides regression coverage; no unrelated modifications detected.
Description check ✅ Passed The PR description comprehensively covers the bug, root cause, fix, test coverage, and context, with all checklist items completed and AI usage properly disclosed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/streaming-precommit-reservation-leak
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch fix/streaming-precommit-reservation-leak

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

…errors

A streaming /v1/messages or /v1/responses request whose provider call fails
before the first chunk ("pre-commit") logged the error but did not refund the
budget pre-debit hold, while /v1/chat/completions and all non-streaming paths
did. Because a budget-period reset zeroes spend but not reserved, the leaked
hold was never released: repeated pre-commit streaming failures accumulated
reserved until the user hit a spurious "budget exceeded" 403 with low actual
spend.

Add the missing refund_reservation to the messages/responses streaming
pre-commit except blocks. No wire change: the HTTP status and body are
unchanged; only users.reserved is released instead of leaked. Adds a regression
test asserting reserved returns to 0 across all three endpoints.

Fixes #128

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tbille tbille force-pushed the fix/streaming-precommit-reservation-leak branch from 2d5c879 to 28836c0 Compare June 10, 2026 08:02
@tbille tbille temporarily deployed to integration-tests June 10, 2026 08:02 — with GitHub Actions Inactive

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a standalone-mode budget accounting bug where streaming requests to /v1/messages and /v1/responses that fail before the first streamed chunk (“pre-commit”) would log the error but not refund the budget reservation hold, causing users.reserved to leak upward over time and eventually trigger incorrect “budget exceeded” rejections.

Changes:

  • Refund the budget reservation in the streaming pre-commit except blocks for /v1/messages and /v1/responses, aligning behavior with chat streaming and non-streaming handlers.
  • Update the in-code comments to accurately describe the intended “log + refund” behavior.
  • Add integration regression tests asserting users.reserved returns to 0 after pre-commit streaming failures across /v1/chat/completions, /v1/messages, and /v1/responses.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
src/gateway/api/routes/messages.py Adds missing refund_reservation call on streaming pre-commit provider errors (standalone DB path).
src/gateway/api/routes/responses.py Adds missing refund_reservation call on streaming pre-commit provider errors (standalone DB path).
tests/integration/test_streaming_precommit_refund.py New integration coverage to prevent regressions of leaked users.reserved holds on pre-commit streaming failures.

@github-actions github-actions Bot added the missing-template PR is missing required template sections label Jun 10, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/integration/test_streaming_precommit_refund.py (1)

48-77: 💤 Low value

Consider reusing the engine across helper calls for efficiency.

Each call to _user_state and _poll_usage_logs creates and disposes a new SQLAlchemy engine. For integration tests this works fine, but if you wanted to tighten it up, you could pass the engine as a parameter or use a module-scoped fixture. That said, this is test code and correctness > micro-optimization here, so totally fine to leave as-is.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/test_streaming_precommit_refund.py` around lines 48 - 77,
Both _user_state and _poll_usage_logs create and dispose a new SQLAlchemy engine
per call which is inefficient; refactor to accept a shared engine or session
factory (e.g., pass an engine or a sessionmaker into _user_state and
_poll_usage_logs) or use a module-scoped fixture so create_engine is only called
once, update calls to these helpers to supply the shared engine/sessionmaker,
and remove per-call engine.dispose() so the shared engine is not prematurely
torn down.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/integration/test_streaming_precommit_refund.py`:
- Around line 48-77: Both _user_state and _poll_usage_logs create and dispose a
new SQLAlchemy engine per call which is inefficient; refactor to accept a shared
engine or session factory (e.g., pass an engine or a sessionmaker into
_user_state and _poll_usage_logs) or use a module-scoped fixture so
create_engine is only called once, update calls to these helpers to supply the
shared engine/sessionmaker, and remove per-call engine.dispose() so the shared
engine is not prematurely torn down.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8a00decf-4546-4916-85cc-582af30f2996

📥 Commits

Reviewing files that changed from the base of the PR and between 96e8dc2 and 28836c0.

📒 Files selected for processing (3)
  • src/gateway/api/routes/messages.py
  • src/gateway/api/routes/responses.py
  • tests/integration/test_streaming_precommit_refund.py

@github-actions github-actions Bot removed the missing-template PR is missing required template sections label Jun 10, 2026
# Conflicts:
#	src/gateway/api/routes/messages.py
#	src/gateway/api/routes/responses.py
@njbrake njbrake temporarily deployed to integration-tests June 10, 2026 10:54 — with GitHub Actions Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Budget reservation leaks on streaming pre-commit provider errors (/v1/messages, /v1/responses, standalone)

3 participants