Add ask_vlm method for cloud VLM alert verification by srnangi · Pull Request #442 · groundlight/python-sdk

srnangi · 2026-06-18T00:11:05Z

What

Adds Groundlight.ask_vlm(media, query, model_id) for VLM-based alert verification. Calls POST /v1/vlm-verifications on the Groundlight cloud (AWS Bedrock) and returns a VLMVerificationResult with verdict (YES/NO/UNSURE), confidence, reasoning, and token cost fields.

Pairs with the janzu PR (zuuul#6519). No local inference — VLM runs entirely in the cloud.
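
As a rough illustration of the result object described above, here is a minimal sketch. The attribute names verdict, confidence, reasoning, and total_cost_usd come from the usage snippet in this description; the class name `VLMVerificationResultSketch`, the `from_json` helper, the field types, and the assumption that the response JSON uses the same key names are all hypothetical and may not match the dataclass in client.py.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class VLMVerificationResultSketch:
    """Illustrative shape only; the real VLMVerificationResult may differ."""

    verdict: str  # "YES", "NO", or "UNSURE"
    confidence: float
    reasoning: str
    total_cost_usd: float

    @classmethod
    def from_json(cls, payload: Dict[str, Any]) -> "VLMVerificationResultSketch":
        # Assumption: the server's JSON keys match the attribute names.
        return cls(
            verdict=str(payload["verdict"]),
            confidence=float(payload["confidence"]),
            reasoning=str(payload["reasoning"]),
            total_cost_usd=float(payload["total_cost_usd"]),
        )
```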

How

media accepts 1–8 images (single image or list). Accepts numpy BGR arrays, PIL Images, bytes, BytesIO/BufferedReader, or filename strings — encoded via the existing parse_supported_image_types utility.
POSTs multipart/form-data: media parts as image/jpeg files; query and model_id as form fields (not URL params, so the prompt never leaks into access logs).
model_id is a friendly alias (e.g. "claude-sonnet-4.5", "nova-pro") that the server maps to the real Bedrock model ID. If omitted, the server-configured default model is used.
Exports VLMVerificationResult from the package root.
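
The transport described above could be sketched roughly as follows. This is not the SDK's code: `build_vlm_request`, `MAX_MEDIA`, and the `media_{i}.jpg` filenames are hypothetical, and the inputs are assumed to be already JPEG-encoded bytes (in the SDK that step is handled by parse_supported_image_types). The returned values follow the shapes that `requests.post(url, files=..., data=...)` accepts.

```python
from typing import Dict, List, Optional, Tuple, Union

MAX_MEDIA = 8  # the PR caps media at 8 items

# Multipart part shape accepted by requests: (field, (filename, content, content_type))
FilePart = Tuple[str, Tuple[str, bytes, str]]


def build_vlm_request(
    media: Union[bytes, List[bytes]],
    query: str,
    model_id: Optional[str] = None,
) -> Tuple[List[FilePart], Dict[str, str]]:
    """Hypothetical helper: build multipart `files` and form `data` for the POST."""
    items = [media] if isinstance(media, bytes) else list(media)
    if not 1 <= len(items) <= MAX_MEDIA:
        raise ValueError(f"media must contain 1 to {MAX_MEDIA} items, got {len(items)}")

    # Every image is sent under the same repeated field name, "media".
    files = [
        ("media", (f"media_{i}.jpg", jpeg, "image/jpeg"))
        for i, jpeg in enumerate(items)
    ]

    # query and model_id travel in the request body as form fields, never in the URL.
    data = {"query": query}
    if model_id is not None:
        data["model_id"] = model_id  # left out entirely when unset
    return files, data
```

Keeping the prompt in `data` rather than `params` is what keeps it out of the request URL, and therefore out of access logs.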

Usage

from groundlight import Groundlight

gl = Groundlight()

# Single image
result = gl.ask_vlm(frame, query="Is there a fire in this image?")
if result.verdict == "YES":
    emit_alert()

# Full frame + cropped ROI — describe each in the query
result = gl.ask_vlm(
    media=[full_frame, roi_crop],
    query="Image 1 is the full camera frame; image 2 is the cropped region "
          "a detector flagged. Is there really a fire?",
)
print(result.confidence, result.reasoning, result.total_cost_usd)

Changes

src/groundlight/client.py: ask_vlm method + VLMVerificationResult dataclass
src/groundlight/__init__.py: exports VLMVerificationResult
test/unit/test_ask_vlm.py: 8 unit tests with mocked HTTP

Bug fix included

sanitize_endpoint_url strips the trailing slash from self.endpoint, so the original code, which appended v1/vlm-verifications without a leading slash, produced .../device-apiv1/vlm-verifications. The path is now appended as /v1/vlm-verifications, giving .../device-api/v1/vlm-verifications. Regression test added.
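
The failure mode is easy to reproduce in isolation. In this sketch, `strip_trailing_slash` is a hypothetical stand-in for the trailing-slash behavior attributed to sanitize_endpoint_url above (the real function may do more), and the host name is a placeholder.

```python
def strip_trailing_slash(url: str) -> str:
    """Hypothetical stand-in: only mimics the trailing-slash removal."""
    return url.rstrip("/")


endpoint = strip_trailing_slash("https://api.example.com/device-api/")

# Without a leading slash the path fuses with the last URL segment.
buggy_url = endpoint + "v1/vlm-verifications"
# With the leading slash the segments stay separate.
fixed_url = endpoint + "/v1/vlm-verifications"

print(buggy_url)  # https://api.example.com/device-apiv1/vlm-verifications
print(fixed_url)  # https://api.example.com/device-api/v1/vlm-verifications
```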

Testing

8 unit tests (mocked HTTP, no live server):

Single numpy image encoded as JPEG multipart
Dual-image sends two media parts
query/model_id sent as form fields, not URL params
No model_id omits the field entirely
More than 8 media items raises ValueError
URL has correct path (/device-api/v1/vlm-verifications)
Bytes image accepted
Result fields parsed correctly
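
For readers who want a feel for the mocked-HTTP style, here is a self-contained sketch of a test in the spirit of the list above. It does not reproduce test/unit/test_ask_vlm.py: `TinyClient` is a hypothetical stand-in for the SDK client, the injected `session` object and the placeholder host are assumptions, and the real tests may patch HTTP differently.

```python
from unittest import mock


class TinyClient:
    """Hypothetical stand-in for the SDK client, just enough to test against."""

    def __init__(self, endpoint, session):
        self.endpoint = endpoint.rstrip("/")
        self.session = session  # any object with a requests-style .post()

    def ask_vlm(self, media, query, model_id=None):
        files = [("media", ("image.jpg", m, "image/jpeg")) for m in media]
        data = {"query": query}
        if model_id is not None:
            data["model_id"] = model_id
        url = f"{self.endpoint}/v1/vlm-verifications"
        return self.session.post(url, files=files, data=data).json()


def test_query_sent_as_form_field():
    # The mock replaces the HTTP layer, so no live server is contacted.
    session = mock.MagicMock()
    session.post.return_value.json.return_value = {"verdict": "YES"}
    client = TinyClient("https://api.example.com/device-api/", session)

    result = client.ask_vlm([b"jpeg-bytes"], query="Is there a fire?")

    args, kwargs = session.post.call_args
    assert args[0].endswith("/device-api/v1/vlm-verifications")
    assert "?" not in args[0]  # nothing in the query string
    assert "params" not in kwargs
    assert kwargs["data"] == {"query": "Is there a fire?"}  # no model_id key
    assert len(kwargs["files"]) == 1
    assert result["verdict"] == "YES"


test_query_sent_as_form_field()
```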

🤖 Generated with Claude Code

Add Groundlight.ask_vlm(images, query, model_id) which verifies one or two images against a natural-language query by calling POST /v1/vlm-queries. Returns a VLMVerificationResult dataclass with verdict (YES/NO/UNSURE), confidence, reasoning, and token cost.
- Accepts a single image or [full_frame, roi] for the dual-image strategy, reusing parse_supported_image_types for encoding.
- Moves the requests import to module level.
- Exports VLMVerificationResult from the package.
- Unit tests with mocked HTTP.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

- POST query and model_id as multipart form fields (data=) instead of query-string params, matching the updated endpoint and keeping long prompts out of URLs and access logs.
- model_id is now a friendly alias (e.g. "gpt-5.4", "claude-sonnet-4.5") resolved server-side, not a raw Bedrock model ID.
- Tests updated to assert form-field transport.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Drop the gpt-5.4 example (OpenAI models on Bedrock are text-only and cannot do image verification); use claude-sonnet-4.5 / nova-pro instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Match the generalized endpoint: param images -> media, multipart field 'media', guard raised from 2 to 8. The query should describe each media item (server makes no frame/ROI assumption). Docstring + tests updated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Endpoint renamed server-side from vlm-queries to vlm-verifications. Update the SDK POST path and test fixtures accordingly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

sanitize_endpoint_url() strips the trailing slash from self.endpoint, so joining without "/" produced ".../device-apiv1/vlm-verifications" instead of ".../device-api/v1/vlm-verifications". Added test_url_has_correct_path to pin the correct URL shape. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

buildci and others added 7 commits June 24, 2026 00:52

Automatically reformatting code

9a5e3e1

Update ask_vlm model_id docstring examples to vision-capable aliases

2b20fce

ask_vlm: point at renamed /v1/vlm-verifications endpoint

00789e0

srnangi force-pushed the feature/vlm-verification-sdk branch from b2b0755 to 263808d on June 24, 2026 08:41

srnangi wants to merge 7 commits into main from feature/vlm-verification-sdk
