test(harness): shared test fakes + conformance determinism fix #427
Merged
Contributor
Author
|
@greptile review |
danielmillerp
approved these changes
Jun 23, 2026
…rminism fix Extract tests/lib/core/harness/_fakes.py (FakeSpan/FakeTracing), removing ~9 duplicated copies, and harden the conformance determinism test: it now iterates all_fixtures() at runtime (paired with a conftest that eagerly imports every per-harness conformance module) instead of import-time parametrization, which had silently dropped per-harness coverage (Greptile P1). Also removes the now-landed pr4-pydantic-ai planning doc. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
5953df6 to
a84f83b
Address review: the codex and pydantic-ai conformance modules called asyncio.run() at import time, which raises RuntimeError when collected under an already-running event loop (programmatic pytest, notebooks) — so even a focused run could fail during collection. Hoist the loop-free driver from the claude-code module into runner.run_pure_async and use it everywhere fixtures are built at import time. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
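The loop-free driver mentioned in this commit could look roughly like the sketch below. Only the name run_pure_async comes from the PR; the body is an assumption about how such a driver might work, not the code that landed in runner.py. The idea is that a coroutine which only awaits other pure in-memory coroutines never actually suspends, so a single send(None) runs it to completion and no event loop is required.

```python
import asyncio
from typing import Any, Coroutine, TypeVar

T = TypeVar("T")


def run_pure_async(coro: Coroutine[Any, Any, T]) -> T:
    """Drive a coroutine that never truly suspends, without an event loop.

    Hypothetical sketch: the real helper in runner.py may differ.
    """
    try:
        coro.send(None)
    except StopIteration as stop:
        # The coroutine ran to completion; its return value rides on the exception.
        return stop.value
    # The coroutine yielded, so it awaited something that needs a real loop.
    coro.close()
    raise RuntimeError("coroutine suspended; it needs a real event loop")


async def _load_events() -> list:
    # Illustrative pure coroutine: no I/O, no sleeps.
    return ["start", "end"]


async def build_fixture() -> dict:
    return {"events": await _load_events()}


async def collect_under_running_loop() -> dict:
    # asyncio.run() would raise RuntimeError here because a loop is already
    # running; the loop-free driver works regardless.
    return run_pure_async(build_fixture())


fixture_without_loop = run_pure_async(build_fixture())
fixture_under_loop = asyncio.run(collect_under_running_loop())
```

A driver of this shape fails loudly if the coroutine ever awaits real I/O or a sleep, which seems appropriate for fixtures that are meant to be built purely in memory.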


Summary
First slice of #425 (post-merge harness cleanup), scoped to non-breaking test infrastructure so it can land independently.
- Extract tests/lib/core/harness/_fakes.py (FakeSpan/FakeTracing), removing ~9 duplicated in-file copies.
- test_span_derivation_is_deterministic now iterates all_fixtures() at runtime, paired with a new conformance conftest.py that eagerly imports every per-harness conformance module so the fixture set is fully populated regardless of collection order. Previously, import-time parametrization silently dropped per-harness coverage.
- Remove the now-landed pr4-pydantic-ai planning doc.
Test plan
pytest tests/lib/core/harness/: 122 passed, 1 skipped
Notes
Part of a stacked split of #425. No source changes; safe to merge first.
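The coverage gap described in the summary can be reproduced in miniature. The sketch below is illustrative only: the registry, register_fixture, and the lambdas are hypothetical stand-ins, and only the names all_fixtures, codex, and pydantic_ai echo the PR. It shows why a list captured at import time misses harnesses that register later, while a call made inside the test body sees all of them.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the conformance fixture registry.
_REGISTRY: Dict[str, Callable[[], List[str]]] = {}


def register_fixture(name: str, derive: Callable[[], List[str]]) -> None:
    _REGISTRY[name] = derive


def all_fixtures() -> List[str]:
    return sorted(_REGISTRY)


# A base fixture is registered before the test module is imported.
register_fixture("base", lambda: ["root"])

# Import-time parametrization effectively freezes the list at this point.
IMPORT_TIME_SNAPSHOT = all_fixtures()

# Per-harness modules that happen to be imported later register afterwards.
register_fixture("codex", lambda: ["root", "tool_call"])
register_fixture("pydantic_ai", lambda: ["root", "model_request"])


def check_span_derivation_is_deterministic() -> List[str]:
    # Runtime enumeration: called in the test body, after every module is imported.
    names = all_fixtures()
    # Guard: fail loudly if only the base fixture is present.
    assert any(name != "base" for name in names), "no per-harness fixtures registered"
    for name in names:
        derive = _REGISTRY[name]
        assert derive() == derive(), f"{name}: derivation is not deterministic"
    return names


checked = check_span_derivation_is_deterministic()
```

Here IMPORT_TIME_SNAPSHOT contains only the base fixture, whereas the runtime check covers all three, which is the difference the guard assertion is meant to protect.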
🤖 Generated with Claude Code
Greptile Summary
This PR is a pure test-infrastructure cleanup with no source code changes. It extracts shared FakeSpan/FakeTracing test doubles into tests/lib/core/harness/_fakes.py, eliminating roughly nine in-file duplicate definitions. It also fixes a conformance determinism gap: test_span_derivation_is_deterministic previously parametrized on all_fixtures() at import time (missing any harness whose module hadn't been collected yet) and now calls all_fixtures() inside the test body, backed by a new conftest.py that eagerly imports every per-harness conformance module so registration is always complete.
- FakeSpan/FakeTracing extracted to _fakes.py; all local _FakeTracing/_RecordingTracing/_FakeSpan variants replaced with the shared versions throughout test_tracer.py, test_emitter.py, test_auto_send.py, test_yield_delivery.py, and runner.py.
- test_conformance.py::test_span_derivation_is_deterministic is now a single no-parametrize test that calls all_fixtures() at run time, with a guard asserting that per-harness fixtures were registered; conftest.py ensures all conformance modules are imported before any test executes.
- asyncio.run() at import removed: test_pydantic_ai_conformance.py and test_codex_conformance.py now use the shared run_pure_async loop-free driver; the driver itself is promoted from a private copy in test_claude_code_conformance.py to a public export in runner.py.
Confidence Score: 5/5
Safe to merge — no source code changes, only test infrastructure refactoring with a determinism bug fix.
All changes are confined to test files. The shared FakeTracing is a behavioral superset of every local copy it replaces (same started/ended tuple format, same span-object shape). The run_pure_async driver is sound for pure in-memory coroutines and was already proven correct by the existing claude_code_conformance suite. The conftest eager-import strategy is safe because Python's module cache prevents double-registration. No edge cases that could silently hide failures were found.
No files require special attention.
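The claim that Python's module cache prevents double registration can be checked with a small experiment. This is a sketch under stated assumptions: demo_registry, demo_codex_conformance, and eager_import are hypothetical names invented for illustration, not files or helpers from the PR. It writes a throwaway module whose body registers itself, imports it twice, and confirms the body ran only once.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Hypothetical shared registry, exposed as a module so the demo module can import it.
registry = types.ModuleType("demo_registry")
registry.names = []
sys.modules["demo_registry"] = registry

# Hypothetical per-harness module: registers itself as an import side effect.
MODULE_SOURCE = "import demo_registry\ndemo_registry.names.append('codex')\n"


def eager_import(module_names):
    # Roughly what an eager-importing conftest might do; returns the module objects.
    return [importlib.import_module(name) for name in module_names]


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_codex_conformance.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        (first,) = eager_import(["demo_codex_conformance"])
        # The second import is served from sys.modules, so the body does not rerun.
        (second,) = eager_import(["demo_codex_conformance"])
    finally:
        sys.path.remove(tmp)

registered_names = list(registry.names)
same_module = first is second
```

Because the second import returns the cached module object, a conftest that imports every conformance module up front should not conflict with pytest importing the same modules again during collection, provided both refer to them by the same module name.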
Important Files Changed
Reviews (5): Last reviewed commit: "test(harness): build conformance fixture..."