fix(pm): identity-gate descriptor resolution (compat.zlib CI flake) #136

Merged: Sunrisepeak merged 7 commits into main from fix/pkg-resolution-identity-first on Jun 19, 2026

@Sunrisepeak

Problem

compat.zlib intermittently failed on fresh CI with "index entry has no mcpp field" while passing locally, despite the same binary and the same descriptor bytes.

Root cause is not parsing. read_xpkg_lua locates a package descriptor by generating candidate filenames (xpkg_lua_candidates) and returning the first filesystem hit across an unordered directory_iterator scan of every index dir, with no check that the file it found is actually the requested package.

A bare zlib.lua exists in xim-pkgindex (declares name="zlib", no namespace, no mcpp block) as an unrelated upstream package. Its filename matches the zlib.lua compat fallback candidate for a compat.zlib request. Whenever the filesystem visits xim-pkgindex before mcpplibs, that blockless file is returned → extract_mcpp_field finds nothing → "index entry has no mcpp field". Directory-iteration order is unspecified and differs between a warm local checkout and a fresh CI runner — hence local-pass / CI-fail.

Only zlib is hit: it's the only one of the four compression libs whose bare name collides with a package in a directly-scanned index dir.
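The order dependence can be reproduced in a few lines. This is a minimal sketch, not mcpp's code: `Descriptor`, `IndexDir`, and `first_filename_hit` are hypothetical stand-ins that model each index dir as an in-memory map and mimic the pre-fix lookup, which returns the first candidate filename that exists in whatever order the dirs are visited.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of a descriptor: only the fields the incident depends on.
struct Descriptor {
    std::string declared_ns;    // empty when the file declares no namespace
    std::string declared_name;
    bool has_mcpp_block;
};

// Hypothetical model of one index dir: filename -> parsed descriptor.
struct IndexDir {
    std::string name;
    std::map<std::string, Descriptor> files;
};

// Pre-fix behaviour (simplified): the first existing candidate filename wins,
// and nothing checks which package the file actually declares.
std::optional<Descriptor> first_filename_hit(
    const std::vector<IndexDir>& visit_order,
    const std::vector<std::string>& candidates) {
    assert(!candidates.empty());
    for (const IndexDir& dir : visit_order) {
        for (const std::string& candidate : candidates) {
            const auto it = dir.files.find(candidate);
            if (it != dir.files.end()) return it->second;
        }
    }
    return std::nullopt;
}
```

With candidates `compat.zlib.lua, zlib.lua`, visiting mcpplibs first yields the compat descriptor, while visiting xim-pkgindex first yields the blockless bare `zlib.lua`. The result depends only on visit order.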

Essence

A non-unique filename was used as the identity key. The fix makes identity what it should be: the descriptor's declared package.{namespace,name}. The filename is only a location hint; the declared identity is the proof.

Change

  • mcpp::manifest::xpkg_lua_identity_matches() — shared identity gate comparing a descriptor's declared (ns,name) against the requested coordinate. Empty-namespace requests stay discovery-lenient (used by mcpp new / scaffold, which derive the namespace from the file).
  • All three read_xpkg_lua* readers route through the gate: a candidate-filename hit is accepted only when the file actually declares the requested package; otherwise scanning continues. Correct independent of directory order.
  • Deterministic, sorted index-dir iteration.
  • De-dup: prepare.cppm's xpkgLuaMatchesCandidate now delegates to the shared gate (single source of truth).

compat.zlib.lua (declares (compat, compat.zlib)) matches; foreign bare zlib.lua (declares (_, zlib)) is rejected for a compat.zlib request.
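The gated, deterministic scan can be sketched as follows. This is an illustration under assumptions, not the actual `read_xpkg_lua*` code: `Descriptor`, `IndexDir`, `declares`, and `resolve_gated` are hypothetical names, the index dirs are modelled in memory, and the gate is reduced to exact-namespace matching (the leniency rules are left out).

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins, not mcpp's real types.
struct Descriptor {
    std::string declared_ns;    // empty when the file declares no namespace
    std::string declared_name;
};

struct IndexDir {
    std::string name;
    std::map<std::string, Descriptor> files;  // filename -> parsed descriptor
};

// Simplified gate: exact namespace, name spelled "<short>" or "<ns>.<short>".
bool declares(const Descriptor& d, const std::string& ns,
              const std::string& short_name) {
    if (d.declared_ns != ns) return false;
    return d.declared_name == short_name ||
           d.declared_name == ns + "." + short_name;
}

// Visit dirs in sorted order; a filename hit counts only if the gate passes.
std::optional<Descriptor> resolve_gated(
    std::vector<IndexDir> dirs,
    const std::vector<std::string>& candidates,
    const std::string& ns, const std::string& short_name) {
    assert(!short_name.empty());
    std::sort(dirs.begin(), dirs.end(),
              [](const IndexDir& a, const IndexDir& b) { return a.name < b.name; });
    for (const IndexDir& dir : dirs) {
        for (const std::string& candidate : candidates) {
            const auto it = dir.files.find(candidate);
            if (it == dir.files.end()) continue;  // no such file in this dir
            if (declares(it->second, ns, short_name)) return it->second;
            // A colliding filename that declares another package: keep scanning.
        }
    }
    return std::nullopt;
}
```

Sorting makes the scan reproducible, but the gate is what makes it correct: even if xim-pkgindex sorted first, its bare `zlib.lua` would be skipped for a `compat.zlib` request.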

Tests

  • test_manifest.cpp — identity-gate truth table (compat match, foreign-bare rejection, declared-namespace exclusivity, lenient no-name, default-ns legacy-bare flag, empty-ns discovery).
  • test_pm_package_fetcher.cpp — cross-index collision regression: stages compat.zlib.lua + a foreign bare zlib.lua and asserts read_xpkg_lua_from_project_data returns the compat descriptor regardless of dir order; and that a foreign bare zlib alone does not satisfy a compat request.

Local verification: mcpp test 21/21 binaries green (8 new tests). Targeted e2e green: dep-resolution (27/31/62/63), scaffold (02), custom/local index (42/52), path dep (09), preinstall (58).

Scope / follow-ups

This PR delivers the load-bearing subset — the identity gate at descriptor-read sites + determinism — that closes the incident. Deliberately deferred (tracked in the design doc): payload-locator gating, index-owned-namespace totalization ((xim, zlib) for no-namespace descriptors), the identity-indexed slow path, and the unified PackageLocator choke point.

Full analysis & design: .agents/docs/2026-06-20-package-resolution-architecture.md.

…e filename hits

read_xpkg_lua located a package descriptor by generating candidate filenames
and returning the first filesystem hit across an unordered scan of every index
dir, with no check that the file it found was the requested package. A bare
`zlib.lua` from xim-pkgindex (declares name="zlib", no namespace, no mcpp block)
could therefore satisfy a request for `compat.zlib` whenever directory iteration
visited xim-pkgindex before mcpplibs — which is filesystem-order-dependent, so
the build passed locally and failed on fresh CI with "index entry has no mcpp
field".

Root cause: a non-unique filename was used as the identity key. Fix: the
descriptor's declared package.{namespace,name} is the identity; the filename is
only a location hint.

- Add mcpp::manifest::xpkg_lua_identity_matches(): the shared identity gate
  comparing a descriptor's declared (ns,name) against the requested coordinate.
  Empty-ns requests stay discovery-lenient (scaffold / `mcpp new`).
- Route all three read_xpkg_lua* readers through the gate: a candidate filename
  hit is accepted only when the file actually declares the requested package;
  otherwise scanning continues. Independent of directory order.
- Scan index dirs in sorted (deterministic) order.
- De-duplicate: prepare.cppm's xpkgLuaMatchesCandidate now delegates to the
  shared gate (single source of truth).

Tests: identity-gate truth table (test_manifest) + cross-index collision
regression (test_pm_package_fetcher). Design + deferred follow-ups (payload
locators, index-owned namespace, PackageLocator choke point) documented in
.agents/docs/2026-06-20-package-resolution-architecture.md.

The first cut of the identity gate rejected `compat.*` descriptors for
bare/default-namespace requests, breaking the `gtest` dev-dependency on CI
(`dependency 'gtest': index entry not found in local clone`). A bare/default-ns
dependency name is a legitimate alias for a `compat.<name>` package — the
candidate generator deliberately offers `compat.<short>.lua` for default-ns
requests (compat.cppm) — so e.g. `gtest` resolves to `compat.gtest`.

Accept a `compat`-namespaced descriptor (name `<short>` or `compat.<short>`) for
a default-namespace request. Non-default-namespace matching (the compat.zlib vs
foreign bare zlib fix) is unchanged.

Tests: XpkgIdentity.DefaultNamespaceRequestMatchesCompatAlias and
PmPackageFetcher.DefaultNamespaceRequestResolvesCompatAliasDescriptor.

@Sunrisepeak

Identity-match matrix — filename × declared namespace × declared name vs the request

Key idea: the filename only decides which files are probed (the candidate list); it never decides identity. The match is purely declared (package.namespace, package.name) vs the request coordinate. The request coordinate comes from the TOML declaration.

How a TOML declaration becomes a request coordinate

| TOML in mcpp.toml | request ns | request shortName | qname |
|---|---|---|---|
| zlib = "x" (bare, [dependencies]/[dev-dependencies]) | mcpplibs (default) | zlib | mcpplibs.zlib |
| [dependencies.compat] zlib = "x" | compat | zlib | compat.zlib |
| "mcpplibs.cmdline" = "x" (legacy dotted key) | mcpplibs | cmdline | mcpplibs.cmdline |
| [dependencies.xim] zlib = "x" (non-default / custom) | xim | zlib | xim.zlib |
| mcpp new --template zlib (scaffold) | "" (discovery) | zlib | |
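The TOML rows above can be turned into request coordinates with a small helper. This is a sketch only: `RequestCoordinate`, `to_request`, `qname`, and the `kDefaultNamespace` constant are hypothetical names, and the default namespace value is taken from the "mcpplibs (default)" cell. The scaffold row is not covered, because its empty namespace comes from `mcpp new` rather than from mcpp.toml.

```cpp
#include <cassert>
#include <string>

// Assumed default namespace, taken from the "mcpplibs (default)" table cell.
const std::string kDefaultNamespace = "mcpplibs";

struct RequestCoordinate {
    std::string ns;
    std::string short_name;
};

// table_ns is the namespace of the enclosing TOML table: "compat" for
// [dependencies.compat], empty for plain [dependencies]/[dev-dependencies].
RequestCoordinate to_request(const std::string& table_ns,
                             const std::string& key) {
    assert(!key.empty());
    if (!table_ns.empty()) return {table_ns, key};
    const auto dot = key.rfind('.');
    if (dot != std::string::npos) {
        // Legacy dotted key: "mcpplibs.cmdline" -> (mcpplibs, cmdline).
        return {key.substr(0, dot), key.substr(dot + 1)};
    }
    return {kDefaultNamespace, key};  // bare key falls back to the default
}

// Qualified name for a coordinate with a non-empty namespace.
std::string qname(const RequestCoordinate& c) {
    return c.ns + "." + c.short_name;
}
```

For example, `to_request("", "zlib")` gives (mcpplibs, zlib) and `to_request("compat", "zlib")` gives (compat, zlib), matching the first two rows.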

Match matrix (request → descriptor), shortName = zlib

Rows = the descriptor on disk (its filename, declared ns, declared name). Columns = the request coordinate. (absent) = field absent.

| # | filename | declared ns | declared name | (mcpplibs, zlib) bare/default | (compat, zlib) | (xim, zlib) non-default | ("", zlib) discovery |
|---|---|---|---|---|---|---|---|
| D1 | compat.zlib.lua | compat | compat.zlib | ✅ (compat alias) | ✅ | ❌ | ✅ |
| D2 | zlib.lua | (absent) | zlib | ✅ † | ❌ ★ | ❌ ‡ | ✅ |
| D3 | zlib.lua | mcpplibs | zlib | ✅ | ❌ | ❌ | ✅ |
| D4 | mcpplibs.zlib.lua | mcpplibs | mcpplibs.zlib | ✅ (=qname) | ❌ | ❌ | ✅ |
| D5 | zlib.lua | xim | zlib | ❌ | ❌ | ✅ | ✅ |
| D6 | (any) | (absent) | (absent) | ✅ (lenient) | ✅ (lenient) | ✅ (lenient) | ✅ (lenient) |

★ = the bug this PR fixes. (compat, zlib) request vs the upstream bare zlib.lua (D2, declares no namespace, no mcpp block) → rejected. Before this PR it was accepted whenever directory iteration hit xim-pkgindex before mcpplibs, producing index entry has no mcpp field.

D1 × (mcpplibs, zlib) = the gtest case. A bare/default-namespace request accepts its compat.<short> alias (the dev-dep gtest → compat.gtest). This is the second commit's fix.

Footnotes:

  • † (mcpplibs, zlib) × D2 (no-ns bare name) is gated by allowLegacyBareDefault: ✅ on the read path (read_xpkg_lua, flag defaults true, preserves legacy bare-named default-ns packages); ❌ during multi-candidate disambiguation (selectDependencyCandidate, flag false). This is the pre-existing default-namespace ambiguity noted as a follow-up; it is not introduced here.
  • ‡ (xim, zlib) × D2 is ❌ today because the content-only gate can't know a no-namespace file's owning index. The §4.1 follow-up (index-owned namespace → attribute a bare xim-pkgindex/.../zlib.lua as (xim, zlib)) would flip this to ✅. Not needed for this incident, and xim packages are toolchains resolved by the install path, not by [dependencies] descriptor reads.
  • Discovery ("") is intentionally lenient — match by short name (bare or qualified tail), accepting any declared namespace — because scaffold derives the real (ns, name) from the descriptor afterwards.
  • D6 (no declared name) is accepted everywhere: identity can't be verified, so the gate stays lenient rather than reject (no regression vs the old no-check read).
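The matrix and its footnotes can be condensed into one predicate. This is a sketch under stated assumptions, not the real `xpkg_lua_identity_matches()`: `Identity`, `identity_matches`, `kDefaultNamespace`, and `kDefaultSearchPath` are hypothetical names, and the declared identity is assumed to be already normalized to (ns, short name), so D1's `compat.zlib` and D4's `mcpplibs.zlib` both arrive with name `zlib`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>
#include <string_view>

struct Identity {
    std::string ns;    // empty = no namespace declared, or a discovery request
    std::string name;  // empty = descriptor declares no name (row D6)
};

constexpr std::string_view kDefaultNamespace = "mcpplibs";
constexpr std::array<std::string_view, 2> kDefaultSearchPath = {"mcpplibs", "compat"};

bool identity_matches(const Identity& declared, const Identity& request,
                      bool allow_legacy_bare_default = true) {
    assert(!request.name.empty());
    const std::string_view declared_ns = declared.ns;
    const std::string_view request_ns = request.ns;

    if (declared.name.empty()) return true;           // D6: cannot verify, stay lenient
    if (declared.name != request.name) return false;  // short names must agree
    if (request_ns.empty()) return true;              // discovery: any namespace
    if (request_ns == kDefaultNamespace) {
        // Dagger footnote: a no-namespace bare name is accepted only on the read path.
        if (declared_ns.empty()) return allow_legacy_bare_default;
        // Default request: walk the search path, where compat is just an entry.
        return std::find(kDefaultSearchPath.begin(), kDefaultSearchPath.end(),
                         declared_ns) != kDefaultSearchPath.end();
    }
    return declared_ns == request_ns;                 // qualified: exact namespace
}
```

The ★ cell corresponds to `identity_matches({"", "zlib"}, {"compat", "zlib"})`, which returns false because a qualified request demands an exact namespace.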

Which filenames are even probed (the "filename" dimension)

The filename only governs candidacy, not the match. Candidate lists (xpkg_lua_candidates, canonical first):

| request | candidate filenames searched (in order) |
|---|---|
| (mcpplibs, zlib) | zlib.lua, mcpplibs.zlib.lua, compat.zlib.lua |
| (compat, zlib) | compat.zlib.lua, zlib.lua |
| (xim, zlib) | xim.zlib.lua, zlib.lua, compat.xim.zlib.lua, compat.zlib.lua |
| ("", zlib) | zlib.lua, compat.zlib.lua |

A file is considered if it exists under one of these names; it is accepted only if it also passes the identity column above. So (compat, zlib) probes zlib.lua (row D2) but the identity gate rejects it — closing the collision regardless of directory order.
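A generator that reproduces exactly the four candidate lists above could look like this. It is reverse-engineered from the table rows only: `candidate_filenames` and `kDefaultNamespace` are hypothetical, and the real `xpkg_lua_candidates` may order or derive its candidates differently for inputs the table does not show.

```cpp
#include <cassert>
#include <string>
#include <vector>

const std::string kDefaultNamespace = "mcpplibs";  // assumed default namespace
const std::string kCompatNamespace = "compat";

// Candidate filenames for a request, in probe order, as listed in the table.
std::vector<std::string> candidate_filenames(const std::string& ns,
                                             const std::string& short_name) {
    assert(!short_name.empty());
    const std::string bare = short_name + ".lua";
    const std::string compat = kCompatNamespace + "." + short_name + ".lua";

    if (ns.empty()) return {bare, compat};            // discovery request
    if (ns == kDefaultNamespace) {
        return {bare, ns + "." + short_name + ".lua", compat};
    }
    if (ns == kCompatNamespace) return {compat, bare};
    // Any other namespace: qualified name first, then the fallbacks.
    return {ns + "." + short_name + ".lua", bare,
            kCompatNamespace + "." + ns + "." + short_name + ".lua", compat};
}
```

Note that `zlib.lua` appears in every list, which is why the bare upstream file is always probed and why candidacy alone cannot establish identity.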

Two follow-ups from review + a CI e2e regression:

1. Index-owned namespace (fixes e2e 49/51). A `[indices]` path index is scoped
   to one namespace, and its descriptors may declare only `name` (no namespace
   field) — the namespace is owned by the index. The first cut of the identity
   gate rejected such descriptors (`dependency 'local-dev.tinycfg': not found in
   local index`). Thread an `indexDefaultNs` into the gate: a no-namespace
   descriptor inherits the namespace of the single known index it was found in.
   `read_xpkg_lua_from_path` passes the request ns (the index it reads is scoped
   to exactly that namespace). The builtin global scan stays content-only — its
   per-file index→namespace map is the deferred §4.1 work.

2. De-hardcode `compat`. It is not special matching logic — it is the one entry
   in the default/unqualified-name search path. Promote it to a shared
   `kCompatNamespace` constant (dep_spec) used by both the identity gate and the
   candidate generator, instead of two independent string literals.

Tests: PmPackageFetcher.LocalPathIndexAttributesOwnNamespaceToNoNsDescriptor.
Local: mcpp test 21/21; e2e 49 + 51 now pass.

Identity is a 2-tuple (ns, name): ns is a hierarchical namespace path
(sub-namespaces), name is a single atomic segment (dotted name like a.b is just a
spelling of (a, b)). Filename / install-dir / candidate names are serializations
of this tuple, never independent keys. Normalization: owning-index namespace →
FQN → split on last dot. Matching = exact tuple equality (qualified) + namespace
search path (unqualified, with compat as a data entry, not a branch).

Replace the branchy identity gate with the canonical (ns, name) model (design
doc §4.2). A package identity is a 2-tuple: ns is a hierarchical namespace path,
name is a single atomic segment; every surface spelling (dotted name, embedded
prefix, missing namespace, owning-index ns) normalizes to it.

- canonical_xpkg_identity(declaredNs, declaredName, indexDefaultNs): 3-step
  normalization — owning-index namespace → FQN → split on the LAST dot.
- canonical_xpkg_identity_from_lua(): identity straight from a descriptor.
- xpkg_lua_identity_matches() rewritten on top: the single name must equal the
  request short name; then exact ns equality for qualified requests, or the
  default-namespace search path [mcpplibs, compat] for unqualified ones. compat
  is a data entry in the search path, not a logic branch. Behavior-preserving
  over the prior gate (verified by the full identity matrix + e2e).
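The 3-step normalization can be sketched as below. This is an illustration of the described steps, not the real `canonical_xpkg_identity`: the `Identity` struct and `canonical_identity` function are hypothetical, and the prefix check in step 2 (skip prepending when the name already starts with `<ns>.`) is an assumption inferred from the "prefix-embedded" and "idempotent qualified" paradigms.

```cpp
#include <cassert>
#include <string>

struct Identity {
    std::string ns;    // hierarchical namespace path, may contain dots
    std::string name;  // single atomic segment
};

Identity canonical_identity(const std::string& declared_ns,
                            const std::string& declared_name,
                            const std::string& index_default_ns = "") {
    assert(!declared_name.empty());
    // Step 1: owning-index namespace. A declared namespace takes precedence;
    // a descriptor without one inherits the namespace of its index.
    const std::string& ns = declared_ns.empty() ? index_default_ns : declared_ns;

    // Step 2: build the fully qualified name, unless the name already embeds
    // the namespace as a prefix (e.g. ns "compat", name "compat.zlib").
    std::string fqn = declared_name;
    const std::string prefix = ns + ".";
    const bool already_prefixed =
        declared_name.compare(0, prefix.size(), prefix) == 0;
    if (!ns.empty() && !already_prefixed) fqn = prefix + declared_name;

    // Step 3: split on the LAST dot; everything before it is the namespace.
    const auto dot = fqn.rfind('.');
    if (dot == std::string::npos) return {"", fqn};  // rootless bare name
    return {fqn.substr(0, dot), fqn.substr(dot + 1)};
}
```

Under these assumptions, `("compat", "compat.zlib")` and `("compat", "zlib")` both normalize to (compat, zlib), and a bare `tinycfg` found in an index scoped to `local-dev` becomes (local-dev, tinycfg).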

Tests: new CanonicalIdentity suite covers every §4.2 paradigm (prefix-embedded,
bare+combine, idempotent qualified, index attribution, declared-ns precedence,
hierarchical/nested ns, dotted-name split, rootless bare, from-lua). Local:
mcpp test 21/21; e2e 49/51/63/27/18 green.

Add ci-source-build.yml: a minimal per-PR gate that checks out this repo,
installs xlings from openxlings/xlings (quick_install.sh), verifies
`xlings --version`, and `mcpp build`s the source end to end with the
freshly-built binary running `--version`.

Fills a gap: ci-linux.yml builds the source but bootstraps from the
d2learn/xlings tarball; ci-fresh-install.yml uses openxlings but only tests the
released mcpp on sample projects, never the PR source. This proves the source
builds from a fresh openxlings install.

- Remove the separate ci-source-build.yml; instead add one integration step to
  ci-{linux,macos,windows}.yml: the self-hosted mcpp built from THIS PR
  ($MCPP=/tmp/mcpp-fresh on linux/macos, $MCPP_SELF on windows) git-clones
  openxlings/xlings (which ships its own mcpp.toml) and `mcpp build` + `mcpp run`
  it — proving the freshly-built mcpp can build a real external C++ project.
- Bump version 0.0.56 → 0.0.57 (mcpp.toml, fingerprint MCPP_VERSION) + CHANGELOG.

Sunrisepeak merged commit 4bd2db3 into main on Jun 19, 2026.
3 checks passed.
Sunrisepeak deleted the fix/pkg-resolution-identity-first branch on June 19, 2026 18:52.