feat: Docker distribution + determinism receipts + SCIP/LSP breadth (spec 008) #243
theagenticguy wants to merge 14 commits into
…on + breadth) Plan-phase durables for session-893add. Binary track dropped; Docker multistage pnpm+Node24 is the sole non-npm artifact. 3 tracks, 10 Act packets, wave graph. Q1 resolved: amend ADR 0005 (new ADR 0019) for a quarantined Tier-3 LSP fallback.
Read protocolVersion/clientInfo/clientCapabilities from per-request _meta via withProtocolGate proxy over all 29 tools; UnsupportedProtocolVersionError on mismatch. SDK@1.29.0 lacks 2026-07-28 so transport handshake stays SDK-native; full negotiation is a documented TODO. T-C9, spec 008 E-C9/AC-C14/U7.
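A minimal sketch of how a per-request gate of this kind could look. Only the names `withProtocolGate` and `UnsupportedProtocolVersionError` come from the commit message; the request shape, the handler type, the `supported` parameter, and the choice to let requests without a version pass through are assumptions made for illustration, not the PR's actual code.

```typescript
// Hypothetical request/handler shapes, for illustration only.
type Meta = { protocolVersion?: string };
type ToolRequest = { params?: { _meta?: Meta; [key: string]: unknown } };
type ToolHandler = (req: ToolRequest) => unknown;

class UnsupportedProtocolVersionError extends Error {
  requested: string;
  constructor(requested: string, supported: readonly string[]) {
    super(`unsupported protocol version ${requested}; supported: ${supported.join(", ")}`);
    this.name = "UnsupportedProtocolVersionError";
    this.requested = requested;
  }
}

// Wrap every tool handler in one Proxy so the version check runs per request.
function withProtocolGate<T extends Record<string, ToolHandler>>(
  tools: T,
  supported: readonly string[],
): T {
  return new Proxy(tools, {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver);
      if (typeof value !== "function") return value;
      return (req: ToolRequest) => {
        const requested = req.params?._meta?.protocolVersion;
        // Assumption: a request that carries no version is allowed through.
        if (requested !== undefined && !supported.includes(requested)) {
          throw new UnsupportedProtocolVersionError(requested, supported);
        }
        return (value as ToolHandler)(req);
      };
    },
  });
}

// Usage with a single hypothetical tool.
const gated = withProtocolGate(
  { echo: (req: ToolRequest) => req.params?.text ?? null },
  ["2026-07-28"],
);
```

Because the check lives in the proxy rather than in each handler, adding a tool to the catalog does not require touching the gate.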
…stability Emit skeleton/file-tree/deps first, volatile ast-chunks/findings/embeddings last, so a byte-identical pack maximizes the cache-eligible prompt prefix (0.1x read). Docs lead with cache-prefix stability over token savings. packHash byte-identity holds (no golden literal; determinism suite asserts cross-run equality). T-C2, AC-C4/E-C5.
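The ordering idea can be sketched as follows. The section kinds and their order are taken from the commit message; the `Section` type, the rendering format, and the helper names are hypothetical. The point is only that two runs which differ in a volatile section still share a byte-identical prefix covering the stable sections.

```typescript
type Section = { kind: string; body: string };

// Order from the commit message: stable kinds first, volatile kinds last.
const EMIT_ORDER = ["skeleton", "file-tree", "deps", "ast-chunks", "findings", "embeddings"];

function orderSections(sections: readonly Section[]): Section[] {
  const rank = (kind: string): number => {
    const i = EMIT_ORDER.indexOf(kind);
    return i === -1 ? EMIT_ORDER.length : i; // unknown kinds sort last
  };
  // Tie-break on kind so the output does not depend on input order.
  return [...sections].sort(
    (a, b) => rank(a.kind) - rank(b.kind) || (a.kind < b.kind ? -1 : a.kind > b.kind ? 1 : 0),
  );
}

// Hypothetical rendering: one header line per section, then its body.
function render(sections: readonly Section[]): string {
  return orderSections(sections)
    .map((s) => `## ${s.kind}\n${s.body}\n`)
    .join("");
}

// Length of the common leading run of two strings.
function sharedPrefixLength(a: string, b: string): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

// Two runs that differ only in the volatile "findings" section.
const runA = render([
  { kind: "findings", body: "lint: 3 warnings" },
  { kind: "skeleton", body: "src/index.ts" },
  { kind: "deps", body: "typescript" },
]);
const runB = render([
  { kind: "skeleton", body: "src/index.ts" },
  { kind: "deps", body: "typescript" },
  { kind: "findings", body: "lint: 5 warnings" },
]);
const stablePrefix = "## skeleton\nsrc/index.ts\n## deps\ntypescript\n";
```

Here `sharedPrefixLength(runA, runB)` is at least `stablePrefix.length`, which is the portion a prompt cache could reuse across the two runs.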
Builder installs+builds+pnpm-deploy-prunes; node:24-slim runtime carries the pruned closure + wasm grammars, embedder removed. och-mcp shim runs stdio MCP via docker run -i. scope-enum += docker; ROADMAP rejects the single-binary track. Lite ~600MB (lockfile-faithful: DuckDB+graph natives+SCIP TS compilers). T-B1.
prove() emits an in-toto SLSA-v1 statement whose subject sha256 == manifest packHash, predicate carries (commit, tokenizer, budget, pins) + BOM inputs; keyless cosign sign-blob (degrades to documented-cmd when cosign absent). replay re-derives + byte-compares: strict drift exits non-zero naming the item, best_effort mismatch is expected-drift. Fixes code-pack manifest commit:'' so packs are replayable. cosign live-sign is env-gated, not faked. T-C1, E-C1/E-C2/AC-C3/U2.
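A rough sketch of the receipt and replay flow described above. The `_type` and `predicateType` URIs are the published in-toto Statement v1 and SLSA provenance v1 identifiers; everything else (the `Item` type, how the hash is derived from items, the simplified predicate layout, and the function signatures) is an assumption for illustration and is not schema-complete. Signing with cosign is deliberately left out.

```typescript
import { createHash } from "node:crypto";

type Item = { name: string; content: string };
type Statement = {
  _type: "https://in-toto.io/Statement/v1";
  subject: { name: string; digest: { sha256: string } }[];
  predicateType: string;
  predicate: Record<string, unknown>;
};
type ReplayResult = { reproduced: true } | { reproduced: false; drifted: string };

const sha256Hex = (data: string): string => createHash("sha256").update(data).digest("hex");

// Hypothetical packHash: hash of the name-sorted list of per-item digests.
function derivePackHash(items: readonly Item[]): string {
  const sorted = [...items].sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  return sha256Hex(sorted.map((i) => `${sha256Hex(i.content)}  ${i.name}\n`).join(""));
}

// Emit a statement whose subject digest equals the pack hash.
function prove(items: readonly Item[], inputs: Record<string, unknown>): Statement {
  return {
    _type: "https://in-toto.io/Statement/v1",
    subject: [{ name: "code-pack", digest: { sha256: derivePackHash(items) } }],
    predicateType: "https://slsa.dev/provenance/v1",
    predicate: { inputs }, // simplified; e.g. commit, tokenizer, budget, pins
  };
}

// Re-derive, compare against the receipt, and name the first drifted item.
function replay(
  statement: Statement,
  recorded: readonly Item[],
  rederived: readonly Item[],
): ReplayResult {
  if (derivePackHash(rederived) === statement.subject[0]?.digest.sha256) {
    return { reproduced: true };
  }
  const byName = new Map(recorded.map((i) => [i.name, i.content] as const));
  const drifted = rederived.find((i) => byName.get(i.name) !== i.content);
  return { reproduced: false, drifted: drifted?.name ?? "(item set changed)" };
}

const items: Item[] = [
  { name: "skeleton", content: "src/index.ts" },
  { name: "deps", content: "typescript" },
];
const statement = prove(items, { commit: "0000000" });
```

In a strict mode of this sketch, a caller would exit non-zero whenever `replay` returns `reproduced: false` and print the `drifted` name.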
server/discover advertises identity + lex-sorted protocol versions + the live 29-tool catalog (app-level handler; SDK@1.29.0 has no native discover). Remove ping; logging.setLevel + roots.list_changed never installed; log level via per-request _meta.logLevel. tools/list, resources/list+read carry ttlMs + cacheScope (not etag). README documents the stdio-only rail. T-C10-13, E-C10/E-C11/E-C12/AC-C13.
…+ CI Adds jre-build (jlink JRE-21, 62MB + scip-java 0.12.3), scip-go-dl (SHA-verified scip-go v0.2.7 per-arch), and a full target (FROM lite + indexer toolchains + uv). docker.yml builds lite+full for amd64+arm64 and smoke-tests och-mcp + indexers; all actions SHA-pinned. No GPL/MPL binaries. Lite stage untouched. T-B2, E-D2/E-D3/AC-D6/AC-D7.
IndexerKind += php/dart, ALLOWED_COMMANDS += scip-php/scip_dart, detectLanguages maps composer.json/pubspec.yaml; both gated behind --allow-build-scripts. New SCIP_UNOFFICIAL_PROVENANCE_PREFIXES (Tier 1.5, distinct from first-party scip:), surfaced in confidence-breakdown. scip_dart binary is underscore (verified vs upstream tag). ADR 0006 refreshed (scip-code/scip-go@v0.2.7, scip@0.8.1). T-A-S.
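As a small illustration of the mapping this commit describes: the manifest file names and the binary names (`scip-php`, `scip_dart`) come from the commit message, while `planIndexers`, its options object, and its return shape are hypothetical stand-ins for the real `detectLanguages` and allowlist code.

```typescript
type IndexerKind = "php" | "dart";

// Manifest file -> indexer kind, per the commit message.
const MANIFEST_TO_KIND = new Map<string, IndexerKind>([
  ["composer.json", "php"],
  ["pubspec.yaml", "dart"],
]);

// Note the underscore in scip_dart, as called out in the commit message.
const COMMAND_FOR: Record<IndexerKind, string> = { php: "scip-php", dart: "scip_dart" };

// Hypothetical planner: both indexers are skipped unless build scripts are allowed.
function planIndexers(
  rootFiles: readonly string[],
  opts: { allowBuildScripts: boolean },
): { kind: IndexerKind; command: string }[] {
  if (!opts.allowBuildScripts) return [];
  const plan: { kind: IndexerKind; command: string }[] = [];
  for (const file of rootFiles) {
    const kind = MANIFEST_TO_KIND.get(file);
    if (kind !== undefined) plan.push({ kind, command: COMMAND_FOR[kind] });
  }
  return plan;
}
```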
New @opencodehub/lsp-tier vendors agent-lsp logic (workspace/symbol + blast_radius) for Swift/Zig/Elixir/Terraform/Clojure etc. Facts tagged lsp:<bin>@<ver>, canonically re-sorted, kept in a packHash-EXCLUDED sidecar — packHash byte-identical with/without Tier-3 (proven by quarantine.test). Opt-in only (O-A7); warmup hard-fail (S-A4b); per-wrapped-server SPDX audit (AC-A5). ADR 0019 amends 0005. T-A-L.
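The quarantine invariant can be illustrated with a toy model. The `Fact` type, `hashCore`, and `buildPack` are hypothetical names, and the real packHash is certainly computed differently; what the sketch shows is the property the commit relies on: the hash covers only core facts, so attaching a sorted sidecar of `lsp:<bin>@<ver>` facts cannot change it.

```typescript
import { createHash } from "node:crypto";

type Fact = { id: string; provenance: string };

const byId = (a: Fact, b: Fact): number => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0);

// Hash over canonically sorted core facts only; sidecar facts never enter here.
function hashCore(core: readonly Fact[]): string {
  const canonical = [...core].sort(byId).map((f) => `${f.id}\t${f.provenance}\n`).join("");
  return createHash("sha256").update(canonical).digest("hex");
}

// Hypothetical pack builder: Tier-3 facts are sorted and kept beside the pack.
function buildPack(
  core: readonly Fact[],
  lspFacts: readonly Fact[] = [],
): { packHash: string; sidecar: Fact[] } {
  return { packHash: hashCore(core), sidecar: [...lspFacts].sort(byId) };
}

const core: Fact[] = [
  { id: "sym:b", provenance: "scip:example" },
  { id: "sym:a", provenance: "scip:example" },
];
const tier3: Fact[] = [
  { id: "sym:z", provenance: "lsp:examplels@1.0.0" },
  { id: "sym:y", provenance: "lsp:examplels@1.0.0" },
];
const without = buildPack(core);
const withTier3 = buildPack(core, tier3);
```

Comparing `without.packHash` with `withTier3.packHash` is the toy equivalent of the with/without byte-identity test mentioned in the commit.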
… + stale-dist) Two durable ERPAVal lessons: (1) quarantine nondeterministic LSP/heuristic facts in a packHash-excluded sidecar to extend breadth without eroding determinism (the load-bearing byte-identical-with/without test); (2) new workspace package or export reads as a phantom missing-member error until clean rebuild + relink.
…plit Lite was described by what it omits (no JVM/scip-go), which read as 'no SCIP'. It bakes in scip-typescript + scip-python (CLI prod deps). Full adds the trimmed JRE + remaining indexers + uv; lite fetches those via codehub setup on demand.
code-pack staged the BOM under os.tmpdir() then rename()d into the repo's .codehub/packs/<hash>/. When /tmp is a different mount (tmpfs) from the repo (EFS/NFS), rename throws EXDEV: cross-device link not permitted and the pack crashes. Stage under the destination's own parent dir so the move is an atomic on-device rename. Found running pack --prove live on an EFS-backed checkout.
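A minimal sketch of the fix, using only Node's standard `fs`, `os`, and `path` modules. `writePackAtomically` and the `.staging-` prefix are hypothetical names; the real code stages a BOM rather than an arbitrary file map. Staging next to the destination keeps both paths under the same parent directory, so `rename` does not cross a mount boundary the way a move out of `os.tmpdir()` can.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Write files into a staging dir beside destDir, then move it into place.
function writePackAtomically(destDir: string, files: Record<string, string>): void {
  const parent = dirname(destDir);
  mkdirSync(parent, { recursive: true });
  // Sibling of the destination: same parent directory, hence same device.
  const staging = mkdtempSync(join(parent, ".staging-"));
  try {
    for (const [name, body] of Object.entries(files)) {
      writeFileSync(join(staging, name), body);
    }
    // Assumes destDir does not exist yet (or is an empty directory).
    renameSync(staging, destDir);
  } catch (err) {
    rmSync(staging, { recursive: true, force: true });
    throw err;
  }
}

// Demo only: the repo root here happens to live in tmpdir, but the staging
// directory is still created beside the destination, not in tmpdir itself.
const root = mkdtempSync(join(tmpdir(), "och-pack-"));
const dest = join(root, ".codehub", "packs", "abc123");
writePackAtomically(dest, { "manifest.json": "{}" });
const written = readFileSync(join(dest, "manifest.json"), "utf8");
```

An alternative would be to catch `EXDEV` and fall back to copy-then-delete, but that gives up the atomic move the commit is after.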
…p-tier status README was missing the spec-008 surface: add a 'prove a code-pack is reproducible' section (code-pack --prove + replay + keyless cosign + offline verify), fix the stale 28->29 tool count (server.test asserts 29), note PHP/Dart scip-unofficial tier + the SCIP-blind lsp-tier, and add a 'Since v1' status para that honestly flags the lsp-tier live backend (+ --tier3-lsp flag) as the remaining follow-up.
# Default to the stdio MCP server. `docker run -i` keeps stdin open for the
# JSON-RPC stream; override the command (e.g. `... codehub analyze`) to drive
# the CLI. No EXPOSE / port / listener — stdio is the only transport (U9).
ENTRYPOINT []
CMD ["och-mcp"]
lsp-tier's quarantine.test.ts imported @opencodehub/pack, but the graph is lsp-tier -> pack -> ingestion -> lsp-tier, so a tsconfig ref back to pack is a TS6202 circular project graph. Local check passed only on stale dist. Move the test into pack (downstream of lsp-tier, cycle-free), importing the sidecar writer from @opencodehub/lsp-tier. The invariant is pack's hash guarantee anyway. Fixes the CI typecheck/test failures on PR #243.
4cfff0b to bdef5ff
Summary
Distribution + determinism + breadth for OpenCodeHub (ERPAVal spec 008). Adds a Docker distribution channel, turns the deterministic code-pack into a signed/replayable receipt, conforms the MCP server to the 2026-07-28 stateless model, and extends language coverage — without eroding the byte-identical packHash contract.
Branch: 13 commits, sequenced by dependency (Docker → SCIP/LSP breadth ‖ determinism/MCP). Full spec at .erpaval/specs/008-distribution-determinism-breadth/.
Tracks
B — Docker distribution
node:24 + pnpm 11 lite image (parser + graph + CLI + stdio MCP + TS/Python SCIP) and full multi-arch image (+ jlink JRE + scip-java/go + uv).
.github/workflows/docker.yml builds both arches + smoke-tests och-mcp and the bundled indexers. No HTTP surface: docker run -i stdio only.
C — Determinism receipts + MCP conformance
code-pack --prove emits an in-toto/SLSA-v1 statement whose subject digest is the packHash; codehub replay <hash> re-derives byte-for-byte and names the drifted item on a mismatch. Keyless cosign (CI-signed; degrades honestly off-CI).
_meta protocol negotiation, server/discover, ttlMs/cacheScope, deprecated methods removed. (SDK predates 2026-07-28, so the version pin is a documented TODO; no hand-rolled transport.)
A — Language breadth
scip-unofficial (Tier 1.5) confidence label, distinct from first-party scip:. ADR 0006 refreshed.
@opencodehub/lsp-tier: a packHash-quarantined LSP tier-3 for SCIP-blind languages (Swift/Zig/Elixir/Terraform/Clojure). ADR 0019 amends ADR 0005. Quarantine proven by a with/without-Tier-3 byte-identical test.
Verified
mise run check green tree-wide (lint + typecheck + tests + banned-strings); packHash + graphHash byte-identity suites pass; license allowlist green.
pack --prove / replay proven live locally (cosign 3.1.1): subject == packHash; clean → "reproduced"; tampered byte → "NOT reproduced — drifted item".
Known follow-ups (documented, not blockers)
agent-lsp-spawning backend + codehub analyze --tier3-lsp flag are not yet wired; SCIP-blind languages stay on tree-sitter until then.
id-token: write); replay needs no cosign.
opencodehub-testbed repo (not in this PR).
🤖 Generated with ERPAVal (session-893add).