feat: runner-aware tools #346

Draft
branchseer wants to merge 24 commits into feat/output-restoration from runner-aware-tools
@branchseer branchseer commented Apr 18, 2026

Give task runners a bidirectional IPC channel with the processes they spawn, so:

  • tools can declare at runtime what inputs they actually read / outputs they actually produced
  • tools can request that additional env vars be tracked (or not) in the cache key
  • tools can tell the runner "don't cache me this time"
  • the runner feeds all of that back into its caching decisions

Design notes: docs/runner-task-ipc/.
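
As a rough illustration of the four capabilities listed above, the sketch below models them as one request enum plus a per-task report that the runner could consult when deciding what to cache. Every name here (`Request`, `Report`, `record`) is hypothetical and chosen for illustration; the real message types live in `vite_task_ipc_shared` and may look quite different.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Hypothetical message shapes, one per capability in the PR description.
#[derive(Debug, Clone, PartialEq)]
pub enum Request {
    IgnoreInput(PathBuf),
    IgnoreOutput(PathBuf),
    GetEnv { name: String, tracked: bool },
    DisableCache,
}

/// What the runner has learned from one task execution (assumed shape).
#[derive(Debug, Default, PartialEq)]
pub struct Report {
    pub ignored_inputs: Vec<PathBuf>,
    pub ignored_outputs: Vec<PathBuf>,
    /// Env name -> value the tool observed (`None` if the variable was unset).
    pub tracked_envs: BTreeMap<String, Option<String>>,
    pub cache_disabled: bool,
}

impl Report {
    /// Applies one request. Returns the env value for `GetEnv`, `None` otherwise.
    pub fn record(&mut self, req: Request, env: &BTreeMap<String, String>) -> Option<String> {
        match req {
            Request::IgnoreInput(path) => {
                self.ignored_inputs.push(path);
                None
            }
            Request::IgnoreOutput(path) => {
                self.ignored_outputs.push(path);
                None
            }
            Request::GetEnv { name, tracked } => {
                let value = env.get(&name).cloned();
                // Only tracked lookups are remembered for the cache key.
                if tracked {
                    self.tracked_envs.insert(name, value.clone());
                }
                value
            }
            Request::DisableCache => {
                self.cache_disabled = true;
                None
            }
        }
    }
}
```

In this model only `GetEnv` produces a reply; the other three requests are fire-and-forget, which is one plausible reading of the design rather than a statement about the actual wire protocol.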

What's in this PR

  • Step 1 — Protocol (vite_task_ipc_shared): message types + serialization shared by both ends.
  • Step 2 — Transport (vite_task_server + vite_task_client): async server, sync blocking client, tested Rust-to-Rust.
  • Step 3 — Extract artifact crate out of fspy for dylib embedding. (Landed on main as materialized_artifact via #344, "refactor: extract materialized_artifact crate out of fspy".)
  • Step 4 — JS bridge: vite_task_client_napi + @voidzero-dev/vite-task-client JS wrapper (with fetchEnvs dedupe logic).
  • Step 5 — Runner integration: server started per task execution, client dylib embedded/extracted, IPC envs injected via serve()'s returned iterator.
  • Step 6 — Cache integration: runner consumes reported ignored inputs/outputs, tracked env requests, and disable-cache signals when fingerprinting.

Test plan

  • Rust integration tests for server/client transport (vite_task_server/tests/integration.rs)
  • NAPI e2e tests (vite_task_client_napi/tests/e2e.rs)
  • E2E snapshot fixtures per client method: ignore_input, ignore_output, fetch_env, disable_cache
  • E2E test caching a real vite build via a patched Vite plugin (vite_build_cache fixture)

@branchseer branchseer changed the base branch from main to graphite-base/346 April 20, 2026 02:19
@branchseer branchseer changed the base branch from graphite-base/346 to feat/output-restoration April 20, 2026 02:19
Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.

@branchseer branchseer changed the title from "feat(ipc): runner-aware tools — protocol + transport (partial)" to "feat: runner-aware tools" Apr 20, 2026
branchseer and others added 17 commits April 20, 2026 12:18
- vite_task_ipc_shared: shared protocol (Request/GetEnvResponse, NativeStr)
- vite_task_server: per-task IPC server (Handler trait + Recorder)
- vite_task_client: sync Rust client
- vite_task_client_napi + @voidzero-dev/vite-task-client: node addon + JS wrapper
- vite_task: wire IPC server into spawn; inject VP_IPC + VP_RUN_NODE_CLIENT_PATH;
  bundle with fspy via Tracking struct; materialize .node addon on first use

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 6 of docs/runner-task-ipc/plan.md.

- Apply `ignoreInputs` to filter inferred fspy reads (directory-aware)
- Apply `ignoreOutputs` to filter auto-detected writes (overlap check + archive)
- Short-circuit cache update on `disableCache()` via new
  `CacheNotUpdatedReason::ToolRequested`
- Embed `tracked: true` envs in `PostRunFingerprint.tracked_envs`; validate
  on lookup by comparing against the current parent env
- Recorder env_map sources from `std::env::vars_os()` so tools can resolve
  envs the user never declared
- Bump cache schema to 13
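
The directory-aware `ignoreInputs` filter from the first bullet can be sketched as follows. This is a minimal illustration assuming absolute, already-normalized paths; the function names are hypothetical and this is not the code in this commit.

```rust
use std::path::{Path, PathBuf};

/// True if `path` is one of the ignored entries or lies underneath one.
/// `Path::starts_with` compares whole components, so `/repo/dist2` is not
/// treated as being inside `/repo/dist`.
fn is_ignored(path: &Path, ignored: &[PathBuf]) -> bool {
    ignored.iter().any(|dir| path.starts_with(dir))
}

/// Drops every inferred read that falls under an ignored directory.
fn filter_inferred_reads(reads: Vec<PathBuf>, ignored: &[PathBuf]) -> Vec<PathBuf> {
    reads
        .into_iter()
        .filter(|path| !is_ignored(path, ignored))
        .collect()
}
```

Comparing by path component rather than by string prefix is what makes the check directory-aware: a sibling directory that merely shares a name prefix with an ignored one is still tracked.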

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New fixture `ipc_client_test` exercises each IPC method through the JS
wrapper (@voidzero-dev/vite-task-client) inside a real cached task:

- ignoreInput → the ignored dir can mutate without invalidating cache
- ignoreOutput → read-write overlap under an ignored dir still caches
- disableCache → forces re-execution on next run
- fetchEnv(tracked: true) → env change invalidates cache; same value hits

The e2e harness now copies packages/vite-task-client into each staging
node_modules so fixtures can `import { ... } from "@voidzero-dev/vite-task-client"`
without pnpm install.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Applies a small pnpm patch to vite 8.0.8 that auto-injects a runner-aware
plugin at plugin-resolution time. When `VP_RUN_NODE_CLIENT_PATH` is set
(i.e. the child runs under `vp run`), the plugin:
- `ignoreInput(outDir)` — suppress fspy reads of the output dir (emptyDir
  scans dist/ before writing)
- `ignoreInput/Output(<root>/node_modules)` — machine state (pnpm store +
  vite's `.vite`/`.vite-temp` caches) is not user input/output
- `getEnv("NODE_ENV", true)` — tracked; drives DCE and define replacements

New e2e fixture `vite_build_cache` proves `vt run --cache build` produces
a cache hit on the second run and restores `dist/assets/main.js` after
deletion, all with zero manual input/output configuration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Extensions

Rework `patches/vite.patch` to match the shape of the eventual upstream
Vite PR:

- Drop the synthetic `vite:runner-aware` plugin. Each IPC call is now
  inlined right at the Vite code that triggers the fs / env access:
  - `ignoreInput(outDir)` in `prepareOutDir` before `emptyDir` scans it
  - `ignoreInput(depsCacheDir)` + `ignoreOutput(depsCacheDir)` in
    `loadCachedDepOptimizationMetadata` before the dep optimizer cache
    is read / written
  - `fetchEnv("NODE_ENV", { tracked: true })` in `resolveConfig` before
    `process.env.NODE_ENV` is first consulted
  - `ignoreInput`/`ignoreOutput` of `.vite-temp/` in
    `loadConfigFromBundledFile` (bundled-config temp write+import)
- Static `import` of `@voidzero-dev/vite-task-client` by name — the
  wrapper no-ops when no runner is connected, so no guard is needed at
  the call sites.
- Add a `packageExtensions` entry in `pnpm-workspace.yaml` that injects
  the wrapper as a real dependency of Vite. The final upstream PR would
  instead declare it in `packages/vite/package.json`; the only delta
  between experiment and PR is that one line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous snapshot embedded Vite's minified JS output, which would
churn on every Vite version bump. Add a tiny `vtt stat-file` helper that
reports `exists` / `missing` and use that instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Demonstrates end-to-end that Vite's patched `fetchEnv("NODE_ENV", { tracked: true })`
reaches the runner: flipping NODE_ENV between runs yields `tracked env
'NODE_ENV' changed`, while holding it constant still produces a cache hit.
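
The lookup-time check behind that `tracked env 'NODE_ENV' changed` message might look roughly like the sketch below. It assumes the fingerprint stores each tracked name together with the value the tool saw (or `None` if unset); the function name and map types are hypothetical.

```rust
use std::collections::BTreeMap;

/// Returns the name of the first tracked env whose current value differs
/// from the recorded one, or `None` if every tracked env still matches
/// (in which case the cached result may be reused).
fn first_changed_tracked_env<'a>(
    recorded: &'a BTreeMap<String, Option<String>>,
    current: &BTreeMap<String, String>,
) -> Option<&'a str> {
    for (name, old) in recorded {
        // An env that was unset at record time must still be unset now.
        if current.get(name) != old.as_ref() {
            return Some(name.as_str());
        }
    }
    None
}
```

Recording unset variables as `None` matters here: a tool that probed for a variable and found nothing should be invalidated if that variable later appears.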

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous case proved the cache invalidated when NODE_ENV flipped, but
not that the tool actually used the new value. Source now carries a
`process.env.NODE_ENV` branch whose marker (`BUILD_MODE_PROD` /
`BUILD_MODE_DEV`) is DCE-pruned by Vite's define + minifier, so only the
branch matching the current mode survives in the output.

Add a `vtt grep-file` helper to inspect the bundle without dumping its
whole (minified) body into the snapshot, and assert both markers against
the production and development builds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nv caching

Makes the effect of NODE_ENV changes visible in `dist/assets/main.js`: the
bundle contains only the surviving literal (`PROD build` or `DEV build`)
after Vite's define-plugin substitution + DCE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…olves

Also update the inspection hint in the comment to match the default
`dist/assets/index-<hash>.js` filename now that vite.config.js is gone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… fields

Follows the convention introduced in main (#347): per-`[[e2e]]` and per-
step descriptions use the TOML `comment` field instead of bare `#` lines,
so they render under the snapshot headings and inside each step's block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The accept loop's `tokio::select!` could exit via the shutdown branch
before ever observing a connection that had already been established at
the kernel level, so fire-and-forget clients that connect, write, and
exit right before the runner signals stop_accepting would silently lose
their requests. After the main loop exits we now do one non-blocking
`poll!` of `listener.accept()` per iteration until it returns Pending,
ensuring every backlog-queued connection gets its handle_client future
pushed and drained.

Also:
- drop the now-redundant `crates/vite_task_client_napi/tests/e2e.rs`;
  the IPC path is covered end-to-end by the `ipc_client_test` fixture
  plus `vite_build_cache`
- oxfmt the fixture scripts and the JS wrapper

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@branchseer branchseer force-pushed the feat/output-restoration branch from 0008bd7 to 994624a Compare April 20, 2026 04:20
branchseer and others added 5 commits April 20, 2026 12:36
…ests

Two CI fixes rolled together:

1. `cargo-shear --deny-warnings` failed after the removal of
   `vite_task_client_napi/tests/e2e.rs`: the crate still listed the
   test's deps (rustc-hash, tokio, vite_task_server, vite_path) and the
   workspace still referenced `vite_task_client_napi` in non-shear-aware
   ways. Drop those deps from the napi crate and add
   `vite_task_client_napi` to the workspace-level cargo-shear ignore list
   (same rationale as fspy_preload_*: it's an artifact dep loaded by
   string name, not `use`-d in Rust).

2. Revert the speculative server-side drain-accept loop — on Windows
   the interprocess Listener's named-pipe implementation crashed the
   integration test binary at startup (no tests even ran). Instead,
   have each fire-and-forget test end with a tiny `flush(&client)`
   round-trip (a cheap `get_env` that waits for a response). Since
   frames on a single stream are read sequentially by the server, once
   the flush's response returns, every preceding fire-and-forget frame
   has definitely been dispatched to the handler — no server-side race
   fix needed. 10/10 repeat runs pass locally.
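
The ordering argument in item 2 can be modelled without any real IPC. The sketch below substitutes an in-process `std::sync::mpsc` channel for the stream and a thread for the server; `Frame` and `send_then_flush` are hypothetical names, not the test helper added in this commit.

```rust
use std::sync::mpsc;
use std::thread;

enum Frame {
    /// Fire-and-forget: the server sends no reply.
    Notify(String),
    /// Round-trip: the server replies with everything handled so far.
    Flush(mpsc::Sender<Vec<String>>),
}

/// Sends `notes` as fire-and-forget frames, then one flush, and returns what
/// the server had handled at the moment it answered the flush.
fn send_then_flush(notes: &[&str]) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<Frame>();
    // The "server" reads frames strictly in the order they were sent.
    let server = thread::spawn(move || {
        let mut handled = Vec::new();
        for frame in rx {
            match frame {
                Frame::Notify(note) => handled.push(note),
                Frame::Flush(reply) => {
                    let _ = reply.send(handled.clone());
                }
            }
        }
    });
    for note in notes {
        tx.send(Frame::Notify(note.to_string())).unwrap();
    }
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(Frame::Flush(reply_tx)).unwrap();
    // Blocks until the server reaches the flush frame.
    let seen = reply_rx.recv().unwrap();
    drop(tx); // closing the channel lets the server loop end
    server.join().unwrap();
    seen
}
```

Because a single reader consumes frames in order, the flush reply cannot arrive before every earlier frame has been dispatched, which is the property the tests rely on.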

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`cargo-shear 1.11.1 --deny-warnings` treats the 'test = true on lib
target X but source contains no tests' messages as errors. Add
`test = false` (plus `doctest = false` where missing) to the `[lib]`
sections of the four IPC crates so cargo does not generate empty test
harnesses for them. Integration tests in `tests/*` are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI's `RUSTDOCFLAGS='-D warnings' cargo doc --no-deps --document-private-items`
fails on the `[`SpawnFingerprint`]` link in `collect_tracked_envs`'s
docstring — it's not in scope at that site. Rewrite the prose to drop
the link; no information lost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
typos 1.45.1 rejected the PR because:
- `./patches/vite.patch` includes Vite's own hunk-header line containing
  a truncated identifier (`environmen`) that looks like a typo but isn't
  ours to fix. Add `patches` to `.typos.toml` extend-exclude.
- `docs/runner-task-ipc/index.md:39` had a real typo `respone` → `respond`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
branchseer and others added 2 commits April 20, 2026 13:09
…c abs paths

On Windows, forward-slash paths without a drive letter (`/tmp/x.txt`)
are RELATIVE, so the client's `resolve_path` joined them with the cwd
(`D:\...\tmp\x.txt`) and the server-side assertion blew up. Use
`/tmp/` on unix and `C:\tmp\` on windows so the paths are absolute on
each platform and reach the server unchanged.
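
A small sketch of the platform split described above, using only `std::path`. Both `resolve_path` and `platform_abs_fixture` are hypothetical stand-ins (the first is a simplified guess at what a client-side resolver does, not the actual `resolve_path`). `Path::is_absolute` returns false on Windows for a path that has a root but no drive prefix, which is why `/tmp/x.txt` is not passed through there.

```rust
use std::path::{Path, PathBuf};

/// Simplified stand-in: absolute paths pass through unchanged, anything
/// else is joined onto the working directory.
fn resolve_path(cwd: &Path, path: &Path) -> PathBuf {
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        cwd.join(path)
    }
}

/// Builds a test path that is absolute on the platform being compiled for.
fn platform_abs_fixture(file: &str) -> PathBuf {
    #[cfg(windows)]
    let base = r"C:\tmp";
    #[cfg(not(windows))]
    let base = "/tmp";
    Path::new(base).join(file)
}
```

With a fixture built this way, the path the test sends and the path the server receives are the same on both platforms, so the server-side assertion can compare them directly.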

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On Windows CI these ignored tests crash their child processes with
"failed to start the persistent thread of the Interprocess linger pool:
Access is denied" from interprocess 2.4 as soon as the Node addon's
client connects. The server-side unit tests on Windows already cover
the IPC protocol; the crash is a downstream interprocess crate issue
that doesn't affect our code paths. Add `platform = "unix"` so the
ignored suite passes on Windows CI, with a comment pointing at the
upstream root cause.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>