Skip to content

Add isolated exec mode#667

Open
whoiskatrin wants to merge 4 commits intomainfrom
fix/isolate-exec-shell-state
Open

Add isolated exec mode#667
whoiskatrin wants to merge 4 commits intomainfrom
fix/isolate-exec-shell-state

Conversation

@whoiskatrin
Copy link
Copy Markdown
Collaborator

@whoiskatrin whoiskatrin commented May 1, 2026

Summary

  • Add an opt-in isolated exec() mode via preserveShellState: false for one-shot scripts.
  • Run isolated foreground commands in a subshell so top-level exit, exec, and set -e cannot terminate the persistent session shell.
  • Keep the existing default behavior unchanged: sandbox.exec() and session.exec() continue to preserve shell state unless callers opt out.

API Shape

Default behavior remains persistent:

await sandbox.exec('cd repo');
await sandbox.exec('pwd'); // repo

Use isolated mode for generated scripts, commit/push flows, or other one-shot command blocks that should not affect the session shell:

await sandbox.exec(commitAndPushScript, {
  preserveShellState: false
});

Explicit sessions keep the same persistent semantics:

const session = await sandbox.createSession();
await session.exec('cd repo');
await session.exec('pwd'); // repo

preserveShellState: true is rejected with streaming exec because streaming commands use the background execution path and do not preserve foreground shell state.

Why

Foreground exec() runs commands in the persistent bash session so state changes like cd and export survive across calls. That also means top-level shell-control constructs can terminate the shell itself. For example, exit 0, exec git ..., or set -e followed by a non-zero command can surface as SessionTerminatedError even though the container remains alive.

This PR adds a safe one-shot mode without changing existing persistence defaults.

Testing

  • bun test tests/services/session-manager-terminated-session.test.ts
  • npm test -w @cloudflare/sandbox -- --run tests/sandbox.test.ts -t \"default session management\"
  • npm run check

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 1, 2026

🦋 Changeset detected

Latest commit: 04a5204

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@cloudflare/sandbox Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented May 1, 2026

Open in StackBlitz

npm i https://pkg.pr.new/cloudflare/sandbox-sdk/@cloudflare/sandbox@667

commit: 04a5204

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 1, 2026

🐳 Docker Images Published

Variant Image
Default cloudflare/sandbox:0.0.0-pr-667-04a5204
Python cloudflare/sandbox:0.0.0-pr-667-04a5204-python
OpenCode cloudflare/sandbox:0.0.0-pr-667-04a5204-opencode
Musl cloudflare/sandbox:0.0.0-pr-667-04a5204-musl
Desktop cloudflare/sandbox:0.0.0-pr-667-04a5204-desktop

Usage:

FROM cloudflare/sandbox:0.0.0-pr-667-04a5204

Version: 0.0.0-pr-667-04a5204


📦 Standalone Binary

For arbitrary Dockerfiles:

COPY --from=cloudflare/sandbox:0.0.0-pr-667-04a5204 /container-server/sandbox /sandbox
ENTRYPOINT ["/sandbox"]

Download via GitHub CLI:

gh run download 25232714233 -n sandbox-binary

Extract from Docker:

docker run --rm cloudflare/sandbox:0.0.0-pr-667-04a5204 cat /container-server/sandbox > sandbox && chmod +x sandbox

@whoiskatrin
Copy link
Copy Markdown
Collaborator Author

/bonk review this PR

Copy link
Copy Markdown
Contributor

@ask-bonk ask-bonk Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Add isolated exec mode

Overall

Solid feature with a clean API design. The subshell approach for preserveShellState: false is the right primitive — it correctly isolates exit, exec, and set -e from the persistent session shell while reusing all the existing foreground output-capture machinery. The default-true behavior preserves backward compatibility, tests cover the key scenarios, and the plumbing through all three layers is consistent.

A few things worth addressing before merge:


1. Unrelated commandTimeoutMs changes

packages/shared/src/rpc-types.ts (line 218) and packages/sandbox-container/src/control-plane/api.ts (line 963) both add commandTimeoutMs?: number to the createSession RPC types. These aren't related to isolated exec and look like they were picked up during the merge from main. If they're intentional, they should be in a separate PR; if accidental, they should be dropped.

2. Isolated-mode env variables leak into the session shell

In buildFIFOScript, the env-restore logic is controlled by restore: !isBackground && preserveShellState:

const { setup: envSetupBlock, cleanup: envCleanupBlock } =
  this.buildScopedEnvBlocks(env, cmdId, {
    restore: !isBackground && preserveShellState
  });

When preserveShellState is false, restore is false, so buildScopedEnvBlocks emits bare export KEY='value' lines with no cleanup block. But buildCommandBlock (which includes the env setup) runs inside the subshell ( ... ), so the exports die with the subshell anyway — no session leak. This is correct but non-obvious. A brief inline comment explaining that the subshell boundary makes cleanup unnecessary would help future readers.

3. Duplicated subshell blocks — DRY opportunity

The isolated foreground cwd vs no-cwd branches (session.ts:1315-1331) are nearly identical; the only difference is two extra lines (cd + error handling). Consider extracting to avoid the duplication:

// Possible refactor (sketch)
script += `  (\n`;
if (cwd) {
  const safeCwd = this.escapeShellPath(cwd);
  script += `    cd ${safeCwd} || { printf '%s\\n' "Failed to change directory to ${safeCwd}" >&2; exit 1; }\n`;
}
script += `    {\n`;
script += `${buildCommandBlock('EXIT_CODE', 6)}\n`;
script += `    }\n`;
script += `    exit "$EXIT_CODE"\n`;
script += `  ) < /dev/null > "$log.stdout" 2> "$log.stderr"\n`;

This is a minor style nit — the current duplication is small and readable — but it reduces the surface for copy-paste drift.

4. Inconsistent cwd error reporting between modes

In the persistent foreground pattern, a failed cd writes a prefixed error line directly to $log:

printf '\x02\x02\x02%s\n' "Failed to change directory to ..." >> "$log"

In the isolated pattern, the error goes to stderr via the subshell's redirected >&2:

cd ... || { printf '%s\n' "Failed to change directory to ..." >&2; exit 1; }

The isolated version works correctly (stderr is captured into $log.stderr, then prefixed and merged), but the error message format differs: no \x02 prefix in the printf itself (the labeler adds it afterward), and the text differs slightly. This is fine functionally, but worth noting for consistency if you touch this area again.

5. Streaming validation is duplicated

preserveShellState: true with streaming is rejected in three places:

  • sandbox.ts:2470 (SDK-level, throws generic Error)
  • execute-handler.ts:119 (HTTP route handler, returns VALIDATION_FAILED)
  • The PR description mentions it

The SDK check uses a generic Error, not a SandboxValidationError from packages/shared/src/errors/. Consider using the shared error class for consistency with the rest of the codebase. Also, the container-side handler check at execute-handler.ts:119 rejects preserveShellState: true but the process service still receives and passes through preserveShellState — this is fine since background exec always sets preserveShellState: true in buildFIFOScript, but it means the validation is only enforced at the handler boundary, not in the service layer.

6. execInternal now uses preserveShellState: false

This is a good change — internal infrastructure commands (mkdir, chmod, mount, etc.) shouldn't risk killing the session shell if something goes wrong. Worth verifying that none of the execInternal callers depend on state persistence (e.g., cd followed by another execInternal). From my scan of the 8 call sites, they all look self-contained, so this should be safe.

7. Test coverage

The unit tests cover the three key isolation scenarios (exit, exec, errexit) plus the streaming rejection. The E2E test verifies the state-isolation contract end-to-end. Good coverage. One gap: there's no test for isolated exec with scoped env variables — worth adding to confirm env vars work correctly in the subshell and don't leak to the session.


Summary

The core implementation is correct and well-tested. The main actionable items are:

  1. Drop or separate the unrelated commandTimeoutMs type additions
  2. Consider a test for isolated exec + scoped env variables

The rest are minor style/consistency suggestions.

Comment thread packages/shared/src/rpc-types.ts Outdated
id: string;
env?: Record<string, string | undefined>;
cwd?: string;
commandTimeoutMs?: number;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to isolated exec — this adds commandTimeoutMs to the createSession RPC type. Was this picked up during the merge from main? If intentional, it should be in its own PR.

const { setup: envSetupBlock, cleanup: envCleanupBlock } =
this.buildScopedEnvBlocks(env, cmdId, { restore: !isBackground });
this.buildScopedEnvBlocks(env, cmdId, {
restore: !isBackground && preserveShellState
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The restore: false path emits bare export lines without cleanup, which looks like it would leak env vars into the session. It does not, because the isolated pattern runs inside ( ... ) and exports die with the subshell. A short comment here would help future readers understand why cleanup is unnecessary in isolated mode.

script += ` }\n`;
script += ` exit "$EXIT_CODE"\n`;
script += ` ) < /dev/null > "$log.stdout" 2> "$log.stderr"\n`;
} else {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The cwd/no-cwd branches are nearly identical — only the cd + error handling differs. Could be collapsed:

script += `  (\n`;
if (cwd) {
  const safeCwd = this.escapeShellPath(cwd);
  script += `    cd ${safeCwd} || { printf '%s\\n' "Failed to change directory to ${safeCwd}" >&2; exit 1; }\n`;
}
script += `    {\n`;
script += `${buildCommandBlock("EXIT_CODE", 6)}\n`;
script += `    }\n`;
script += `    exit "$EXIT_CODE"\n`;
script += `  ) < /dev/null > "$log.stdout" 2> "$log.stderr"\n`;

Minor — current version is readable enough.

Comment thread packages/sandbox/src/sandbox.ts Outdated

try {
if (options.preserveShellState === true) {
throw new Error(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uses a generic Error instead of the shared error classes from packages/shared/src/errors/. Consider using SandboxValidationError (or equivalent) for consistency with the container-side validation at execute-handler.ts:119 which returns ErrorCode.VALIDATION_FAILED.

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk Bot commented May 1, 2026

Review posted successfully. Here's a summary of what I reviewed:

PR #667: Add isolated exec mode — reviewed across all 19 changed files spanning the three-layer architecture (SDK, shared types, container runtime).

Key feedback posted:

  1. Unrelated commandTimeoutMs changes — two files (rpc-types.ts, api.ts) add commandTimeoutMs to createSession types, which appears to be merge noise rather than part of this feature.

  2. Env variable cleanup in isolated mode — the restore: false path looks like it could leak env vars, but it's actually correct because the subshell boundary kills them. Suggested adding a clarifying comment.

  3. DRY opportunity in subshell generation — the cwd/no-cwd branches in the isolated foreground pattern are nearly identical and could be collapsed.

  4. Generic Error vs shared error classes — the streaming validation in sandbox.ts throws a plain Error while the container-side handler uses ErrorCode.VALIDATION_FAILED. Suggested using the shared error class.

  5. Test coverage gap — no test for isolated exec with scoped env variables.

Overall assessment: solid implementation with correct subshell isolation, clean API design, good backward compatibility, and strong test coverage for the core scenarios.

github run

@whoiskatrin
Copy link
Copy Markdown
Collaborator Author

/bonk re-review please

Copy link
Copy Markdown
Contributor

@ask-bonk ask-bonk Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review: Add isolated exec mode

All actionable items from the first review have been addressed:

  1. commandTimeoutMs noise — This was a false positive in the original review. The diff only adds preserveShellState to rpc-types.ts and api.ts. No unrelated changes.

  2. Env variable cleanup comment — Added at session.ts:1141-1142. Clear and accurate.

  3. Streaming validation error class — Now uses ValidationFailedError with ErrorCode.VALIDATION_FAILED at sandbox.ts:2472, matching the container-side handler at execute-handler.ts:119.

  4. Test for isolated exec + scoped env vars — Added at session-manager-terminated-session.test.ts:187. Verifies env vars work inside the subshell and don't leak to the session.

Correctness analysis of the isolated foreground pattern

Traced through the generated bash for each scenario:

  • exit N: The exit terminates the subshell; EXIT_CODE=$? on the parent captures N. Correct.
  • exec <cmd>: exec replaces the subshell process; the parent sees the replacement's exit code via $?. Correct.
  • set -e; false: set -e applies to the subshell only; false exits the subshell with code 1; EXIT_CODE=$? captures it. The inner EXIT_CODE=$? assignment is never reached, but the outer one is. Correct.
  • Scoped env vars: restore: false emits bare export lines without cleanup. Since the exports run inside ( ... ), they die with the subshell. The comment at line 1141 explains this. Correct.
  • cwd handling: cd inside the subshell is scoped to it. The parent shell's working directory is unaffected. No PREV_DIR save/restore needed. Correct.

The appendForegroundCompletion helper eliminates the prefix/merge/exit-code duplication between persistent and isolated foreground patterns. Clean refactor.

execInternal switch to isolated mode

All 8 execInternal call sites (mkdir -p, chmod, rm -f, mount commands) are self-contained one-shot commands. None depend on cd/export persisting across calls. The switch to preserveShellState: false is safe and is a good defensive change — infrastructure commands that fail unexpectedly won't terminate the session shell.

Minor nit (non-blocking)

The buildFIFOScript JSDoc (session.ts:1061-1103) still describes only two modes (background and foreground). The @param isBackground doc says "If false, command runs in foreground (for exec) - state persists!" which is now only conditionally true. The new preserveShellState parameter and isolated foreground mode aren't mentioned. Consider updating the doc and ASCII diagram to reflect the three-mode behavior on a follow-up.

Summary

Implementation is correct, tests cover the key scenarios (exit, exec, errexit, scoped env, E2E state isolation), the API design preserves backward compatibility, and all prior review feedback was addressed. Ship it.

Copy link
Copy Markdown
Contributor

@ask-bonk ask-bonk Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (non-blocking): The buildFIFOScript JSDoc still describes only two modes (background and foreground) and the @param isBackground comment says "state persists!" unconditionally. Consider updating the doc and ASCII diagram to mention the isolated foreground mode and the preserveShellState parameter in a follow-up.

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk Bot commented May 1, 2026

Re-review posted with approval on PR #667. Here's the summary:

All prior review feedback was addressed:

  • commandTimeoutMs was a false positive (not in this diff)
  • Env cleanup comment added
  • Streaming validation now uses ValidationFailedError
  • Isolated exec + scoped env vars test added

Correctness verified — traced through the generated bash for exit, exec, set -e, scoped env vars, and cwd handling. The subshell pattern is sound. The execInternal switch to isolated mode is safe (all 8 call sites are self-contained).

One non-blocking nit — the buildFIFOScript JSDoc still describes only two modes and should be updated to reflect the three-mode behavior in a follow-up.

Approved to ship.

github run

@whoiskatrin whoiskatrin marked this pull request as ready for review May 1, 2026 21:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant