feat(prover): add OpenShell Policy Prover (OPP)#703
Closed
Conversation
Integrate Z3-based formal verification for sandbox policies as openshell.prover. Detects data exfiltration paths, write bypass attacks, L7 enforcement gaps, overpermissive methods, and privilege escalation via binary capability chains. Closes #699 Signed-off-by: Alexander Watson <zredlined@gmail.com>
IntoDiagnostic trait is not imported in main.rs scope. Signed-off-by: Alexander Watson <zredlined@gmail.com>
Wrap the sandbox import in try/except so subpackages like openshell.prover can be imported without a full Rust build. Removes the conftest.py namespace hack that was working around this. Signed-off-by: Alexander Watson <zredlined@gmail.com>
Bare python3 doesn't have z3-solver/rich/pyyaml. Using uv run ensures the prover runs in the project venv with all dependencies. Signed-off-by: Alexander Watson <zredlined@gmail.com>
… files Signed-off-by: Alexander Watson <zredlined@gmail.com>
Signed-off-by: Alexander Watson <zredlined@gmail.com>
Signed-off-by: Alexander Watson <zredlined@gmail.com>
Narrow OPP to two yes/no security questions: 1. Can data leave this sandbox? (exfiltration) 2. Can the agent write despite read-only intent? (write bypass) Removes 4 advisory/lint queries (overpermissive methods, L4 gaps, binary inheritance, inference relay), 3 output modes (Mermaid, topology, HTML), and fixes 3 bugs (include_workdir not used in exfil analysis, __all__ broken when protos missing, CLI PYTHONPATH). ~40% line reduction (2900 -> 1769 lines). Signed-off-by: Alexander Watson <zredlined@gmail.com>
Only report exfiltration paths where enforcement diverges from intent: - L4-only endpoints (no HTTP inspection) - Wire protocol bypasses (L7 exists but binary circumvents it) L7-enforced endpoints where POST is explicitly allowed are the policy working as designed, not findings. Remove binary inheritance from the Z3 model — transitive spawn chain access produced noise (git inheriting access to api.telegram.org via node). Direct policy grants only. Inheritance can return later as an opt-in extended analysis mode. NemoClaw default policy: 38 findings -> 9 actionable gaps. Signed-off-by: Alexander Watson <zredlined@gmail.com>
L4-only endpoints are not implicitly read-only. A policy with access: full and no protocol is an explicit grant of full access, not a violated intent. Write bypass now only fires when the policy expresses read-only intent via access: read-only or GET-only L7 rules. Signed-off-by: Alexander Watson <zredlined@gmail.com>
johntmyers
reviewed
Apr 2, 2026
Replace broad contextlib.suppress with explicit error inspection. Proto/gRPC import failures are silently logged. Other ImportErrors surface as warnings so real issues aren't swallowed. Signed-off-by: Alexander Watson <zredlined@gmail.com>
zredlined
added a commit
that referenced
this pull request
Apr 2, 2026
Add openshell-prover crate implementing formal policy verification using Z3 SMT solving. Answers two questions about any sandbox policy: "Can data leave?" and "Can the agent write despite read-only intent?" Native Rust — no Python subprocess, no PYTHONPATH, no uv dependency. Z3 bundled via z3-sys for self-contained builds. Replaces the Python prototype from #703. Closes #699 Signed-off-by: Alexander Watson <zredlined@gmail.com>
Collaborator
Author
|
Superseded by native Rust implementation — see new PR. |
7 tasks
zredlined
added a commit
that referenced
this pull request
Apr 4, 2026
Add openshell-prover crate implementing formal policy verification using Z3 SMT solving. Answers two questions about any sandbox policy: "Can data leave?" and "Can the agent write despite read-only intent?" Native Rust — no Python subprocess, no PYTHONPATH, no uv dependency. Z3 bundled via z3-sys for self-contained builds. Replaces the Python prototype from #703. Closes #699 Signed-off-by: Alexander Watson <zredlined@gmail.com>
zredlined
added a commit
that referenced
this pull request
Apr 9, 2026
Add openshell-prover crate implementing formal policy verification using Z3 SMT solving. Answers two questions about any sandbox policy: "Can data leave?" and "Can the agent write despite read-only intent?" Native Rust — no Python subprocess, no PYTHONPATH, no uv dependency. Z3 bundled via z3-sys for self-contained builds. Replaces the Python prototype from #703. Closes #699 Signed-off-by: Alexander Watson <zredlined@gmail.com>
zredlined
added a commit
that referenced
this pull request
Apr 9, 2026
* feat(prover): add native Rust policy prover with Z3 solver Add openshell-prover crate implementing formal policy verification using Z3 SMT solving. Answers two questions about any sandbox policy: "Can data leave?" and "Can the agent write despite read-only intent?" Native Rust — no Python subprocess, no PYTHONPATH, no uv dependency. Z3 bundled via z3-sys for self-contained builds. Replaces the Python prototype from #703. Closes #699 Signed-off-by: Alexander Watson <zredlined@gmail.com> * fix(prover): skip L7-write in exfil, write bypass only on read-only intent Port two fixes from the Python branch: - Exfil query skips endpoints where L7 is enforced and working - Write bypass only fires on explicit read-only intent, not L4-only Signed-off-by: Alexander Watson <zredlined@gmail.com> * chore(prover): add missing SPDX license headers to registry and testdata YAML files * fix(prover): revert serde_yaml to serde_yml to match workspace dependency on main * fix(prover): apply cargo fmt formatting to prover and cli source files * ci: add clang and libclang-dev to CI image for z3-sys bindgen z3-sys requires libclang at build time for bindgen to generate FFI bindings. Without it, the Rust CI jobs fail on the prover crate. * fix(prover): use system libz3 instead of compiling from source Add libz3-dev to the CI image and drop the z3 `bundled` feature from the workspace dependency. This eliminates the ~30 min z3 C++ build that ran on every CI cache miss. sccache only wraps rustc — it does not intercept the cmake C++ build that z3-sys runs when `bundled` is enabled. The cargo target cache helped on warm runs but evicts on toolchain bumps and fork PRs. With the system library pre-installed in the CI image, z3 link time is always zero. A `bundled-z3` opt-in feature is added to the prover crate for local development without system z3: cargo build -p openshell-prover --features bundled-z3 Regular local dev: brew install z3 (macOS) or apt install libz3-dev (Linux), then cargo build just works. Signed-off-by: Alexander Watson <zredlined@gmail.com> Made-with: Cursor * fix(prover): use u16 for ports, align include_workdir default with runtime Port fields changed from u32 to u16 across all prover types (policy, model, finding, queries). Prevents the prover from silently accepting port values >65535 that the runtime rejects, which would produce misleading PASS results on invalid policies. Change include_workdir serde default from true to false to match the runtime (openshell-policy uses #[serde(default)] which gives false). The previous mismatch caused the prover to model a /sandbox path that does not exist at runtime, producing false analysis. Signed-off-by: Alexander Watson <zredlined@gmail.com> Made-with: Cursor * refactor(prover): embed registry at compile time, replace hand-rolled glob Embed binary and API capability registry YAML files into the binary at compile time using include_dir!. The previous approach used env!("CARGO_MANIFEST_DIR") which bakes in the build machine's source path — works in tests but breaks for installed binaries. The CWD fallback was equally fragile. The --registry CLI flag still works as a filesystem override for custom registries. Credentials remain filesystem-loaded (user-supplied data). Replace the hand-rolled glob matching (~60 lines) with the glob crate's Pattern::matches(), which is already a transitive dependency. Remove unused dependencies: openshell-policy, openshell-core, thiserror, tracing. The prover parses policy YAML directly into its own types and does not use the policy or core crate APIs. Signed-off-by: Alexander Watson <zredlined@gmail.com> Made-with: Cursor * docs: add Z3 system library to prerequisites The prover crate now links against system libz3 instead of compiling from source. Document the install steps for macOS, Ubuntu, and Fedora, and note the bundled-z3 feature flag as a fallback. Signed-off-by: Alexander Watson <zredlined@gmail.com> Made-with: Cursor * fix(prover): apply cargo fmt formatting Made-with: Cursor --------- Signed-off-by: Alexander Watson <zredlined@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Add the OpenShell Policy Prover (OPP) — formal verification for sandbox policies using Z3 SMT solving. OPP answers two yes/no security questions about any policy:
Any finding = FAIL. No findings = PASS. Teams can acknowledge known paths via accepted-risks YAML.
Related Issue
Closes #699
Changes
Python prover (
python/openshell/prover/):Registry data (
python/openshell/prover/registry/):Rust CLI (
crates/openshell-cli/src/main.rs):openshell policy provesubcommand — delegates to Python prover viauv runAgent skill (
.agents/skills/harden-sandbox-policy/):generate-sandbox-policyin the policy iteration chainArchitecture doc (
architecture/policy-prover.md):SDK fix (
python/openshell/__init__.py):Testing
uv run ruff check python/openshell/prover/— lint cleanuv run pytest python/openshell/prover/prover_test.py -v— 7/7 tests passcargo build -p openshell-cli— compilesopenshell policy prove --compact— detects exfil + write bypass on test fixtureChecklist