Skip to content

feat(prover): add OpenShell Policy Prover (OPP)#703

Closed
zredlined wants to merge 11 commits intomainfrom
feat/699-policy-prover/zredlined
Closed

feat(prover): add OpenShell Policy Prover (OPP)#703
zredlined wants to merge 11 commits intomainfrom
feat/699-policy-prover/zredlined

Conversation

@zredlined
Copy link
Copy Markdown
Collaborator

@zredlined zredlined commented Mar 31, 2026

Summary

Add the OpenShell Policy Prover (OPP) — formal verification for sandbox policies using Z3 SMT solving. OPP answers two yes/no security questions about any policy:

  1. Can data leave this sandbox? — maps exfiltration paths from readable filesystem to writable egress channels (L4 bypasses, wire protocol tunnels, L7 write channels)
  2. Can the agent write despite read-only intent? — detects write operations that bypass read-only policy through binary capabilities, credential scopes, and transport protocol mismatches

Any finding = FAIL. No findings = PASS. Teams can acknowledge known paths via accepted-risks YAML.

Related Issue

Closes #699

Changes

Python prover (python/openshell/prover/):

  • 2 verification queries (exfiltration + write bypass) encoded as Z3 SMT constraints
  • Policy parser, binary capability registry, credential loader
  • Terminal and compact (CI) output modes
  • Accepted risks filtering
  • ~1,770 lines total

Registry data (python/openshell/prover/registry/):

  • 17 binary capability descriptors (git, curl, python, claude, etc.)
  • 1 API capability descriptor (GitHub)
  • 1 credential descriptor (GitHub PAT)

Rust CLI (crates/openshell-cli/src/main.rs):

  • openshell policy prove subcommand — delegates to Python prover via uv run

Agent skill (.agents/skills/harden-sandbox-policy/):

  • Skill definition for iterative policy hardening workflow
  • Complements generate-sandbox-policy in the policy iteration chain

Architecture doc (architecture/policy-prover.md):

  • Problem statement, approach, usage, and architecture overview

SDK fix (python/openshell/__init__.py):

  • Graceful handling of missing proto stubs so subpackages can import independently

Testing

  • uv run ruff check python/openshell/prover/ — lint clean
  • uv run pytest python/openshell/prover/prover_test.py -v — 7/7 tests pass
  • cargo build -p openshell-cli — compiles
  • openshell policy prove --compact — detects exfil + write bypass on test fixture
  • License headers — all files have SPDX headers

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated

Integrate Z3-based formal verification for sandbox policies as
openshell.prover. Detects data exfiltration paths, write bypass
attacks, L7 enforcement gaps, overpermissive methods, and privilege
escalation via binary capability chains.

Closes #699

Signed-off-by: Alexander Watson <zredlined@gmail.com>
IntoDiagnostic trait is not imported in main.rs scope.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Wrap the sandbox import in try/except so subpackages like
openshell.prover can be imported without a full Rust build.
Removes the conftest.py namespace hack that was working around this.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Bare python3 doesn't have z3-solver/rich/pyyaml. Using uv run
ensures the prover runs in the project venv with all dependencies.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
@zredlined zredlined self-assigned this Mar 31, 2026
@zredlined zredlined changed the title feat(prover): add OpenShell Policy Prover (OPP) to Python SDK feat(prover): add OpenShell Policy Prover (OPP) Mar 31, 2026
… files

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Signed-off-by: Alexander Watson <zredlined@gmail.com>
Signed-off-by: Alexander Watson <zredlined@gmail.com>
Narrow OPP to two yes/no security questions:
1. Can data leave this sandbox? (exfiltration)
2. Can the agent write despite read-only intent? (write bypass)

Removes 4 advisory/lint queries (overpermissive methods, L4 gaps,
binary inheritance, inference relay), 3 output modes (Mermaid,
topology, HTML), and fixes 3 bugs (include_workdir not used in
exfil analysis, __all__ broken when protos missing, CLI PYTHONPATH).

~40% line reduction (2900 -> 1769 lines).

Signed-off-by: Alexander Watson <zredlined@gmail.com>
@zredlined zredlined marked this pull request as ready for review March 31, 2026 18:40
@zredlined zredlined requested a review from a team as a code owner March 31, 2026 18:40
Only report exfiltration paths where enforcement diverges from intent:
- L4-only endpoints (no HTTP inspection)
- Wire protocol bypasses (L7 exists but binary circumvents it)

L7-enforced endpoints where POST is explicitly allowed are the policy
working as designed, not findings.

Remove binary inheritance from the Z3 model — transitive spawn chain
access produced noise (git inheriting access to api.telegram.org via
node). Direct policy grants only. Inheritance can return later as an
opt-in extended analysis mode.

NemoClaw default policy: 38 findings -> 9 actionable gaps.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
L4-only endpoints are not implicitly read-only. A policy with
access: full and no protocol is an explicit grant of full access,
not a violated intent. Write bypass now only fires when the policy
expresses read-only intent via access: read-only or GET-only L7 rules.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Comment thread python/openshell/__init__.py Outdated
Replace broad contextlib.suppress with explicit error inspection.
Proto/gRPC import failures are silently logged. Other ImportErrors
surface as warnings so real issues aren't swallowed.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
zredlined added a commit that referenced this pull request Apr 2, 2026
Add openshell-prover crate implementing formal policy verification
using Z3 SMT solving. Answers two questions about any sandbox policy:
"Can data leave?" and "Can the agent write despite read-only intent?"

Native Rust — no Python subprocess, no PYTHONPATH, no uv dependency.
Z3 bundled via z3-sys for self-contained builds.

Replaces the Python prototype from #703.

Closes #699

Signed-off-by: Alexander Watson <zredlined@gmail.com>
@zredlined
Copy link
Copy Markdown
Collaborator Author

Superseded by native Rust implementation — see new PR.

@zredlined zredlined closed this Apr 2, 2026
zredlined added a commit that referenced this pull request Apr 4, 2026
Add openshell-prover crate implementing formal policy verification
using Z3 SMT solving. Answers two questions about any sandbox policy:
"Can data leave?" and "Can the agent write despite read-only intent?"

Native Rust — no Python subprocess, no PYTHONPATH, no uv dependency.
Z3 bundled via z3-sys for self-contained builds.

Replaces the Python prototype from #703.

Closes #699

Signed-off-by: Alexander Watson <zredlined@gmail.com>
zredlined added a commit that referenced this pull request Apr 9, 2026
Add openshell-prover crate implementing formal policy verification
using Z3 SMT solving. Answers two questions about any sandbox policy:
"Can data leave?" and "Can the agent write despite read-only intent?"

Native Rust — no Python subprocess, no PYTHONPATH, no uv dependency.
Z3 bundled via z3-sys for self-contained builds.

Replaces the Python prototype from #703.

Closes #699

Signed-off-by: Alexander Watson <zredlined@gmail.com>
zredlined added a commit that referenced this pull request Apr 9, 2026
* feat(prover): add native Rust policy prover with Z3 solver

Add openshell-prover crate implementing formal policy verification
using Z3 SMT solving. Answers two questions about any sandbox policy:
"Can data leave?" and "Can the agent write despite read-only intent?"

Native Rust — no Python subprocess, no PYTHONPATH, no uv dependency.
Z3 bundled via z3-sys for self-contained builds.

Replaces the Python prototype from #703.

Closes #699

Signed-off-by: Alexander Watson <zredlined@gmail.com>

* fix(prover): skip L7-write in exfil, write bypass only on read-only intent

Port two fixes from the Python branch:
- Exfil query skips endpoints where L7 is enforced and working
- Write bypass only fires on explicit read-only intent, not L4-only

Signed-off-by: Alexander Watson <zredlined@gmail.com>

* chore(prover): add missing SPDX license headers to registry and testdata YAML files

* fix(prover): revert serde_yaml to serde_yml to match workspace dependency on main

* fix(prover): apply cargo fmt formatting to prover and cli source files

* ci: add clang and libclang-dev to CI image for z3-sys bindgen

z3-sys requires libclang at build time for bindgen to generate FFI
bindings. Without it, the Rust CI jobs fail on the prover crate.

* fix(prover): use system libz3 instead of compiling from source

Add libz3-dev to the CI image and drop the z3 `bundled` feature from
the workspace dependency. This eliminates the ~30 min z3 C++ build
that ran on every CI cache miss.

sccache only wraps rustc — it does not intercept the cmake C++ build
that z3-sys runs when `bundled` is enabled. The cargo target cache
helped on warm runs but evicts on toolchain bumps and fork PRs.
With the system library pre-installed in the CI image, z3 link time
is always zero.

A `bundled-z3` opt-in feature is added to the prover crate for local
development without system z3:

  cargo build -p openshell-prover --features bundled-z3

Regular local dev: brew install z3 (macOS) or apt install libz3-dev
(Linux), then cargo build just works.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Made-with: Cursor

* fix(prover): use u16 for ports, align include_workdir default with runtime

Port fields changed from u32 to u16 across all prover types (policy,
model, finding, queries). Prevents the prover from silently accepting
port values >65535 that the runtime rejects, which would produce
misleading PASS results on invalid policies.

Change include_workdir serde default from true to false to match the
runtime (openshell-policy uses #[serde(default)] which gives false).
The previous mismatch caused the prover to model a /sandbox path that
does not exist at runtime, producing false analysis.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Made-with: Cursor

* refactor(prover): embed registry at compile time, replace hand-rolled glob

Embed binary and API capability registry YAML files into the binary at
compile time using include_dir!. The previous approach used
env!("CARGO_MANIFEST_DIR") which bakes in the build machine's source
path — works in tests but breaks for installed binaries. The CWD
fallback was equally fragile.

The --registry CLI flag still works as a filesystem override for custom
registries. Credentials remain filesystem-loaded (user-supplied data).

Replace the hand-rolled glob matching (~60 lines) with the glob crate's
Pattern::matches(), which is already a transitive dependency.

Remove unused dependencies: openshell-policy, openshell-core, thiserror,
tracing. The prover parses policy YAML directly into its own types and
does not use the policy or core crate APIs.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Made-with: Cursor

* docs: add Z3 system library to prerequisites

The prover crate now links against system libz3 instead of compiling
from source. Document the install steps for macOS, Ubuntu, and Fedora,
and note the bundled-z3 feature flag as a fallback.

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Made-with: Cursor

* fix(prover): apply cargo fmt formatting

Made-with: Cursor

---------

Signed-off-by: Alexander Watson <zredlined@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Add formal policy verification (OpenShell Policy Prover) to Python SDK

2 participants