diff --git a/architecture/sandbox-providers.md b/architecture/sandbox-providers.md index f6fdf38b3..00120630e 100644 --- a/architecture/sandbox-providers.md +++ b/architecture/sandbox-providers.md @@ -11,10 +11,12 @@ providers centralize that concern: a user configures a provider once, and any sa needs that external service can reference it. At sandbox creation time, providers are validated and associated with the sandbox. The -sandbox supervisor then fetches credentials at runtime and injects them as environment -variables into every child process it spawns. Access is enforced through the sandbox -policy — the policy decides which outbound requests are allowed or denied based on -the providers attached to that sandbox. +sandbox supervisor then fetches credentials at runtime, keeps the real secret values in +supervisor-only memory, and injects placeholder environment variables into every child +process it spawns. When outbound traffic is allowed through the sandbox proxy, the +supervisor rewrites those placeholders back to the real secret values before forwarding. +Access is enforced through the sandbox policy — the policy decides which outbound +requests are allowed or denied based on the providers attached to that sandbox. Core goals: @@ -57,7 +59,8 @@ The gRPC surface is defined in `proto/navigator.proto`: - persistence using `object_type = "provider"`. - `crates/navigator-sandbox` - sandbox supervisor fetches provider credentials via gRPC at startup, - - injects credentials as environment variables into entrypoint and SSH child processes. + - injects placeholder env vars into entrypoint and SSH child processes, + - resolves placeholders back to real secrets in the outbound proxy path. ## Provider Plugins @@ -242,12 +245,18 @@ In `run_sandbox()` (`crates/navigator-sandbox/src/lib.rs`): 2. fetches provider credentials via gRPC (`GetSandboxProviderEnvironment`), 3. if the fetch fails, continues with an empty map (graceful degradation with a warning). -The returned `provider_env` `HashMap` is then threaded to both the -entrypoint process spawner and the SSH server. +The returned `provider_env` `HashMap` is immediately transformed into: + +- a child-visible env map with placeholder values such as + `openshell:resolve:env:ANTHROPIC_API_KEY`, and +- a supervisor-only in-memory registry mapping each placeholder back to its real secret. + +The placeholder env map is threaded to the entrypoint process spawner and SSH server. +The registry is threaded to the proxy so it can rewrite outbound headers. ### Child Process Environment Variable Injection -Provider credentials are injected into child processes in two places, covering all +Provider placeholders are injected into child processes in two places, covering all process spawning paths inside the sandbox: **1. Entrypoint process** (`crates/navigator-sandbox/src/process.rs`): @@ -257,14 +266,16 @@ let mut cmd = Command::new(program); cmd.args(args) .env("OPENSHELL_SANDBOX", "1"); -// Set provider environment variables (credentials fetched at runtime). +// Set provider environment variables (supervisor-managed placeholders). for (key, value) in provider_env { cmd.env(key, value); } ``` This uses `tokio::process::Command`. The `.env()` call adds each variable to the child's -inherited environment without clearing it. +inherited environment without clearing it. The spawn path also explicitly removes +`NEMOCLAW_SSH_HANDSHAKE_SECRET` so the handshake secret does not leak into the agent +entrypoint process. After provider env vars, proxy env vars (`HTTP_PROXY`, `HTTPS_PROXY`, `ALL_PROXY`, etc.) are also set when `NetworkMode` is `Proxy`. The child is then launched with namespace @@ -281,14 +292,30 @@ cmd.env("OPENSHELL_SANDBOX", "1") .env("USER", "sandbox") .env("TERM", term); -// Set provider environment variables (credentials fetched at runtime). +// Set provider environment variables (supervisor-managed placeholders). for (key, value) in provider_env { cmd.env(key, value); } ``` This uses `std::process::Command`. The `SshHandler` holds the `provider_env` map and -passes it to `spawn_pty_shell()` for each new shell or exec request. +passes it to `spawn_pty_shell()` for each new shell or exec request. SSH child processes +start from `env_clear()`, so the handshake secret is not present there. + +### Proxy-Time Secret Resolution + +When a sandboxed tool uses one of these placeholder env vars to populate an outbound HTTP +header (for example `Authorization: Bearer openshell:resolve:env:ANTHROPIC_API_KEY`), the +sandbox proxy rewrites the placeholder to the real secret value immediately before the +request is forwarded upstream. + +This applies to: + +- forward-proxy HTTP requests, and +- L7-inspected REST requests inside CONNECT tunnels. + +The real secret value remains in supervisor memory only; it is not re-injected into the +child process environment. ### End-to-End Flow @@ -309,13 +336,17 @@ CLI: openshell sandbox create -- claude K8s: pod starts navigator-sandbox binary +-- OPENSHELL_SANDBOX_ID and OPENSHELL_ENDPOINT set in pod env | - Sandbox supervisor: run_sandbox() - +-- Fetches policy via gRPC - +-- Fetches provider env via gRPC - | +-- Gateway resolves: "claude" -> credentials -> {ANTHROPIC_API_KEY: "sk-..."} - +-- Spawns entrypoint: cmd.env("ANTHROPIC_API_KEY", "sk-...") - +-- SSH server holds provider_env - +-- Each SSH shell: cmd.env("ANTHROPIC_API_KEY", "sk-...") + Sandbox supervisor: run_sandbox() + +-- Fetches policy via gRPC + +-- Fetches provider env via gRPC + | +-- Gateway resolves: "claude" -> credentials -> {ANTHROPIC_API_KEY: "sk-..."} + +-- Builds placeholder registry + | +-- child env: {ANTHROPIC_API_KEY: "openshell:resolve:env:ANTHROPIC_API_KEY"} + | +-- supervisor registry: {"openshell:resolve:env:ANTHROPIC_API_KEY": "sk-..."} + +-- Spawns entrypoint with placeholder env + +-- SSH server holds placeholder env + | +-- Each SSH shell: cmd.env("ANTHROPIC_API_KEY", "openshell:resolve:env:ANTHROPIC_API_KEY") + +-- Proxy rewrites outbound auth header placeholders -> real secrets ``` ## Persistence and Validation @@ -336,6 +367,10 @@ Providers are stored with `object_type = "provider"` in the shared object store. - CLI displays only non-sensitive summaries (counts/key names where relevant). - Credentials are never persisted in the sandbox spec — they exist only in the provider store and are fetched at runtime by the sandbox supervisor. +- Child processes never receive the raw provider secret values; they only receive + placeholders, and the supervisor resolves those placeholders during outbound proxying. +- `NEMOCLAW_SSH_HANDSHAKE_SECRET` is required by the supervisor/SSH server path but is + explicitly kept out of spawned sandbox child-process environments. ## Test Strategy @@ -344,3 +379,6 @@ Providers are stored with `object_type = "provider"` in the shared object store. - Mocked discovery context tests cover env and path-based behavior. - CLI and gateway integration tests validate end-to-end RPC compatibility. - `resolve_provider_environment` unit tests in `crates/navigator-server/src/grpc.rs`. +- sandbox unit tests validate placeholder generation and header rewriting. +- E2E sandbox tests verify placeholders are visible in child env, outbound proxy traffic + is rewritten with the real secret, and the SSH handshake secret is absent from exec env. diff --git a/crates/navigator-sandbox/src/l7/relay.rs b/crates/navigator-sandbox/src/l7/relay.rs index 46ca059c3..b8b923cd6 100644 --- a/crates/navigator-sandbox/src/l7/relay.rs +++ b/crates/navigator-sandbox/src/l7/relay.rs @@ -8,10 +8,10 @@ //! and either forwards or denies the request. use crate::l7::provider::L7Provider; -use crate::l7::rest::RestProvider; use crate::l7::{EnforcementMode, L7EndpointConfig, L7Protocol, L7RequestInfo}; +use crate::secrets::SecretResolver; use miette::{IntoDiagnostic, Result, miette}; -use std::sync::Mutex; +use std::sync::{Arc, Mutex}; use tokio::io::{AsyncRead, AsyncWrite}; use tracing::{debug, info, warn}; @@ -29,6 +29,8 @@ pub struct L7EvalContext { pub ancestors: Vec, /// Cmdline paths. pub cmdline_paths: Vec, + /// Supervisor-only placeholder resolver for outbound headers. + pub(crate) secret_resolver: Option>, } /// Run protocol-aware L7 inspection on a tunnel. @@ -78,11 +80,9 @@ where C: AsyncRead + AsyncWrite + Unpin + Send, U: AsyncRead + AsyncWrite + Unpin + Send, { - let provider = RestProvider; - loop { // Parse one HTTP request from client - let req = match provider.parse_request(client).await { + let req = match crate::l7::rest::RestProvider.parse_request(client).await { Ok(Some(req)) => req, Ok(None) => return Ok(()), // Client closed connection Err(e) => { @@ -134,7 +134,13 @@ where if allowed || config.enforcement == EnforcementMode::Audit { // Forward request to upstream and relay response - let reusable = provider.relay(&req, client, upstream).await?; + let reusable = crate::l7::rest::relay_http_request_with_resolver( + &req, + client, + upstream, + ctx.secret_resolver.as_deref(), + ) + .await?; if !reusable { debug!( host = %ctx.host, @@ -145,7 +151,7 @@ where } } else { // Enforce mode: deny with 403 and close connection - provider + crate::l7::rest::RestProvider .deny(&req, &ctx.policy_name, &reason, client) .await?; return Ok(()); diff --git a/crates/navigator-sandbox/src/l7/rest.rs b/crates/navigator-sandbox/src/l7/rest.rs index b609c708f..f61d1242c 100644 --- a/crates/navigator-sandbox/src/l7/rest.rs +++ b/crates/navigator-sandbox/src/l7/rest.rs @@ -8,6 +8,7 @@ //! and chunked transfer encoding for body framing. use crate::l7::provider::{BodyLength, L7Provider, L7Request}; +use crate::secrets::rewrite_http_header_block; use miette::{IntoDiagnostic, Result, miette}; use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt}; use tracing::debug; @@ -121,27 +122,38 @@ where C: AsyncRead + AsyncWrite + Unpin, U: AsyncRead + AsyncWrite + Unpin, { - // Find the actual header end in raw_header + relay_http_request_with_resolver(req, client, upstream, None).await +} + +pub(crate) async fn relay_http_request_with_resolver( + req: &L7Request, + client: &mut C, + upstream: &mut U, + resolver: Option<&crate::secrets::SecretResolver>, +) -> Result +where + C: AsyncRead + AsyncWrite + Unpin, + U: AsyncRead + AsyncWrite + Unpin, +{ let header_end = req .raw_header .windows(4) .position(|w| w == b"\r\n\r\n") .map_or(req.raw_header.len(), |p| p + 4); - // Forward request headers to upstream + let rewritten_header = rewrite_http_header_block(&req.raw_header[..header_end], resolver); + upstream - .write_all(&req.raw_header[..header_end]) + .write_all(&rewritten_header) .await .into_diagnostic()?; - // Forward any overflow body bytes that were read with headers let overflow = &req.raw_header[header_end..]; if !overflow.is_empty() { upstream.write_all(overflow).await.into_diagnostic()?; } let overflow_len = overflow.len() as u64; - // Forward remaining request body match req.body_length { BodyLength::ContentLength(len) => { let remaining = len.saturating_sub(overflow_len); @@ -155,8 +167,6 @@ where BodyLength::None => {} } upstream.flush().await.into_diagnostic()?; - - // Read and forward response from upstream back to client relay_response(&req.action, upstream, client).await } @@ -580,6 +590,7 @@ fn is_benign_close(err: &std::io::Error) -> bool { #[cfg(test)] mod tests { use super::*; + use crate::secrets::SecretResolver; #[test] fn parse_content_length() { @@ -930,4 +941,193 @@ mod tests { client_read.read_to_end(&mut received).await.unwrap(); assert!(String::from_utf8_lossy(&received).contains("hello")); } + + #[test] + fn rewrite_header_block_resolves_placeholder_auth_headers() { + let (_, resolver) = SecretResolver::from_provider_env( + [("ANTHROPIC_API_KEY".to_string(), "sk-test".to_string())] + .into_iter() + .collect(), + ); + let raw = b"GET /v1/messages HTTP/1.1\r\nAuthorization: Bearer openshell:resolve:env:ANTHROPIC_API_KEY\r\nHost: example.com\r\n\r\n"; + + let rewritten = rewrite_http_header_block(raw, resolver.as_ref()); + let rewritten = String::from_utf8(rewritten).expect("utf8"); + + assert!(rewritten.contains("Authorization: Bearer sk-test\r\n")); + assert!(!rewritten.contains("openshell:resolve:env:ANTHROPIC_API_KEY")); + } + + /// Verifies that `relay_http_request_with_resolver` rewrites credential + /// placeholders in request headers before forwarding to upstream. + /// + /// This is the code path exercised when an endpoint has `protocol: rest` + /// and `tls: terminate` — the proxy terminates TLS, sees plaintext HTTP, + /// and replaces placeholder tokens with real secrets. + /// + /// Without this test, a misconfigured endpoint (missing `tls: terminate`) + /// silently leaks placeholder strings like `openshell:resolve:env:NVIDIA_API_KEY` + /// to the upstream API, causing 401 Unauthorized errors. + #[tokio::test] + async fn relay_request_with_resolver_rewrites_credential_placeholders() { + let provider_env: std::collections::HashMap = [( + "NVIDIA_API_KEY".to_string(), + "nvapi-real-secret-key".to_string(), + )] + .into_iter() + .collect(); + + let (child_env, resolver) = SecretResolver::from_provider_env(provider_env); + let placeholder = child_env.get("NVIDIA_API_KEY").unwrap(); + + let (mut proxy_to_upstream, mut upstream_side) = tokio::io::duplex(8192); + let (mut _app_side, mut proxy_to_client) = tokio::io::duplex(8192); + + let req = L7Request { + action: "POST".to_string(), + target: "/v1/chat/completions".to_string(), + raw_header: format!( + "POST /v1/chat/completions HTTP/1.1\r\n\ + Host: integrate.api.nvidia.com\r\n\ + Authorization: Bearer {placeholder}\r\n\ + Content-Length: 2\r\n\r\n{{}}" + ) + .into_bytes(), + body_length: BodyLength::ContentLength(2), + }; + + // Mock upstream: read the forwarded request, capture it, send response + let upstream_task = tokio::spawn(async move { + let mut buf = vec![0u8; 4096]; + let mut total = 0; + loop { + let n = upstream_side.read(&mut buf[total..]).await.unwrap(); + if n == 0 { + break; + } + total += n; + if let Some(hdr_end) = buf[..total].windows(4).position(|w| w == b"\r\n\r\n") { + if total >= hdr_end + 4 + 2 { + break; + } + } + } + upstream_side + .write_all(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok") + .await + .unwrap(); + upstream_side.flush().await.unwrap(); + String::from_utf8_lossy(&buf[..total]).to_string() + }); + + // Run the relay with a resolver — simulates the TLS-terminate path + let relay = tokio::time::timeout( + std::time::Duration::from_secs(5), + relay_http_request_with_resolver( + &req, + &mut proxy_to_client, + &mut proxy_to_upstream, + resolver.as_ref(), + ), + ) + .await + .expect("relay must not deadlock"); + relay.expect("relay should succeed"); + + let forwarded = upstream_task.await.expect("upstream task should complete"); + + // The real secret must appear in what upstream received + assert!( + forwarded.contains("Authorization: Bearer nvapi-real-secret-key\r\n"), + "Expected real API key in upstream request, got: {forwarded}" + ); + // The placeholder must NOT appear + assert!( + !forwarded.contains("openshell:resolve:env:"), + "Placeholder leaked to upstream: {forwarded}" + ); + // Other headers must be preserved + assert!(forwarded.contains("Host: integrate.api.nvidia.com\r\n")); + } + + /// Verifies that without a `SecretResolver` (i.e. the L4-only raw tunnel + /// path, or no TLS termination), credential placeholders pass through + /// unmodified. This documents the behavior that causes 401 errors when + /// `tls: terminate` is missing from the endpoint config. + #[tokio::test] + async fn relay_request_without_resolver_leaks_placeholders() { + let (child_env, _resolver) = SecretResolver::from_provider_env( + [("NVIDIA_API_KEY".to_string(), "nvapi-secret".to_string())] + .into_iter() + .collect(), + ); + let placeholder = child_env.get("NVIDIA_API_KEY").unwrap(); + + let (mut proxy_to_upstream, mut upstream_side) = tokio::io::duplex(8192); + let (mut _app_side, mut proxy_to_client) = tokio::io::duplex(8192); + + let req = L7Request { + action: "POST".to_string(), + target: "/v1/chat/completions".to_string(), + raw_header: format!( + "POST /v1/chat/completions HTTP/1.1\r\n\ + Host: integrate.api.nvidia.com\r\n\ + Authorization: Bearer {placeholder}\r\n\ + Content-Length: 2\r\n\r\n{{}}" + ) + .into_bytes(), + body_length: BodyLength::ContentLength(2), + }; + + let upstream_task = tokio::spawn(async move { + let mut buf = vec![0u8; 4096]; + let mut total = 0; + loop { + let n = upstream_side.read(&mut buf[total..]).await.unwrap(); + if n == 0 { + break; + } + total += n; + if let Some(hdr_end) = buf[..total].windows(4).position(|w| w == b"\r\n\r\n") { + if total >= hdr_end + 4 + 2 { + break; + } + } + } + upstream_side + .write_all(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok") + .await + .unwrap(); + upstream_side.flush().await.unwrap(); + String::from_utf8_lossy(&buf[..total]).to_string() + }); + + // Pass `None` for the resolver — simulates the L4 path where no + // rewriting occurs. + let relay = tokio::time::timeout( + std::time::Duration::from_secs(5), + relay_http_request_with_resolver( + &req, + &mut proxy_to_client, + &mut proxy_to_upstream, + None, // <-- No resolver, as in the L4 raw tunnel path + ), + ) + .await + .expect("relay must not deadlock"); + relay.expect("relay should succeed"); + + let forwarded = upstream_task.await.expect("upstream task should complete"); + + // Without a resolver, the placeholder LEAKS to upstream — this is the + // documented behavior that causes 401s when `tls: terminate` is missing. + assert!( + forwarded.contains("openshell:resolve:env:NVIDIA_API_KEY"), + "Expected placeholder to leak without resolver, got: {forwarded}" + ); + assert!( + !forwarded.contains("nvapi-secret"), + "Real secret should NOT appear without resolver, got: {forwarded}" + ); + } } diff --git a/crates/navigator-sandbox/src/lib.rs b/crates/navigator-sandbox/src/lib.rs index ea55c86ad..39de09aa1 100644 --- a/crates/navigator-sandbox/src/lib.rs +++ b/crates/navigator-sandbox/src/lib.rs @@ -17,6 +17,7 @@ mod process; pub mod procfs; pub mod proxy; mod sandbox; +mod secrets; mod ssh; use miette::{IntoDiagnostic, Result}; @@ -40,6 +41,7 @@ use crate::policy::{NetworkMode, NetworkPolicy, ProxyPolicy, SandboxPolicy}; use crate::proxy::ProxyHandle; #[cfg(target_os = "linux")] use crate::sandbox::linux::netns::NetworkNamespace; +use crate::secrets::SecretResolver; pub use process::{ProcessHandle, ProcessStatus}; /// Default interval (seconds) for re-fetching the inference route bundle from @@ -198,6 +200,9 @@ pub async fn run_sandbox( std::collections::HashMap::new() }; + let (provider_env, secret_resolver) = SecretResolver::from_provider_env(provider_env); + let secret_resolver = secret_resolver.map(Arc::new); + // Create identity cache for SHA256 TOFU when OPA is active let identity_cache = opa_engine .as_ref() @@ -321,6 +326,7 @@ pub async fn run_sandbox( entrypoint_pid.clone(), tls_state, inference_ctx, + secret_resolver.clone(), denial_tx, ) .await?; diff --git a/crates/navigator-sandbox/src/process.rs b/crates/navigator-sandbox/src/process.rs index 2f841c2c7..ca4477a41 100644 --- a/crates/navigator-sandbox/src/process.rs +++ b/crates/navigator-sandbox/src/process.rs @@ -21,6 +21,18 @@ use std::process::Stdio; use tokio::process::{Child, Command}; use tracing::{debug, warn}; +const SSH_HANDSHAKE_SECRET_ENV: &str = "NEMOCLAW_SSH_HANDSHAKE_SECRET"; + +fn inject_provider_env(cmd: &mut Command, provider_env: &HashMap) { + for (key, value) in provider_env { + cmd.env(key, value); + } +} + +fn scrub_sensitive_env(cmd: &mut Command) { + cmd.env_remove(SSH_HANDSHAKE_SECRET_ENV); +} + /// Handle to a running process. pub struct ProcessHandle { child: Child, @@ -103,10 +115,8 @@ impl ProcessHandle { .kill_on_drop(true) .env("OPENSHELL_SANDBOX", "1"); - // Set provider environment variables (credentials fetched at runtime). - for (key, value) in provider_env { - cmd.env(key, value); - } + scrub_sensitive_env(&mut cmd); + inject_provider_env(&mut cmd, provider_env); if let Some(dir) = workdir { cmd.current_dir(dir); @@ -215,10 +225,8 @@ impl ProcessHandle { .kill_on_drop(true) .env("OPENSHELL_SANDBOX", "1"); - // Set provider environment variables (credentials fetched at runtime). - for (key, value) in provider_env { - cmd.env(key, value); - } + scrub_sensitive_env(&mut cmd); + inject_provider_env(&mut cmd, provider_env); if let Some(dir) = workdir { cmd.current_dir(dir); @@ -502,6 +510,7 @@ mod tests { use crate::policy::{ FilesystemPolicy, LandlockPolicy, NetworkPolicy, ProcessPolicy, SandboxPolicy, }; + use std::process::Stdio as StdStdio; /// Helper to create a minimal `SandboxPolicy` with the given process policy. fn policy_with_process(process: ProcessPolicy) -> SandboxPolicy { @@ -583,4 +592,39 @@ mod tests { "expected 'not found' in error: {msg}" ); } + + #[tokio::test] + async fn scrub_sensitive_env_removes_ssh_handshake_secret() { + let mut cmd = Command::new("/usr/bin/env"); + cmd.stdin(StdStdio::null()) + .stdout(StdStdio::piped()) + .stderr(StdStdio::null()) + .env(SSH_HANDSHAKE_SECRET_ENV, "super-secret"); + + scrub_sensitive_env(&mut cmd); + + let output = cmd.output().await.expect("spawn env"); + let stdout = String::from_utf8(output.stdout).expect("utf8"); + assert!(!stdout.contains(SSH_HANDSHAKE_SECRET_ENV)); + } + + #[tokio::test] + async fn inject_provider_env_sets_placeholder_values() { + let mut cmd = Command::new("/usr/bin/env"); + cmd.stdin(StdStdio::null()) + .stdout(StdStdio::piped()) + .stderr(StdStdio::null()); + + let provider_env = std::iter::once(( + "ANTHROPIC_API_KEY".to_string(), + "openshell:resolve:env:ANTHROPIC_API_KEY".to_string(), + )) + .collect(); + + inject_provider_env(&mut cmd, &provider_env); + + let output = cmd.output().await.expect("spawn env"); + let stdout = String::from_utf8(output.stdout).expect("utf8"); + assert!(stdout.contains("ANTHROPIC_API_KEY=openshell:resolve:env:ANTHROPIC_API_KEY")); + } } diff --git a/crates/navigator-sandbox/src/proxy.rs b/crates/navigator-sandbox/src/proxy.rs index 1480e5e0b..b644120d7 100644 --- a/crates/navigator-sandbox/src/proxy.rs +++ b/crates/navigator-sandbox/src/proxy.rs @@ -8,6 +8,7 @@ use crate::identity::BinaryIdentityCache; use crate::l7::tls::ProxyTlsState; use crate::opa::{NetworkAction, OpaEngine}; use crate::policy::ProxyPolicy; +use crate::secrets::{SecretResolver, rewrite_header_line}; use miette::{IntoDiagnostic, Result}; use std::net::{IpAddr, SocketAddr}; use std::path::PathBuf; @@ -130,6 +131,7 @@ impl ProxyHandle { entrypoint_pid: Arc, tls_state: Option>, inference_ctx: Option>, + secret_resolver: Option>, denial_tx: Option>, ) -> Result { // Use override bind_addr, fall back to policy http_addr, then default @@ -159,10 +161,13 @@ impl ProxyHandle { let spid = entrypoint_pid.clone(); let tls = tls_state.clone(); let inf = inference_ctx.clone(); + let resolver = secret_resolver.clone(); let dtx = denial_tx.clone(); tokio::spawn(async move { - if let Err(err) = - handle_tcp_connection(stream, opa, cache, spid, tls, inf, dtx).await + if let Err(err) = handle_tcp_connection( + stream, opa, cache, spid, tls, inf, resolver, dtx, + ) + .await { warn!(error = %err, "Proxy connection error"); } @@ -259,6 +264,7 @@ async fn handle_tcp_connection( entrypoint_pid: Arc, tls_state: Option>, inference_ctx: Option>, + secret_resolver: Option>, denial_tx: Option>, ) -> Result<()> { let mut buf = vec![0u8; MAX_HEADER_BYTES]; @@ -302,6 +308,7 @@ async fn handle_tcp_connection( opa_engine, identity_cache, entrypoint_pid, + secret_resolver, denial_tx.as_ref(), ) .await; @@ -518,6 +525,7 @@ async fn handle_tcp_connection( .iter() .map(|p| p.to_string_lossy().into_owned()) .collect(), + secret_resolver: secret_resolver.clone(), }; if l7_config.tls == crate::l7::TlsMode::Terminate { @@ -1402,7 +1410,12 @@ fn parse_proxy_uri(uri: &str) -> Result<(String, String, u16, String)> { /// strips proxy hop-by-hop headers, injects `Connection: close` and `Via`. /// /// Returns the rewritten request bytes (headers + any overflow body bytes). -fn rewrite_forward_request(raw: &[u8], used: usize, path: &str) -> Vec { +fn rewrite_forward_request( + raw: &[u8], + used: usize, + path: &str, + secret_resolver: Option<&SecretResolver>, +) -> Vec { let header_end = raw[..used] .windows(4) .position(|w| w == b"\r\n\r\n") @@ -1454,8 +1467,12 @@ fn rewrite_forward_request(raw: &[u8], used: usize, path: &str) -> Vec { continue; } - // Pass through other headers - output.extend_from_slice(line.as_bytes()); + let rewritten_line = match secret_resolver { + Some(resolver) => rewrite_header_line(line, resolver), + None => line.to_string(), + }; + + output.extend_from_slice(rewritten_line.as_bytes()); output.extend_from_slice(b"\r\n"); if lower.starts_with("via:") { @@ -1497,6 +1514,7 @@ async fn handle_forward_proxy( opa_engine: Arc, identity_cache: Arc, entrypoint_pid: Arc, + secret_resolver: Option>, denial_tx: Option<&mpsc::UnboundedSender>, ) -> Result<()> { // 1. Parse the absolute-form URI @@ -1716,8 +1734,8 @@ async fn handle_forward_proxy( "FORWARD", ); - // 7. Rewrite request and forward to upstream - let rewritten = rewrite_forward_request(buf, used, &path); + // 9. Rewrite request and forward to upstream + let rewritten = rewrite_forward_request(buf, used, &path, secret_resolver.as_deref()); upstream.write_all(&rewritten).await.into_diagnostic()?; // 8. Relay remaining traffic bidirectionally (supports streaming) @@ -2336,7 +2354,7 @@ mod tests { fn test_rewrite_get_request() { let raw = b"GET http://10.0.0.1:8000/api HTTP/1.1\r\nHost: 10.0.0.1:8000\r\nAccept: */*\r\n\r\n"; - let result = rewrite_forward_request(raw, raw.len(), "/api"); + let result = rewrite_forward_request(raw, raw.len(), "/api", None); let result_str = String::from_utf8_lossy(&result); assert!(result_str.starts_with("GET /api HTTP/1.1\r\n")); assert!(result_str.contains("Host: 10.0.0.1:8000")); @@ -2347,7 +2365,7 @@ mod tests { #[test] fn test_rewrite_strips_proxy_headers() { let raw = b"GET http://host/p HTTP/1.1\r\nHost: host\r\nProxy-Authorization: Basic abc\r\nProxy-Connection: keep-alive\r\nAccept: */*\r\n\r\n"; - let result = rewrite_forward_request(raw, raw.len(), "/p"); + let result = rewrite_forward_request(raw, raw.len(), "/p", None); let result_str = String::from_utf8_lossy(&result); assert!( !result_str @@ -2361,7 +2379,7 @@ mod tests { #[test] fn test_rewrite_replaces_connection_header() { let raw = b"GET http://host/p HTTP/1.1\r\nHost: host\r\nConnection: keep-alive\r\n\r\n"; - let result = rewrite_forward_request(raw, raw.len(), "/p"); + let result = rewrite_forward_request(raw, raw.len(), "/p", None); let result_str = String::from_utf8_lossy(&result); assert!(result_str.contains("Connection: close")); assert!(!result_str.contains("keep-alive")); @@ -2370,7 +2388,7 @@ mod tests { #[test] fn test_rewrite_preserves_body_overflow() { let raw = b"POST http://host/api HTTP/1.1\r\nHost: host\r\nContent-Length: 13\r\n\r\n{\"key\":\"val\"}"; - let result = rewrite_forward_request(raw, raw.len(), "/api"); + let result = rewrite_forward_request(raw, raw.len(), "/api", None); let result_str = String::from_utf8_lossy(&result); assert!(result_str.contains("{\"key\":\"val\"}")); assert!(result_str.contains("POST /api HTTP/1.1")); @@ -2379,13 +2397,27 @@ mod tests { #[test] fn test_rewrite_preserves_existing_via() { let raw = b"GET http://host/p HTTP/1.1\r\nHost: host\r\nVia: 1.0 upstream\r\n\r\n"; - let result = rewrite_forward_request(raw, raw.len(), "/p"); + let result = rewrite_forward_request(raw, raw.len(), "/p", None); let result_str = String::from_utf8_lossy(&result); assert!(result_str.contains("Via: 1.0 upstream")); // Should not add a second Via header assert!(!result_str.contains("Via: 1.1 navigator-sandbox")); } + #[test] + fn test_rewrite_resolves_placeholder_auth_headers() { + let (_, resolver) = SecretResolver::from_provider_env( + [("ANTHROPIC_API_KEY".to_string(), "sk-test".to_string())] + .into_iter() + .collect(), + ); + let raw = b"GET http://host/p HTTP/1.1\r\nHost: host\r\nAuthorization: Bearer openshell:resolve:env:ANTHROPIC_API_KEY\r\n\r\n"; + let result = rewrite_forward_request(raw, raw.len(), "/p", resolver.as_ref()); + let result_str = String::from_utf8_lossy(&result); + assert!(result_str.contains("Authorization: Bearer sk-test")); + assert!(!result_str.contains("openshell:resolve:env:ANTHROPIC_API_KEY")); + } + // --- Forward proxy SSRF defence tests --- // // The forward proxy handler uses the same SSRF logic as the CONNECT path: diff --git a/crates/navigator-sandbox/src/secrets.rs b/crates/navigator-sandbox/src/secrets.rs new file mode 100644 index 000000000..4ee1ee846 --- /dev/null +++ b/crates/navigator-sandbox/src/secrets.rs @@ -0,0 +1,262 @@ +// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +// SPDX-License-Identifier: Apache-2.0 + +use std::collections::HashMap; + +const PLACEHOLDER_PREFIX: &str = "openshell:resolve:env:"; + +#[derive(Debug, Clone, Default)] +pub(crate) struct SecretResolver { + by_placeholder: HashMap, +} + +impl SecretResolver { + pub(crate) fn from_provider_env( + provider_env: HashMap, + ) -> (HashMap, Option) { + if provider_env.is_empty() { + return (HashMap::new(), None); + } + + let mut child_env = HashMap::with_capacity(provider_env.len()); + let mut by_placeholder = HashMap::with_capacity(provider_env.len()); + + for (key, value) in provider_env { + let placeholder = placeholder_for_env_key(&key); + child_env.insert(key, placeholder.clone()); + by_placeholder.insert(placeholder, value); + } + + (child_env, Some(Self { by_placeholder })) + } + + pub(crate) fn resolve_placeholder(&self, value: &str) -> Option<&str> { + self.by_placeholder.get(value).map(String::as_str) + } + + pub(crate) fn rewrite_header_value(&self, value: &str) -> Option { + if let Some(secret) = self.resolve_placeholder(value.trim()) { + return Some(secret.to_string()); + } + + let trimmed = value.trim(); + let split_at = trimmed.find(char::is_whitespace)?; + let prefix = &trimmed[..split_at]; + let candidate = trimmed[split_at..].trim(); + let secret = self.resolve_placeholder(candidate)?; + Some(format!("{prefix} {secret}")) + } +} + +pub(crate) fn placeholder_for_env_key(key: &str) -> String { + format!("{PLACEHOLDER_PREFIX}{key}") +} + +pub(crate) fn rewrite_http_header_block(raw: &[u8], resolver: Option<&SecretResolver>) -> Vec { + let Some(resolver) = resolver else { + return raw.to_vec(); + }; + + let Some(header_end) = raw.windows(4).position(|w| w == b"\r\n\r\n").map(|p| p + 4) else { + return raw.to_vec(); + }; + + let header_str = String::from_utf8_lossy(&raw[..header_end]); + let mut lines = header_str.split("\r\n"); + let Some(request_line) = lines.next() else { + return raw.to_vec(); + }; + + let mut output = Vec::with_capacity(raw.len()); + output.extend_from_slice(request_line.as_bytes()); + output.extend_from_slice(b"\r\n"); + + for line in lines { + if line.is_empty() { + break; + } + + output.extend_from_slice(rewrite_header_line(line, resolver).as_bytes()); + output.extend_from_slice(b"\r\n"); + } + + output.extend_from_slice(b"\r\n"); + output.extend_from_slice(&raw[header_end..]); + output +} + +pub(crate) fn rewrite_header_line(line: &str, resolver: &SecretResolver) -> String { + let Some((name, value)) = line.split_once(':') else { + return line.to_string(); + }; + + match resolver.rewrite_header_value(value.trim()) { + Some(rewritten) => format!("{name}: {rewritten}"), + None => line.to_string(), + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn provider_env_is_replaced_with_placeholders() { + let (child_env, resolver) = SecretResolver::from_provider_env( + [("ANTHROPIC_API_KEY".to_string(), "sk-test".to_string())] + .into_iter() + .collect(), + ); + + assert_eq!( + child_env.get("ANTHROPIC_API_KEY"), + Some(&"openshell:resolve:env:ANTHROPIC_API_KEY".to_string()) + ); + assert_eq!( + resolver + .as_ref() + .and_then(|resolver| resolver + .resolve_placeholder("openshell:resolve:env:ANTHROPIC_API_KEY")), + Some("sk-test") + ); + } + + #[test] + fn rewrites_exact_placeholder_header_values() { + let (_, resolver) = SecretResolver::from_provider_env( + [("CUSTOM_TOKEN".to_string(), "secret-token".to_string())] + .into_iter() + .collect(), + ); + let resolver = resolver.expect("resolver"); + + assert_eq!( + rewrite_header_line("x-api-key: openshell:resolve:env:CUSTOM_TOKEN", &resolver), + "x-api-key: secret-token" + ); + } + + #[test] + fn rewrites_bearer_placeholder_header_values() { + let (_, resolver) = SecretResolver::from_provider_env( + [("ANTHROPIC_API_KEY".to_string(), "sk-test".to_string())] + .into_iter() + .collect(), + ); + let resolver = resolver.expect("resolver"); + + assert_eq!( + rewrite_header_line( + "Authorization: Bearer openshell:resolve:env:ANTHROPIC_API_KEY", + &resolver, + ), + "Authorization: Bearer sk-test" + ); + } + + #[test] + fn rewrites_http_header_blocks_and_preserves_body() { + let (_, resolver) = SecretResolver::from_provider_env( + [("CUSTOM_TOKEN".to_string(), "secret-token".to_string())] + .into_iter() + .collect(), + ); + + let raw = b"POST /v1 HTTP/1.1\r\nAuthorization: Bearer openshell:resolve:env:CUSTOM_TOKEN\r\nContent-Length: 5\r\n\r\nhello"; + let rewritten = rewrite_http_header_block(raw, resolver.as_ref()); + let rewritten = String::from_utf8(rewritten).expect("utf8"); + + assert!(rewritten.contains("Authorization: Bearer secret-token\r\n")); + assert!(rewritten.ends_with("\r\n\r\nhello")); + } + + /// Simulates the full round-trip: provider env → child placeholders → + /// HTTP headers → rewrite. This is the exact flow that occurs when a + /// sandbox child process reads placeholder env vars, constructs an HTTP + /// request, and the proxy rewrites headers before forwarding upstream. + #[test] + fn full_round_trip_child_env_to_rewritten_headers() { + let provider_env: HashMap = [ + ( + "ANTHROPIC_API_KEY".to_string(), + "sk-real-key-12345".to_string(), + ), + ( + "CUSTOM_SERVICE_TOKEN".to_string(), + "tok-real-svc-67890".to_string(), + ), + ] + .into_iter() + .collect(); + + let (child_env, resolver) = SecretResolver::from_provider_env(provider_env); + + // Child process reads placeholders from the environment + let auth_value = child_env.get("ANTHROPIC_API_KEY").unwrap(); + let token_value = child_env.get("CUSTOM_SERVICE_TOKEN").unwrap(); + assert!(auth_value.starts_with(PLACEHOLDER_PREFIX)); + assert!(token_value.starts_with(PLACEHOLDER_PREFIX)); + + // Child constructs an HTTP request using those placeholders + let raw = format!( + "GET /v1/messages HTTP/1.1\r\n\ + Host: api.example.com\r\n\ + Authorization: Bearer {auth_value}\r\n\ + x-api-key: {token_value}\r\n\ + Content-Length: 0\r\n\r\n" + ); + + // Proxy rewrites headers + let rewritten = rewrite_http_header_block(raw.as_bytes(), resolver.as_ref()); + let rewritten = String::from_utf8(rewritten).expect("utf8"); + + // Real secrets must appear in the rewritten headers + assert!( + rewritten.contains("Authorization: Bearer sk-real-key-12345\r\n"), + "Expected rewritten Authorization header, got: {rewritten}" + ); + assert!( + rewritten.contains("x-api-key: tok-real-svc-67890\r\n"), + "Expected rewritten x-api-key header, got: {rewritten}" + ); + + // Placeholders must not appear + assert!( + !rewritten.contains("openshell:resolve:env:"), + "Placeholder leaked into rewritten request: {rewritten}" + ); + + // Request line and non-secret headers must be preserved + assert!(rewritten.starts_with("GET /v1/messages HTTP/1.1\r\n")); + assert!(rewritten.contains("Host: api.example.com\r\n")); + assert!(rewritten.contains("Content-Length: 0\r\n")); + } + + #[test] + fn non_secret_headers_are_not_modified() { + let (_, resolver) = SecretResolver::from_provider_env( + [("API_KEY".to_string(), "secret".to_string())] + .into_iter() + .collect(), + ); + + let raw = b"GET / HTTP/1.1\r\nHost: example.com\r\nAccept: application/json\r\nContent-Type: text/plain\r\n\r\n"; + let rewritten = rewrite_http_header_block(raw, resolver.as_ref()); + // The output should be byte-identical since no placeholders are present + assert_eq!(raw.as_slice(), rewritten.as_slice()); + } + + #[test] + fn empty_provider_env_produces_no_resolver() { + let (child_env, resolver) = SecretResolver::from_provider_env(HashMap::new()); + assert!(child_env.is_empty()); + assert!(resolver.is_none()); + } + + #[test] + fn rewrite_with_no_resolver_returns_original() { + let raw = b"GET / HTTP/1.1\r\nAuthorization: Bearer my-token\r\n\r\n"; + let rewritten = rewrite_http_header_block(raw, None); + assert_eq!(raw.as_slice(), rewritten.as_slice()); + } +} diff --git a/crates/navigator-sandbox/src/ssh.rs b/crates/navigator-sandbox/src/ssh.rs index 867bd090b..568d09638 100644 --- a/crates/navigator-sandbox/src/ssh.rs +++ b/crates/navigator-sandbox/src/ssh.rs @@ -20,7 +20,7 @@ use std::io::{Read, Write}; use std::net::SocketAddr; use std::os::fd::{AsRawFd, RawFd}; use std::path::PathBuf; -use std::process::Command; +use std::process::{Command, Stdio}; use std::sync::{Arc, Mutex, mpsc}; use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH}; use tokio::io::{AsyncReadExt, AsyncWriteExt}; @@ -28,6 +28,8 @@ use tokio::net::TcpListener; use tracing::{info, warn}; const PREFACE_MAGIC: &str = "NSSH1"; +#[cfg(test)] +const SSH_HANDSHAKE_SECRET_ENV: &str = "NEMOCLAW_SSH_HANDSHAKE_SECRET"; /// A time-bounded set of nonces used to detect replayed NSSH1 handshakes. /// Each entry records the `Instant` it was inserted; a background reaper task @@ -646,6 +648,46 @@ fn session_user_and_home(policy: &SandboxPolicy) -> (String, String) { } } +fn apply_child_env( + cmd: &mut Command, + session_home: &str, + session_user: &str, + term: &str, + proxy_url: Option<&str>, + ca_file_paths: Option<&(PathBuf, PathBuf)>, + provider_env: &HashMap, +) { + let path = std::env::var("PATH").unwrap_or_else(|_| "/usr/local/bin:/usr/bin:/bin".into()); + + cmd.env_clear() + .env("OPENSHELL_SANDBOX", "1") + .env("HOME", session_home) + .env("USER", session_user) + .env("SHELL", "/bin/bash") + .env("PATH", &path) + .env("TERM", term); + + if let Some(url) = proxy_url { + cmd.env("HTTP_PROXY", url) + .env("HTTPS_PROXY", url) + .env("ALL_PROXY", url) + .env("http_proxy", url) + .env("https_proxy", url) + .env("grpc_proxy", url); + } + + if let Some((ca_cert_path, combined_bundle_path)) = ca_file_paths { + cmd.env("NODE_EXTRA_CA_CERTS", ca_cert_path) + .env("SSL_CERT_FILE", combined_bundle_path) + .env("REQUESTS_CA_BUNDLE", combined_bundle_path) + .env("CURL_CA_BUNDLE", combined_bundle_path); + } + + for (key, value) in provider_env { + cmd.env(key, value); + } +} + #[allow(clippy::too_many_arguments)] fn spawn_pty_shell( policy: &SandboxPolicy, @@ -695,53 +737,19 @@ fn spawn_pty_shell( pty.term.as_str() }; - // Inherit PATH from the container (set via Dockerfile ENV) so that - // sandbox sessions see the same tool layout without hardcoding paths. - // Tool-specific env vars (VIRTUAL_ENV, UV_PYTHON_INSTALL_DIR, etc.) are - // set in /sandbox/.bashrc by the Dockerfile and sourced via login shell. - let path = std::env::var("PATH").unwrap_or_else(|_| "/usr/local/bin:/usr/bin:/bin".into()); - // Derive USER and HOME from the policy's run_as_user when available, // falling back to "sandbox" / "/sandbox" for backward compatibility. let (session_user, session_home) = session_user_and_home(policy); - - cmd.env_clear() - .stdin(stdin) - .stdout(stdout) - .stderr(stderr) - .env("OPENSHELL_SANDBOX", "1") - .env("HOME", &session_home) - .env("USER", &session_user) - .env("SHELL", "/bin/bash") - .env("PATH", &path) - .env("TERM", term); - - // Set proxy environment variables so cooperative tools (curl, wget, etc.) - // route traffic through the CONNECT proxy for OPA policy evaluation. - // Both uppercase and lowercase variants are needed: curl/wget use uppercase, - // gRPC C-core (libgrpc) checks lowercase http_proxy/https_proxy first. - if let Some(ref url) = proxy_url { - cmd.env("HTTP_PROXY", url) - .env("HTTPS_PROXY", url) - .env("ALL_PROXY", url) - .env("http_proxy", url) - .env("https_proxy", url) - .env("grpc_proxy", url); - } - - // Set TLS trust store env vars so SSH shell processes trust the ephemeral CA - if let Some(ref paths) = ca_file_paths { - let (ca_cert_path, combined_bundle_path) = paths.as_ref(); - cmd.env("NODE_EXTRA_CA_CERTS", ca_cert_path) - .env("SSL_CERT_FILE", combined_bundle_path) - .env("REQUESTS_CA_BUNDLE", combined_bundle_path) - .env("CURL_CA_BUNDLE", combined_bundle_path); - } - - // Set provider environment variables (credentials fetched at runtime). - for (key, value) in provider_env { - cmd.env(key, value); - } + apply_child_env( + &mut cmd, + &session_home, + &session_user, + term, + proxy_url.as_deref(), + ca_file_paths.as_deref(), + provider_env, + ); + cmd.stdin(stdin).stdout(stdout).stderr(stderr); if let Some(dir) = workdir.as_deref() { cmd.current_dir(dir); @@ -865,41 +873,19 @@ fn spawn_pipe_exec( }, ); - let path = std::env::var("PATH").unwrap_or_else(|_| "/usr/local/bin:/usr/bin:/bin".into()); - let (session_user, session_home) = session_user_and_home(policy); - - cmd.env_clear() - .stdin(std::process::Stdio::piped()) - .stdout(std::process::Stdio::piped()) - .stderr(std::process::Stdio::piped()) - .env("OPENSHELL_SANDBOX", "1") - .env("HOME", &session_home) - .env("USER", &session_user) - .env("SHELL", "/bin/bash") - .env("PATH", &path) - .env("TERM", "dumb"); - - if let Some(ref url) = proxy_url { - cmd.env("HTTP_PROXY", url) - .env("HTTPS_PROXY", url) - .env("ALL_PROXY", url) - .env("http_proxy", url) - .env("https_proxy", url) - .env("grpc_proxy", url); - } - - if let Some(ref paths) = ca_file_paths { - let (ca_cert_path, combined_bundle_path) = paths.as_ref(); - cmd.env("NODE_EXTRA_CA_CERTS", ca_cert_path) - .env("SSL_CERT_FILE", combined_bundle_path) - .env("REQUESTS_CA_BUNDLE", combined_bundle_path) - .env("CURL_CA_BUNDLE", combined_bundle_path); - } - - for (key, value) in provider_env { - cmd.env(key, value); - } + apply_child_env( + &mut cmd, + &session_home, + &session_user, + "dumb", + proxy_url.as_deref(), + ca_file_paths.as_deref(), + provider_env, + ); + cmd.stdin(Stdio::piped()) + .stdout(Stdio::piped()) + .stderr(Stdio::piped()); if let Some(dir) = workdir.as_deref() { cmd.current_dir(dir); @@ -1094,6 +1080,7 @@ fn to_u16(value: u32) -> u16 { #[cfg(test)] mod tests { use super::*; + use std::process::Stdio; /// Verify that dropping the input sender (the operation `channel_eof` /// performs) causes the stdin writer loop to exit and close the child's @@ -1104,8 +1091,8 @@ mod tests { let (sender, receiver) = mpsc::channel::>(); let mut child = Command::new("cat") - .stdin(std::process::Stdio::piped()) - .stdout(std::process::Stdio::piped()) + .stdin(Stdio::piped()) + .stdout(Stdio::piped()) .spawn() .expect("failed to spawn cat"); @@ -1155,8 +1142,8 @@ mod tests { let mut child = Command::new("wc") .arg("-c") - .stdin(std::process::Stdio::piped()) - .stdout(std::process::Stdio::piped()) + .stdin(Stdio::piped()) + .stdout(Stdio::piped()) .spawn() .expect("failed to spawn wc"); @@ -1296,4 +1283,35 @@ mod tests { assert!(verify_preface(&line1, secret, 300, &cache).unwrap()); assert!(verify_preface(&line2, secret, 300, &cache).unwrap()); } + + #[test] + fn apply_child_env_keeps_handshake_secret_out_of_ssh_children() { + let mut cmd = Command::new("/usr/bin/env"); + cmd.env(SSH_HANDSHAKE_SECRET_ENV, "should-not-leak") + .stdin(Stdio::null()) + .stdout(Stdio::piped()) + .stderr(Stdio::null()); + + let provider_env = std::iter::once(( + "ANTHROPIC_API_KEY".to_string(), + "openshell:resolve:env:ANTHROPIC_API_KEY".to_string(), + )) + .collect(); + + apply_child_env( + &mut cmd, + "/sandbox", + "sandbox", + "dumb", + None, + None, + &provider_env, + ); + + let output = cmd.output().expect("spawn env"); + let stdout = String::from_utf8(output.stdout).expect("utf8"); + + assert!(!stdout.contains(SSH_HANDSHAKE_SECRET_ENV)); + assert!(stdout.contains("ANTHROPIC_API_KEY=openshell:resolve:env:ANTHROPIC_API_KEY")); + } } diff --git a/deploy/docker/sandbox/dev-sandbox-policy.yaml b/deploy/docker/sandbox/dev-sandbox-policy.yaml index 421a69f17..02ef4b1f2 100644 --- a/deploy/docker/sandbox/dev-sandbox-policy.yaml +++ b/deploy/docker/sandbox/dev-sandbox-policy.yaml @@ -145,3 +145,17 @@ network_policies: - { path: /usr/bin/curl } - { path: /usr/bin/wget } - { path: "/sandbox/.cursor-server/**" } + + opencode: + name: opencode + endpoints: + - host: registry.npmjs.org + port: 443 + - host: opencode.ai + port: 443 + - host: integrate.api.nvidia.com + port: 443 + binaries: + - path: /usr/lib/node_modules/opencode-ai/bin/.opencode + - path: /usr/bin/node + - path: /usr/local/bin/opencode diff --git a/e2e/python/test_sandbox_providers.py b/e2e/python/test_sandbox_providers.py index e52010fef..577eb4c83 100644 --- a/e2e/python/test_sandbox_providers.py +++ b/e2e/python/test_sandbox_providers.py @@ -1,12 +1,12 @@ # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. # SPDX-License-Identifier: Apache-2.0 -"""E2e tests for provider credential injection into sandboxes. +"""E2E tests for supervisor-managed provider placeholders in sandboxes. Provider credentials are fetched at runtime by the sandbox supervisor via the -GetSandboxProviderEnvironment gRPC call. They should appear as environment -variables inside the sandbox process, but must NOT be present in the persisted -sandbox spec's environment map (they are never baked into the K8s pod spec). +GetSandboxProviderEnvironment gRPC call. Sandboxed child processes should see +placeholder values (not raw secrets). Credentials must never be present in the +persisted sandbox spec environment map. """ from __future__ import annotations @@ -25,7 +25,13 @@ from openshell import Sandbox, SandboxClient +# --------------------------------------------------------------------------- +# Policy helpers +# --------------------------------------------------------------------------- + + def _default_policy() -> sandbox_pb2.SandboxPolicy: + """Build a sandbox policy with standard filesystem/process/landlock settings.""" return sandbox_pb2.SandboxPolicy( version=1, filesystem=sandbox_pb2.FilesystemPolicy( @@ -40,6 +46,11 @@ def _default_policy() -> sandbox_pb2.SandboxPolicy: ) +# --------------------------------------------------------------------------- +# Provider lifecycle helper +# --------------------------------------------------------------------------- + + @contextmanager def provider( stub: object, @@ -49,7 +60,6 @@ def provider( credentials: dict[str, str], ) -> Iterator[str]: """Create a provider for the duration of the block, then delete it.""" - # Clean up any leftover from a previous run. _delete_provider(stub, name) stub.CreateProvider( navigator_pb2.CreateProviderRequest( @@ -77,18 +87,16 @@ def _delete_provider(stub: object, name: str) -> None: raise -# TODO: Once the sandbox supervisor sets provider credentials (rather than the -# server injecting them via exec environment), these tests will no longer be -# able to read credential values directly. Update the assertions to verify -# that the env vars are *present* (e.g. non-empty) without checking exact -# values, since the supervisor will be the sole source of truth. +# =========================================================================== +# Tests: placeholder visibility +# =========================================================================== def test_provider_credentials_available_as_env_vars( sandbox: Callable[..., Sandbox], sandbox_client: SandboxClient, ) -> None: - """Sandbox process can read provider credentials as environment variables.""" + """Sandbox child processes see provider env vars as placeholders.""" with provider( sandbox_client._stub, name="e2e-test-provider-env", @@ -108,14 +116,16 @@ def read_env_var() -> str: with sandbox(spec=spec, delete_on_exit=True) as sb: result = sb.exec_python(read_env_var) assert result.exit_code == 0, result.stderr - assert result.stdout.strip() == "sk-e2e-test-key-12345" + value = result.stdout.strip() + assert value == "openshell:resolve:env:ANTHROPIC_API_KEY" + assert value != "sk-e2e-test-key-12345" def test_generic_provider_credentials_available_as_env_vars( sandbox: Callable[..., Sandbox], sandbox_client: SandboxClient, ) -> None: - """Generic provider credentials are injected as arbitrary sandbox env vars.""" + """Generic provider env vars are placeholders, not raw secrets.""" with provider( sandbox_client._stub, name="e2e-test-generic-provider-env", @@ -142,7 +152,7 @@ def read_generic_env_vars() -> str: assert result.exit_code == 0, result.stderr assert ( result.stdout.strip() - == "token-generic-123|https://internal.example.test/api" + == "openshell:resolve:env:CUSTOM_SERVICE_TOKEN|openshell:resolve:env:CUSTOM_SERVICE_URL" ) @@ -150,7 +160,7 @@ def test_nvidia_provider_injects_nvidia_api_key_env_var( sandbox: Callable[..., Sandbox], sandbox_client: SandboxClient, ) -> None: - """NVIDIA provider projects NVIDIA_API_KEY into sandbox process env.""" + """NVIDIA provider projects a placeholder env value into child processes.""" with provider( sandbox_client._stub, name="e2e-test-nvidia-provider-env", @@ -170,7 +180,26 @@ def read_nvidia_key() -> str: with sandbox(spec=spec, delete_on_exit=True) as sb: result = sb.exec_python(read_nvidia_key) assert result.exit_code == 0, result.stderr - assert result.stdout.strip() == "nvapi-e2e-test-key" + assert result.stdout.strip() == "openshell:resolve:env:NVIDIA_API_KEY" + + +# =========================================================================== +# Tests: security & edge cases +# =========================================================================== + + +def test_ssh_handshake_secret_not_visible_in_exec_environment( + sandbox: Callable[..., Sandbox], +) -> None: + def read_handshake_secret() -> str: + import os + + return os.environ.get("NEMOCLAW_SSH_HANDSHAKE_SECRET", "NOT_SET") + + with sandbox(delete_on_exit=True) as sb: + result = sb.exec_python(read_handshake_secret) + assert result.exit_code == 0, result.stderr + assert result.stdout.strip() == "NOT_SET" def test_create_sandbox_rejects_unknown_provider( @@ -185,7 +214,7 @@ def test_create_sandbox_rejects_unknown_provider( sandbox_client.create(spec=spec) assert exc_info.value.code() == grpc.StatusCode.FAILED_PRECONDITION - assert "nonexistent-provider-xyz" in exc_info.value.details() + assert "nonexistent-provider-xyz" in (exc_info.value.details() or "") def test_credentials_not_in_persisted_spec_environment( @@ -205,7 +234,6 @@ def test_credentials_not_in_persisted_spec_environment( ) with sandbox(spec=spec, delete_on_exit=True) as sb: - # Fetch the sandbox record from the server and inspect spec.environment. fetched = sandbox_client._stub.GetSandbox( navigator_pb2.GetSandboxRequest(name=sb.sandbox.name) ) @@ -215,9 +243,9 @@ def test_credentials_not_in_persisted_spec_environment( ) -# --------------------------------------------------------------------------- -# Provider update merge semantics -# --------------------------------------------------------------------------- +# =========================================================================== +# Tests: provider update merge semantics +# =========================================================================== def test_update_provider_preserves_unset_credentials_and_config( @@ -240,7 +268,6 @@ def test_update_provider_preserves_unset_credentials_and_config( ) ) - # Update only KEY_A; send empty config map (= no change). stub.UpdateProvider( navigator_pb2.UpdateProviderRequest( provider=datamodel_pb2.Provider( @@ -319,7 +346,6 @@ def test_update_provider_merges_config_preserves_credentials( ) ) - # Send only config update, empty credentials map. stub.UpdateProvider( navigator_pb2.UpdateProviderRequest( provider=datamodel_pb2.Provider( diff --git a/e2e/rust/tests/provider_auto_create.rs b/e2e/rust/tests/provider_auto_create.rs index 497b41acc..62fe5a461 100644 --- a/e2e/rust/tests/provider_auto_create.rs +++ b/e2e/rust/tests/provider_auto_create.rs @@ -7,21 +7,25 @@ //! //! When `--provider claude` is passed and no provider named "claude" exists, //! the CLI should discover `ANTHROPIC_API_KEY` from the local environment, -//! auto-create a provider, and inject its credentials into the sandbox. +//! auto-create a provider, and inject a supervisor-managed placeholder into the +//! sandbox child process environment. //! //! The sandbox command (`printenv ANTHROPIC_API_KEY`) verifies that the -//! credential made it all the way through to the sandbox process environment. +//! placeholder made it all the way through to the sandbox process environment. //! //! Prerequisites: //! - A running openshell gateway (`openshell gateway start`) //! - The `openshell` binary (built automatically from the workspace) use std::process::Stdio; +use std::sync::Mutex; use openshell_e2e::harness::binary::openshell_cmd; use openshell_e2e::harness::output::{extract_field, strip_ansi}; const TEST_API_KEY: &str = "sk-e2e-auto-provider-test-key"; +const TEST_API_KEY_PLACEHOLDER: &str = "openshell:resolve:env:ANTHROPIC_API_KEY"; +static CLAUDE_PROVIDER_LOCK: Mutex<()> = Mutex::new(()); /// Helper: delete a provider by name, ignoring errors. async fn delete_provider(name: &str) { @@ -34,6 +38,17 @@ async fn delete_provider(name: &str) { let _ = cmd.status().await; } +/// Helper: check whether a provider already exists. +async fn provider_exists(name: &str) -> bool { + let mut cmd = openshell_cmd(); + cmd.arg("provider") + .arg("get") + .arg(name) + .stdout(Stdio::null()) + .stderr(Stdio::null()); + cmd.status().await.is_ok_and(|status| status.success()) +} + /// Helper: delete a sandbox by name, ignoring errors. async fn delete_sandbox(name: &str) { let mut cmd = openshell_cmd(); @@ -46,9 +61,18 @@ async fn delete_sandbox(name: &str) { } /// `--provider claude --auto-providers` with `ANTHROPIC_API_KEY` set should -/// auto-create a "claude" provider and inject the credential into the sandbox. +/// auto-create a "claude" provider and inject a placeholder into the sandbox. #[tokio::test] async fn auto_created_provider_credential_available_in_sandbox() { + let _provider_lock = CLAUDE_PROVIDER_LOCK + .lock() + .unwrap_or_else(std::sync::PoisonError::into_inner); + + if provider_exists("claude").await { + eprintln!("Skipping test: existing provider 'claude' would make shared state unsafe"); + return; + } + // Clean up any leftover from a previous run. delete_provider("claude").await; @@ -100,7 +124,12 @@ async fn auto_created_provider_credential_available_in_sandbox() { ); assert!( - clean.contains(TEST_API_KEY), - "sandbox should have ANTHROPIC_API_KEY in its environment:\n{clean}" + clean.contains(TEST_API_KEY_PLACEHOLDER), + "sandbox should have placeholder ANTHROPIC_API_KEY in its environment:\n{clean}" + ); + + assert!( + !clean.contains(TEST_API_KEY), + "sandbox should not expose the raw ANTHROPIC_API_KEY secret:\n{clean}" ); }