Problem Statement
OpenShell's TLS and SSH subsystems use non-FIPS-validated cryptographic libraries and default to non-FIPS-approved algorithms. On FIPS-enabled clusters (common in government, defense, and regulated-industry Kubernetes deployments), this makes OpenShell non-compliant from an audit perspective -- even though the kernel does not block the operations at runtime.
Specifically:
- TLS (all connections): Uses ring 0.17 via rustls. ring has no FIPS validation path. The defaults include ChaCha20-Poly1305 cipher suites and X25519 key exchange, neither of which is FIPS-approved.
- SSH (sandbox transport): Uses russh 0.57 with a mix of aws-lc-rs, ed25519-dalek, and curve25519-dalek. The sandbox SSH server hardcodes Ed25519 host keys (ssh.rs:56). Default negotiation prefers ChaCha20-Poly1305 and Curve25519 key exchange.
- PKI (certificate generation): Uses rcgen 0.13 backed by ring. The default algorithm (ECDSA P-256) is FIPS-approved, but the implementation module is not validated.
FIPS-enabled RHEL 9 / OpenShift 4.x clusters enforce FIPS 140-3 via system-wide crypto policies. Processes using non-validated crypto modules fail compliance audits regardless of the algorithms selected. There are no existing FIPS-related issues in the tracker.
This is complementary to #899 (Platform mode / restricted SCC support) -- FIPS clusters are a subset of the managed Kubernetes deployments that issue addresses.
Proposed Design
Add a workspace-level fips Cargo feature flag that switches the crypto backend from ring to aws-lc-rs in FIPS mode (CMVP certificate #4631), restricts algorithm negotiation to FIPS-approved algorithms only, and documents the SSH layer's validation gap.
Phase 1: Feature-flagged FIPS for TLS + PKI
Crypto provider switch -- The three binary entry points that install the rustls CryptoProvider would switch based on the feature flag:
// Current (all three binaries):
rustls::crypto::ring::default_provider().install_default()
// With --features fips:
rustls::crypto::aws_lc_rs::default_provider().install_default()
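One way to keep the three entry points from each carrying their own cfg logic is a small shared helper. The sketch below uses only the standard library so it stays self-contained. `CryptoBackend`, `selected_backend`, and `provider_path` are hypothetical names, not existing OpenShell code. In the real binaries the match arms would call the two rustls providers shown above instead of returning their paths as strings.

```rust
/// Which rustls crypto backend a binary should install at startup.
/// Hypothetical type for illustration; not part of OpenShell or rustls.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CryptoBackend {
    Ring,
    AwsLcRs,
}

/// Resolve the backend at compile time from the proposed `fips` Cargo feature.
/// Built without `--features fips`, this always returns `CryptoBackend::Ring`.
#[allow(unexpected_cfgs)]
pub const fn selected_backend() -> CryptoBackend {
    if cfg!(feature = "fips") {
        CryptoBackend::AwsLcRs
    } else {
        CryptoBackend::Ring
    }
}

impl CryptoBackend {
    /// Path of the rustls provider constructor each entry point would call.
    pub const fn provider_path(self) -> &'static str {
        match self {
            CryptoBackend::Ring => "rustls::crypto::ring::default_provider",
            CryptoBackend::AwsLcRs => "rustls::crypto::aws_lc_rs::default_provider",
        }
    }
}
```

Because Cargo features are additive, the `fips` feature would need to be propagated from each binary crate to its dependencies rather than toggled once at the workspace level.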
Workspace dependency changes:
# Current:
rustls = { version = "0.23", default-features = false, features = ["std", "logging", "tls12", "ring"] }
tokio-rustls = { version = "0.26", default-features = false, features = ["logging", "tls12", "ring"] }
rcgen = { version = "0.13", features = ["crypto"] }
# With fips feature:
rustls = { version = "0.23", default-features = false, features = ["std", "logging", "tls12", "aws_lc_rs", "fips"] }
tokio-rustls = { version = "0.26", default-features = false, features = ["logging", "tls12", "aws-lc-rs"] }
rcgen = { version = "0.13", default-features = false, features = ["aws_lc_rs", "pem"] }
TLS cipher suite restriction -- In FIPS mode, configure the provider to exclude ChaCha20-Poly1305 cipher suites and X25519 key exchange, allowing only:
- TLS 1.3: TLS13_AES_256_GCM_SHA384, TLS13_AES_128_GCM_SHA256
- TLS 1.2: TLS_ECDHE_ECDSA_WITH_AES_*_GCM_SHA*, TLS_ECDHE_RSA_WITH_AES_*_GCM_SHA*
- Key exchange: ECDH-P256, ECDH-P384 (no X25519)
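The restriction above amounts to filtering the provider's default lists against an allow-list while keeping their preference order. The following is a minimal sketch of that filter. It models suites and groups as plain strings to stay self-contained, whereas real code would filter the `cipher_suites` and `kx_groups` fields of the rustls `CryptoProvider`. `FIPS_TLS_SUITES`, `FIPS_KX_GROUPS`, and `restrict` are hypothetical names, and the group names are illustrative.

```rust
/// Cipher suites the proposal allows in FIPS mode (AES-GCM only).
const FIPS_TLS_SUITES: &[&str] = &[
    "TLS13_AES_256_GCM_SHA384",
    "TLS13_AES_128_GCM_SHA256",
    "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
];

/// Key-exchange groups the proposal allows in FIPS mode (NIST curves only).
const FIPS_KX_GROUPS: &[&str] = &["secp256r1", "secp384r1"];

/// Keep only the entries of `offered` that appear in `allowed`,
/// preserving the original preference order of `offered`.
fn restrict<'a>(offered: &[&'a str], allowed: &[&str]) -> Vec<&'a str> {
    offered
        .iter()
        .copied()
        .filter(|name| allowed.contains(name))
        .collect()
}
```

An allow-list is used rather than a deny-list so that an algorithm added by a future dependency upgrade is excluded until someone reviews it.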
SSH algorithm restriction -- Change the sandbox SSH server's host key from Ed25519 to ECDSA-P256 and configure russh::server::Config::preferred / russh::client::Config::preferred to exclude non-FIPS algorithms:
- Host keys: ecdsa-sha2-nistp256, ecdsa-sha2-nistp384, rsa-sha2-256, rsa-sha2-512 (no ssh-ed25519)
- Key exchange: ecdh-sha2-nistp256, ecdh-sha2-nistp384, diffie-hellman-group14-sha256, diffie-hellman-group16-sha512 (no Curve25519, no post-quantum mlkem)
- Ciphers: aes256-gcm@openssh.com, aes128-gcm@openssh.com, aes256-ctr, aes128-ctr (no ChaCha20-Poly1305)
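To illustrate why restricting the preference lists is sufficient at the negotiation level, the sketch below models the selection rule from RFC 4253 section 7.1 in simplified form: the chosen algorithm is the first entry in the client's list that the server also supports. `SshPreferred`, `FIPS_SSH`, and `negotiate` are hypothetical names. In real code the lists would be set on `russh::server::Config::preferred` and `russh::client::Config::preferred`.

```rust
/// Algorithm lists the proposal allows for SSH in FIPS mode, in preference order.
pub struct SshPreferred {
    pub kex: &'static [&'static str],
    pub host_key: &'static [&'static str],
    pub cipher: &'static [&'static str],
}

pub const FIPS_SSH: SshPreferred = SshPreferred {
    kex: &[
        "ecdh-sha2-nistp256",
        "ecdh-sha2-nistp384",
        "diffie-hellman-group14-sha256",
        "diffie-hellman-group16-sha512",
    ],
    host_key: &[
        "ecdsa-sha2-nistp256",
        "ecdsa-sha2-nistp384",
        "rsa-sha2-256",
        "rsa-sha2-512",
    ],
    cipher: &[
        "aes256-gcm@openssh.com",
        "aes128-gcm@openssh.com",
        "aes256-ctr",
        "aes128-ctr",
    ],
};

/// Simplified RFC 4253 rule: pick the first algorithm in the client's list
/// that the server also supports. `None` means negotiation fails.
pub fn negotiate<'a>(client: &[&'a str], server: &[&str]) -> Option<&'a str> {
    client.iter().copied().find(|alg| server.contains(alg))
}
```

Under this rule a peer that offers only ChaCha20-Poly1305 or Curve25519 fails to connect rather than silently falling back to a non-approved algorithm.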
HMAC switch -- The NSSH1 handshake (ssh.rs:322, ssh_tunnel.rs:320) uses RustCrypto hmac + sha2. In FIPS mode, replace with aws-lc-rs HMAC-SHA256.
Transitive dependency updates -- reqwest, sqlx, tokio-tungstenite, and hyper-rustls all pull in rustls and/or ring. Each needs feature flags updated for the FIPS build:
- reqwest: switch from rustls-tls to using the globally-installed CryptoProvider
- sqlx runtime-tokio-rustls: should respect the global provider
- hyper-rustls: switch from ring to aws-lc-rs feature
- tokio-tungstenite: switch from rustls-tls-native-roots to aws-lc-rs backend
Phase 2 (deferred): SSH transport FIPS validation
Phase 1 restricts SSH to FIPS-approved algorithms but the underlying implementations (ed25519-dalek, p256, aes from RustCrypto) remain non-validated modules. This is a known gap. The SSH transport only operates within the cluster's mTLS boundary (gateway-to-sandbox), providing defense-in-depth rather than being the primary trust boundary.
If strict auditors require validated modules for the SSH layer, Phase 2 options include:
- Upstream russh support for aws-lc-rs as its crypto backend
- Replacing the embedded russh server with an OpenSSH subprocess (significant architecture change given the deep integration at ssh.rs -- 1700+ lines of process spawning, PTY management, channel handling, SFTP subsystem)
Scope boundaries:
- The fips feature is off by default -- current behavior is preserved
- Phase 1 achieves FIPS-validated crypto for all TLS operations (the external-facing attack surface) and FIPS-approved algorithms for SSH
- Phase 1 explicitly documents the SSH validation gap
- Phase 2 is deferred to actual audit requirements
Alternatives Considered
- System OpenSSL for everything -- Replace rustls with the openssl crate and russh with libssh2 or OpenSSH subprocess. True FIPS validation for all operations via RHEL 9's OpenSSL 3.x (CMVP #4282). Rejected for Phase 1: massive rewrite, loses rustls memory safety guarantees, adds system library dependency, and significantly complicates cross-platform builds.
- Partial compliance with documented exceptions -- FIPS for TLS only, document SSH as internal-only transport. This is essentially what Phase 1 achieves, but framed as the complete solution rather than a stepping stone. May not satisfy strict auditors.
- No FIPS support -- Require FIPS-mode clusters to use custom crypto policy exceptions for OpenShell pods. Not viable for enterprise adoption in regulated environments.
- gVisor RuntimeClass -- gVisor provides its own syscall interception and could theoretically handle crypto at the runtime level. Not applicable -- gVisor intercepts syscalls, not userspace crypto library calls.
Agent Investigation
Investigation performed with a coding agent pointed at the repo. Skills loaded: create-spike, generate-sandbox-policy. The agent traced every crypto dependency, configuration point, and algorithm choice across the 15-crate workspace.
Crypto dependency map
The TLS and SSH subsystems use different crypto backends -- a critical finding for the migration path:
TLS path (all connections):
rustls 0.23.37 -> ring 0.17.14
rcgen 0.13.2 -> ring 0.17.14
rustls-webpki 0.103.10 -> ring 0.17.14
quinn-proto 0.11.14 -> ring 0.17.14
SSH path (sandbox transport):
russh 0.57.1 -> aws-lc-rs 1.16.2
russh 0.57.1 -> ed25519-dalek 2.2.0 -> curve25519-dalek 4.1.3
russh 0.57.1 -> aes 0.8.4, cbc 0.1.2, ctr 0.9.2 (RustCrypto symmetric)
russh 0.57.1 -> p256 0.13.2, p384 0.13.1, p521 0.13.3
russh 0.57.1 -> libcrux-ml-kem 0.0.4 (post-quantum)
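A map like this could be re-checked mechanically after the migration, for example to confirm that ring no longer appears in a FIPS build. The sketch below assumes its input is the text printed by `cargo tree --prefix none`, with one `name vX.Y.Z` entry per line. `find_denied` is a hypothetical helper, not existing tooling.

```rust
/// Return the distinct `(name, version)` pairs in `cargo_tree_output`
/// whose crate name is on the `denied` list.
///
/// Each line is expected to look like `ring v0.17.14` or `ring v0.17.14 (*)`;
/// lines with fewer than two whitespace-separated fields are skipped.
fn find_denied<'a>(cargo_tree_output: &'a str, denied: &[&str]) -> Vec<(&'a str, &'a str)> {
    let mut hits: Vec<(&'a str, &'a str)> = Vec::new();
    for line in cargo_tree_output.lines() {
        let mut fields = line.split_whitespace();
        let (Some(name), Some(raw_version)) = (fields.next(), fields.next()) else {
            continue;
        };
        // Drop the leading `v` that cargo prints before the version number.
        let version = raw_version.strip_prefix('v').unwrap_or(raw_version);
        if denied.contains(&name) && !hits.contains(&(name, version)) {
            hits.push((name, version));
        }
    }
    hits
}
```

For Phase 1 the deny list would presumably contain only ring, since the RustCrypto crates under russh are a documented gap rather than a build failure.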
Code references
| Location | Description |
|---|---|
| Cargo.toml:36-37 | Workspace rustls/tokio-rustls pinned to ring feature |
| Cargo.toml:39 | rcgen 0.13 with crypto feature (ring backend) |
| Cargo.toml:70 | reqwest with rustls-tls feature |
| Cargo.toml:73 | tokio-tungstenite with rustls-tls-native-roots |
| Cargo.toml:93 | sqlx with runtime-tokio-rustls |
| crates/openshell-server/src/cli.rs:184 | ring::default_provider().install_default() |
| crates/openshell-cli/src/main.rs:1664 | ring::default_provider().install_default() |
| crates/openshell-sandbox/src/main.rs:122 | ring::default_provider().install_default() |
| crates/openshell-server/src/tls.rs:63 | Gateway mTLS ServerConfig (no cipher suite customization) |
| crates/openshell-cli/src/tls.rs:209 | CLI mTLS ClientConfig (no cipher suite customization) |
| crates/openshell-sandbox/src/l7/tls.rs:156 | MITM proxy ServerConfig |
| crates/openshell-sandbox/src/l7/tls.rs:222 | MITM proxy upstream ClientConfig |
| crates/openshell-sandbox/src/l7/tls.rs:44 | MITM ephemeral CA generation (rcgen/ring) |
| crates/openshell-sandbox/src/l7/tls.rs:116 | MITM leaf cert generation (rcgen/ring) |
| crates/openshell-sandbox/src/ssh.rs:56 | PrivateKey::random(&mut rng, Algorithm::Ed25519) -- hardcoded Ed25519 host key |
| crates/openshell-sandbox/src/ssh.rs:58 | russh::server::Config::default() -- no algorithm restrictions |
| crates/openshell-sandbox/src/ssh.rs:322 | NSSH1 HMAC-SHA256 via RustCrypto hmac + sha2 |
| crates/openshell-server/src/ssh_tunnel.rs:320 | Server-side NSSH1 HMAC-SHA256 |
| crates/openshell-server/src/grpc/sandbox.rs:828 | russh::client::Config::default() -- no algorithm restrictions |
| crates/openshell-bootstrap/src/pki.rs:40,60,78 | PKI key generation via rcgen::KeyPair::generate() (ECDSA P-256 default via ring) |
| deploy/docker/Dockerfile.images | Base image: nvcr.io/nvidia/base/ubuntu:noble-20251013 (not UBI, no FIPS OpenSSL) |
Non-FIPS algorithm inventory
| Operation | Current Algorithm | FIPS? | FIPS Alternative | Controlling Code |
|---|---|---|---|---|
| TLS 1.3 cipher | ChaCha20-Poly1305 (in default list) | No | AES-256-GCM, AES-128-GCM | CryptoProvider cipher suite list |
| TLS key exchange | X25519 (in default list) | No | ECDH-P256, ECDH-P384 | CryptoProvider kx_group list |
| TLS crypto module | ring 0.17 | No | aws-lc-rs (CMVP #4631) | Cargo.toml feature flags |
| SSH host key | Ed25519 (hardcoded) | No | ECDSA-P256, ECDSA-P384 | ssh.rs:56 |
| SSH key exchange | curve25519-sha256 (default preferred) | No | ecdh-sha2-nistp256/384 | russh::Config::preferred |
| SSH cipher | chacha20-poly1305 (default preferred) | No | aes256-gcm, aes128-gcm | russh::Config::preferred |
| SSH KEX (PQ) | mlkem768x25519 | No | Remove from preference list | russh::Config::preferred |
| SSH crypto module | ed25519-dalek, RustCrypto AES, p256 | No | aws-lc-fips-sys (requires upstream russh changes) | russh internals |
| PKI key generation | ECDSA P-256 via ring | Algorithm OK, module not validated | ECDSA P-256 via aws-lc-rs | rcgen backend feature |
| NSSH1 HMAC | HMAC-SHA256 via RustCrypto hmac+sha2 | Algorithm OK, module not validated | HMAC-SHA256 via aws-lc-rs | ssh.rs:322, ssh_tunnel.rs:320 |
Existing FIPS awareness: Zero. The only mention of "FIPS" in the codebase is in an OCSF schema JSON referencing NIST FIPS 199 (information classification standard, unrelated to crypto).
Feature flag patterns: The codebase already uses workspace-level feature propagation (bundled-z3, dev-settings) and platform-conditional compilation via #[cfg(target_os = "linux")]. A #[cfg(feature = "fips")] pattern would be consistent.
Risks & open questions:
- aws-lc-rs FIPS build requires CMake + Go, adding build toolchain complexity
- russh's internal crypto (ed25519-dalek, p256, RustCrypto AES) is not FIPS-validated regardless of algorithm selection -- Phase 1 documents this gap
- Does russh have upstream plans for an aws-lc-rs or FIPS backend?
- Cross-compilation from macOS to linux/amd64 for FIPS container builds may require remote builds
- SSH host key change from Ed25519 to ECDSA-P256 changes fingerprint -- sandboxes are ephemeral, so should not cause persistent trust issues
- Verify aws-lc-rs 1.16.2 references a validated AWS-LC build matching CMVP #4631
- Transitive deps (reqwest, sqlx, tokio-tungstenite, hyper-rustls) each need verification with aws-lc-rs provider
- Single fips feature flag vs separate fips-tls/fips-ssh for phased rollout?
Checklist