OpenShell's L7 proxy rewrites placeholder tokens (openshell:resolve:env:*) at egress for TLS-terminated REST traffic. For gateway.discord.gg the NemoClaw blueprint policy sets tls: skip (per #544, pass-through is required to keep long-lived WSS sessions working). Result: the placeholder flows unchanged inside the WSS IDENTIFY payload; Discord closes with opcode 4004 (auth failed); the bot never connects.
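To make the failure mode concrete, here is a minimal sketch of what the gateway sees. It assumes the client sends IDENTIFY as a JSON text frame; `build_identify` and `identify_carries_placeholder` are hypothetical helper names for illustration, not part of OpenClaw or OpenShell.

```python
import json

# Prefix of the placeholder strings described in this issue.
PLACEHOLDER_PREFIX = "openshell:resolve:env:"


def build_identify(token: str, intents: int = 0) -> str:
    """Serialize a Discord gateway IDENTIFY frame (op 2) as JSON text."""
    payload = {
        "op": 2,
        "d": {
            "token": token,
            "intents": intents,
            # Illustrative values only; real clients send their own properties.
            "properties": {"os": "linux", "browser": "sketch", "device": "sketch"},
        },
    }
    return json.dumps(payload)


def identify_carries_placeholder(frame_text: str) -> bool:
    """Return True when d.token is still an unresolved placeholder string."""
    frame = json.loads(frame_text)
    data = frame.get("d") or {}
    token = data.get("token", "")
    return (
        frame.get("op") == 2
        and isinstance(token, str)
        and token.startswith(PLACEHOLDER_PREFIX)
    )
```

With `tls: skip` in place, the frame built from the sandbox env value is the one for which `identify_carries_placeholder` returns `True`, and that is the frame Discord answers with close code 4004.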
Reproduce
nemoclaw onboard --non-interactive with a valid DISCORD_BOT_TOKEN
Provider <sandbox>-discord-bridge is created and attached to the sandbox; sandbox env has DISCORD_BOT_TOKEN=openshell:resolve:env:DISCORD_BOT_TOKEN
OpenClaw attempts to connect to wss://gateway.discord.gg
Gateway closes immediately with opcode 4004 (see attached gateway.log, search for 4004)
Confirmation it's a payload-rewrite gap, not a policy/network problem
Writing the real Discord bot token directly into /sandbox/.openclaw/openclaw.json (field channels.discord.accounts.default.token), bypassing the placeholder system for this field, produces a successful IDENTIFY and the bot connects. No policy changes required; the only variable is whether the literal placeholder string or the real token arrives in the WSS IDENTIFY payload.
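A quick way to check which config fields would reach a pass-through endpoint unresolved is to walk the parsed config and list every string that still starts with the placeholder prefix. This is only a sketch: `find_placeholders` is a hypothetical helper, and the config shape in the usage note is inferred from the field path named above.

```python
from typing import Any, Iterator, Tuple

PLACEHOLDER_PREFIX = "openshell:resolve:env:"


def find_placeholders(node: Any, path: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (dotted_path, value) for every unresolved placeholder string."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            yield from find_placeholders(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_placeholders(value, f"{path}[{index}]")
    elif isinstance(node, str) and node.startswith(PLACEHOLDER_PREFIX):
        yield path, node
```

Running it over the parsed contents of openclaw.json before the workaround should report `channels.discord.accounts.default.token`; after the workaround it should report nothing for that field.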
Proposed directions
Add WSS MITM + JSON-payload-aware rewriting for known channel protocols (Discord IDENTIFY op 2, d.token field), so tls: skip can be removed for gateway.discord.gg.
OR: expose an in-sandbox secret-resolution gRPC endpoint (e.g. reachable via OPENSHELL_ENDPOINT) that clients can call to resolve openshell:resolve:env:* explicitly. OpenClaw (and other consumers) could then resolve at config-read time instead of relying on egress rewriting.
OR: when a provider is attached to a sandbox and the target channel is known to use WSS, let OpenShell inject the real credential value into the child env var directly at sandbox start (documenting the security trade-off that the credential is then at-rest in the sandbox env rather than only in the provider store).
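For the first direction, the payload-aware step could look roughly like the sketch below. It assumes the proxy has already terminated TLS and decoded a WebSocket text frame containing JSON (it does not cover ETF encoding or any WebSocket framing), and `resolve` is a hypothetical callback standing in for the provider store lookup. None of these names exist in OpenShell today.

```python
import json
from typing import Callable, Optional

PLACEHOLDER_PREFIX = "openshell:resolve:env:"


def rewrite_identify(
    frame_text: str, resolve: Callable[[str], Optional[str]]
) -> str:
    """Replace a placeholder in IDENTIFY (op 2) d.token; pass anything else through."""
    try:
        frame = json.loads(frame_text)
    except json.JSONDecodeError:
        return frame_text  # not JSON: forward untouched
    if not isinstance(frame, dict) or frame.get("op") != 2:
        return frame_text  # only IDENTIFY is rewritten
    data = frame.get("d")
    if not isinstance(data, dict):
        return frame_text
    token = data.get("token")
    if not (isinstance(token, str) and token.startswith(PLACEHOLDER_PREFIX)):
        return frame_text  # already a literal token, or no token at all
    secret = resolve(token[len(PLACEHOLDER_PREFIX):])
    if secret is None:
        return frame_text  # unknown name: leave the placeholder visible
    data["token"] = secret
    return json.dumps(frame)
```

Frames other than IDENTIFY are returned byte-for-byte, so heartbeats and dispatches on the long-lived session would not be re-serialized.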
gateway.log — 344-line log from inside the sandbox (micky pod, /tmp/gateway.log). Shows the 4004 pattern before the manual workaround and the quiet "awaiting gateway readiness" (implicit READY) after.
openshell-status.txt — openshell status
openshell-doctor-check.txt — openshell doctor check
openshell-doctor-logs.txt — openshell doctor logs --lines 200
Agent Diagnostic
agent-diagnostic-output.txt
Description
Related
tls: skip is the escape hatch used by the NemoClaw Discord preset
Attachments
openshell-issue-bundle.zip
Reproduction Steps
Deploy a NemoClaw stack (v2026.4.2) with OpenShell 0.0.26 on ARM64 (DGX Spark, k3s).
Blueprint uses policies/presets/discord.yaml, which pins gateway.discord.gg to
tls: skip (per #544, "feat(sandbox): auto-detect TLS and terminate unconditionally for credential injection"; required to keep long-lived WSS sessions alive).
Create a Discord bot credential:
nemoclaw credentials set DISCORD_BOT_TOKEN
Onboard the sandbox (this creates provider <sandbox>-discord-bridge and attaches it):
nemoclaw onboard --non-interactive
Confirm the sandbox env contains the placeholder, not the real token:
kubectl exec -n nemoclaw deploy/micky -- printenv DISCORD_BOT_TOKEN
=> openshell:resolve:env:DISCORD_BOT_TOKEN
Start OpenClaw inside the sandbox so it connects to Discord:
kubectl exec -n nemoclaw deploy/micky -- openclaw start
Observe gateway.log: OpenClaw opens wss://gateway.discord.gg and sends IDENTIFY op 2
with d.token set to the literal string "openshell:resolve:env:DISCORD_BOT_TOKEN";
Discord closes the socket with opcode 4004 (Authentication Failed).
Expected: IDENTIFY carries the resolved bot token; gateway sends READY; bot comes online.
Actual: IDENTIFY carries the literal placeholder string; gateway closes with 4004.
Workaround (confirms payload-rewrite gap, not a policy/network problem):
Edit /sandbox/.openclaw/openclaw.json inside the pod, set
channels.discord.accounts.default.token to the real token value from
~/.nemoclaw/credentials.json. OpenClaw hot-reloads, IDENTIFY now carries the real
token, Discord sends READY, bot connects. No policy changes required.
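The manual edit can also be scripted. The sketch below sets the field path named above and leaves the rest of the file alone. `set_discord_token` is a hypothetical helper; whether OpenClaw's hot-reload picks up an atomically replaced file the same way as an in-place edit is an assumption I have not verified. It carries the same trade-off as the manual workaround: the real token is then at rest inside the sandbox.

```python
import json
import os
import tempfile
from pathlib import Path


def set_discord_token(config_path: Path, token: str) -> None:
    """Set channels.discord.accounts.default.token, keeping other keys intact."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    node = config
    for key in ("channels", "discord", "accounts", "default"):
        node = node.setdefault(key, {})  # create missing levels on the way down
    node["token"] = token
    # Write to a temp file in the same directory, then swap it into place so a
    # reader never sees a half-written config. mkstemp creates the file 0600.
    fd, tmp_name = tempfile.mkstemp(dir=config_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    os.replace(tmp_name, config_path)
```

Inside the pod this would be called as `set_discord_token(Path("/sandbox/.openclaw/openclaw.json"), real_token)`, with `real_token` read from wherever the credential is stored.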
Environment
OpenShell: 0.0.26
NemoClaw: 2026.4.2 (blueprint: policies/presets/discord.yaml)
OpenClaw: 2026.4.9
Host: NVIDIA DGX Spark (GB10 Grace Blackwell, ARM64 / aarch64)
OS: Ubuntu 24.04 LTS (kernel 6.11, CUDA 13.0)
Runtime: k3s (single-node), containerd
Sandbox pod: micky (namespace: nemoclaw)
Storage: local-path PVC workspace-micky (2Gi) mounted at /sandbox
Client: Discord gateway via ws npm library (raw tls.connect, ignores HTTPS_PROXY)
Policy: gateway.discord.gg → tls: skip (L4 CONNECT pass-through)
Network: OpenShell L7 proxy at 10.200.0.1:3128 (CONNECT + TLS-MITM for REST egress)
Logs
Agent-First Checklist
debug-openshell-cluster,debug-inference,openshell-cli)