feat(bug-detectors): replace RCE substring check with global canary#873
Open
feat(bug-detectors): replace RCE substring check with global canary#873
Conversation
AGENTS.md uses multiple top-level headings and unlabeled code fences by design. Exclude it from markdownlint rather than restructuring the document.
yargs v18 is ESM-only and requires require()-of-ESM support, which only shipped unflagged in Node 20.19 / 22.12. Since the jazzer CLI entry point must remain loadable via require() on Node >= 20.6, pin yargs to the v17 line.
toThrowError -> toThrow, toBeCalledTimes -> toHaveBeenCalledTimes
Move the 'dist' pattern from modulePathIgnorePatterns (which blocks module resolution) to testPathIgnorePatterns (which only affects test discovery). Disable ts-jest diagnostics since it does not support composite project references — tsc -b is the source of truth for type checking.
When a before-hook throws inside an async function body, the async function returns a rejected Promise AND the finding is stored synchronously. Previously, both paths would resolve the C++ promise and deferred (via the synchronous throw's catch block and the .then() rejection handler's microtask), which is undefined behavior that can hang forked child processes. Check clearFirstFinding() before attaching .then() handlers. If a synchronous finding exists, suppress the rejected Promise and handle the finding exclusively through the synchronous path.
Some libraries (e.g. pdf.js) use prototype-based error constructors where .stack is inherited from a prototype Error, showing a stale header like "Error" instead of the actual .name and .message. Detect this mismatch and replace the first line of .stack with the correct name: message before printing. Only applies to non-Finding errors, since cleanErrorStack already rewrites Finding headers.
Each ESM module will need its own coverage counter buffer, independent of the shared global coverage map used by CJS modules. libFuzzer supports multiple disjoint counter regions, so we add a C++ function that registers per-module buffers and a TypeScript API to allocate them. The GC-prevention array in CoverageTracker keeps module buffers alive for the process lifetime, matching libFuzzer's expectation that counter memory remains valid.
Add an ESM loader that intercepts import() and static imports on Node >= 20.6, applying Babel coverage and compare-hook transforms. Each module gets its own counter buffer via a preamble that runs on the main thread. The shared coverage visitor is extracted from codeCoverage.ts so both the CJS path (global IDs via incrementCounter) and the new ESM path (module-local IDs via direct array writes) reuse the same branch/loop/ternary instrumentation logic. The ESM counter uses % 255 + 1 for NeverZero instead of || 1 to avoid infinite Babel visitor recursion on the generated expression. Istanbul source coverage is included from the start so that ESM modules appear in --coverage reports alongside CJS ones. On older Node versions the loader is silently skipped.
The jest-runner instruments code through Jest's own transformer pipeline and monkey-patches built-ins at runtime — it never uses the ESM loader hooks. Extracted registerEsmLoaderHooks from registerInstrumentor, called only from startFuzzing (the CLI entry point). hookRequire stays in registerInstrumentor because custom hooks loaded via import() in initFuzzing still need coverage instrumentation through require.extensions.
The ESM loader now generates separate source map objects (instead of inline ones) and registers them with the main-thread SourceMapRegistry via a preamble call. This lets source-map-support remap ESM stack traces to original source positions. The preamble line-shift is handled by prepending VLQ semicolons to the mappings string — each semicolon represents an unmapped generated line, pushing all real mappings down by the correct offset.
Establish a MessagePort channel between the main thread and the ESM loader thread. After bug detectors register their hooks and finalizeHooks() completes, sendHooksToLoader() serializes the hook metadata and posts it to the loader. The loader uses receiveMessageOnPort() — a synchronous, non-blocking read — to drain the message queue at the start of each module load. Received hooks are registered as stubs (with no-op functions) in the loader's own hookManager, which the functionHooks Babel plugin reads from. At runtime, the instrumented code calls HookManager.callHook() on the main-thread global, where the real hook functions live. Explicit hook IDs in the serialization format guard against index mismatches between threads. The MessagePort requires Node >= 20.11 (transferList support in module.register); older 20.x builds degrade to ESM instrumentation without function hooks.
Replace crypto.randomInt(512) in fakePC() with a deterministic xorshift32 PRNG seeded from the libFuzzer -seed= value. This makes TORC slot assignments identical across runs, so a given seed produces the exact same mutation schedule every time. If no seed is provided, one is generated, injected into the libFuzzer args, and printed — giving full reproducibility from a single seed. The seed flows from the CLI through the Instrumentor to both the CJS transform path (main thread) and the ESM loader (via module.register initialization data).
Replace the outdated "pure ESM projects will not be instrumented" warning with real documentation: how the loader hook works, the minimal setup (export function fuzz + type: module), Node >= 20.6 requirement, and the function hooks restriction (>= 20.11). Add a comparison table between direct ESM (npx jazzer) and Jest-based ESM (Babel transform approach) so users can pick the right path for their project.
…riptions - fuzz-targets.md: --fuzzFunction -> --fuzzEntryPoint, corpus is positional not a --corpus flag, normalize coverage reporter order - fuzz-settings.md: JAZZER_COVERAGE env -> JAZZER_COVERAGE_REPORTERS, remove incorrect -runs=-1 equivalence claim, fix malformed xmlDictionary and timeout snippets - jest-integration.md: close unclosed code blocks, update stale version pin from 2.1.0 to ^3.0.0 - bug-detectors.md: normalize --disable_bug_detectors to --disableBugDetectors, fix missing spaces before 'in CLI mode' - examples/jest_typescript_integration: runner -> testRunner, fix stale import path, fix trailing commas in JSON, fix title typo - architecture.md: mention ESM loader hook alongside CJS hookRequire - Package READMEs: update jazzer.js-commercial links to jazzer.js, document async fuzzing mode in fuzzer README, add ESM loader path to instrumentor README, fix typos
Users on Node < 20.6 get no ESM instrumentation; users on 20.6–20.10 get coverage but no function hooks in ESM. Both cases previously failed silently, making it hard to understand why coverage wasn't increasing. Now a one-time warning is printed to stderr. Also documents why gNextModulePC uses unique values despite PCFlags=0 making the PC field inert in libFuzzer's core feedback loop.
There was a problem hiding this comment.
Pull request overview
This PR updates the Remote Code Execution bug detector to avoid false positives from substring checks by introducing a global canary on globalThis and ensuring it’s also visible inside Jest’s vm.Context sandbox.
Changes:
- Replace source-text substring detection for
eval/Functionwith a global canary-based detection mechanism. - Update before-hooks to provide only fuzzing guidance (
guideTowardsContainment) instead of reporting findings directly. - Expand and rename fuzz targets/tests to cover canary-triggering cases and “safe” (non-triggering) cases, including string-literal and string-coercible bodies.
Reviewed changes
Copilot reviewed 61 out of 63 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/bug-detectors/remote-code-execution/tests.fuzz.js | Renames/reorganizes fuzz test cases to distinguish canary-triggering vs safe scenarios. |
| tests/bug-detectors/remote-code-execution/fuzz.js | Updates fuzz entrypoints to execute code that should (or should not) trigger the canary for eval and Function. |
| tests/bug-detectors/remote-code-execution.test.js | Aligns CLI + Jest integration tests with the new entrypoints and expected outcomes. |
| packages/bug-detectors/internal/remote-code-execution.ts | Implements the global canary, propagates it to Jest vmContext, and changes hooks to guidance-only. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The old detector checked whether the target string appeared in eval/Function source text. This false-positived on safely quoted occurrences (e.g. Handlebars' lookupProperty(d, "jaz_zer")). Install a canary function on globalThis that fires only when attacker-controlled code actually executes. The before-hooks now provide fuzzing guidance only (guideTowardsContainment). The canary is propagated to the Jest vm.Context sandbox, following the same pattern as the prototype-pollution detector.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The old detector checked whether the target string appeared in eval/Function source text. This false-positived on safely quoted occurrences (e.g. Handlebars' lookupProperty(d, "jaz_zer")).
Install a canary function on globalThis that fires only when attacker-controlled code actually executes. The before-hooks now provide fuzzing guidance only (guideTowardsContainment). The canary is propagated to the Jest vm.Context sandbox, following the same pattern as the prototype-pollution detector.