Skip to content

feat(bug-detectors): replace RCE substring check with global canary#873

Open
oetr wants to merge 19 commits intomainfrom
improve-RCE
Open

feat(bug-detectors): replace RCE substring check with global canary#873
oetr wants to merge 19 commits intomainfrom
improve-RCE

Conversation

@oetr
Copy link
Copy Markdown
Contributor

@oetr oetr commented Apr 1, 2026

The old detector checked whether the target string appeared in eval/Function source text. This false-positived on safely quoted occurrences (e.g. Handlebars' lookupProperty(d, "jaz_zer")).

Install a canary function on globalThis that fires only when attacker-controlled code actually executes. The before-hooks now provide fuzzing guidance only (guideTowardsContainment). The canary is propagated to the Jest vm.Context sandbox, following the same pattern as the prototype-pollution detector.

oetr added 18 commits March 30, 2026 18:39
AGENTS.md uses multiple top-level headings and unlabeled code fences
by design.  Exclude it from markdownlint rather than restructuring
the document.
yargs v18 is ESM-only and requires require()-of-ESM support, which
only shipped unflagged in Node 20.19 / 22.12.  Since the jazzer CLI
entry point must remain loadable via require() on Node >= 20.6, pin
yargs to the v17 line.
toThrowError -> toThrow, toBeCalledTimes -> toHaveBeenCalledTimes
Move the 'dist' pattern from modulePathIgnorePatterns (which blocks
module resolution) to testPathIgnorePatterns (which only affects test
discovery).  Disable ts-jest diagnostics since it does not support
composite project references — tsc -b is the source of truth for
type checking.
When a before-hook throws inside an async function body, the async
function returns a rejected Promise AND the finding is stored
synchronously.  Previously, both paths would resolve the C++ promise
and deferred (via the synchronous throw's catch block and the
.then() rejection handler's microtask), which is undefined behavior
that can hang forked child processes.

Check clearFirstFinding() before attaching .then() handlers.  If a
synchronous finding exists, suppress the rejected Promise and handle
the finding exclusively through the synchronous path.
Some libraries (e.g. pdf.js) use prototype-based error constructors
where .stack is inherited from a prototype Error, showing a stale
header like "Error" instead of the actual .name and .message.

Detect this mismatch and replace the first line of .stack with the
correct name: message before printing.  Only applies to non-Finding
errors, since cleanErrorStack already rewrites Finding headers.
Each ESM module will need its own coverage counter buffer, independent
of the shared global coverage map used by CJS modules.  libFuzzer
supports multiple disjoint counter regions, so we add a C++ function
that registers per-module buffers and a TypeScript API to allocate them.

The GC-prevention array in CoverageTracker keeps module buffers alive
for the process lifetime, matching libFuzzer's expectation that counter
memory remains valid.
Add an ESM loader that intercepts import() and static imports on
Node >= 20.6, applying Babel coverage and compare-hook transforms.
Each module gets its own counter buffer via a preamble that runs on
the main thread.

The shared coverage visitor is extracted from codeCoverage.ts so
both the CJS path (global IDs via incrementCounter) and the new ESM
path (module-local IDs via direct array writes) reuse the same
branch/loop/ternary instrumentation logic.

The ESM counter uses % 255 + 1 for NeverZero instead of || 1 to
avoid infinite Babel visitor recursion on the generated expression.

Istanbul source coverage is included from the start so that ESM
modules appear in --coverage reports alongside CJS ones.

On older Node versions the loader is silently skipped.
The jest-runner instruments code through Jest's own transformer pipeline
and monkey-patches built-ins at runtime — it never uses the ESM loader
hooks.  Extracted registerEsmLoaderHooks from registerInstrumentor,
called only from startFuzzing (the CLI entry point).  hookRequire stays
in registerInstrumentor because custom hooks loaded via import() in
initFuzzing still need coverage instrumentation through require.extensions.
The ESM loader now generates separate source map objects (instead of
inline ones) and registers them with the main-thread SourceMapRegistry
via a preamble call.  This lets source-map-support remap ESM stack
traces to original source positions.

The preamble line-shift is handled by prepending VLQ semicolons to
the mappings string — each semicolon represents an unmapped generated
line, pushing all real mappings down by the correct offset.
Establish a MessagePort channel between the main thread and the ESM
loader thread.  After bug detectors register their hooks and
finalizeHooks() completes, sendHooksToLoader() serializes the hook
metadata and posts it to the loader.

The loader uses receiveMessageOnPort() — a synchronous, non-blocking
read — to drain the message queue at the start of each module load.
Received hooks are registered as stubs (with no-op functions) in the
loader's own hookManager, which the functionHooks Babel plugin reads
from.  At runtime, the instrumented code calls HookManager.callHook()
on the main-thread global, where the real hook functions live.

Explicit hook IDs in the serialization format guard against index
mismatches between threads.  The MessagePort requires Node >= 20.11
(transferList support in module.register); older 20.x builds degrade
to ESM instrumentation without function hooks.
Replace crypto.randomInt(512) in fakePC() with a deterministic
xorshift32 PRNG seeded from the libFuzzer -seed= value.  This makes
TORC slot assignments identical across runs, so a given seed produces
the exact same mutation schedule every time.

If no seed is provided, one is generated, injected into the libFuzzer
args, and printed — giving full reproducibility from a single seed.

The seed flows from the CLI through the Instrumentor to both the CJS
transform path (main thread) and the ESM loader (via module.register
initialization data).
Replace the outdated "pure ESM projects will not be instrumented"
warning with real documentation: how the loader hook works, the
minimal setup (export function fuzz + type: module), Node >= 20.6
requirement, and the function hooks restriction (>= 20.11).

Add a comparison table between direct ESM (npx jazzer) and
Jest-based ESM (Babel transform approach) so users can pick the
right path for their project.
…riptions

- fuzz-targets.md: --fuzzFunction -> --fuzzEntryPoint, corpus is
  positional not a --corpus flag, normalize coverage reporter order
- fuzz-settings.md: JAZZER_COVERAGE env -> JAZZER_COVERAGE_REPORTERS,
  remove incorrect -runs=-1 equivalence claim, fix malformed
  xmlDictionary and timeout snippets
- jest-integration.md: close unclosed code blocks, update stale
  version pin from 2.1.0 to ^3.0.0
- bug-detectors.md: normalize --disable_bug_detectors to
  --disableBugDetectors, fix missing spaces before 'in CLI mode'
- examples/jest_typescript_integration: runner -> testRunner, fix
  stale import path, fix trailing commas in JSON, fix title typo
- architecture.md: mention ESM loader hook alongside CJS hookRequire
- Package READMEs: update jazzer.js-commercial links to jazzer.js,
  document async fuzzing mode in fuzzer README, add ESM loader path
  to instrumentor README, fix typos
Users on Node < 20.6 get no ESM instrumentation; users on 20.6–20.10
get coverage but no function hooks in ESM. Both cases previously
failed silently, making it hard to understand why coverage wasn't
increasing. Now a one-time warning is printed to stderr.

Also documents why gNextModulePC uses unique values despite PCFlags=0
making the PC field inert in libFuzzer's core feedback loop.
@oetr oetr marked this pull request as ready for review April 1, 2026 11:48
Copilot AI review requested due to automatic review settings April 1, 2026 11:48
@oetr oetr marked this pull request as draft April 1, 2026 11:51
@oetr oetr changed the base branch from esm-support to main April 1, 2026 11:52
@oetr oetr marked this pull request as ready for review April 1, 2026 11:52
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the Remote Code Execution bug detector to avoid false positives from substring checks by introducing a global canary on globalThis and ensuring it’s also visible inside Jest’s vm.Context sandbox.

Changes:

  • Replace source-text substring detection for eval/Function with a global canary-based detection mechanism.
  • Update before-hooks to provide only fuzzing guidance (guideTowardsContainment) instead of reporting findings directly.
  • Expand and rename fuzz targets/tests to cover canary-triggering cases and “safe” (non-triggering) cases, including string-literal and string-coercible bodies.

Reviewed changes

Copilot reviewed 61 out of 63 changed files in this pull request and generated 2 comments.

File Description
tests/bug-detectors/remote-code-execution/tests.fuzz.js Renames/reorganizes fuzz test cases to distinguish canary-triggering vs safe scenarios.
tests/bug-detectors/remote-code-execution/fuzz.js Updates fuzz entrypoints to execute code that should (or should not) trigger the canary for eval and Function.
tests/bug-detectors/remote-code-execution.test.js Aligns CLI + Jest integration tests with the new entrypoints and expected outcomes.
packages/bug-detectors/internal/remote-code-execution.ts Implements the global canary, propagates it to Jest vmContext, and changes hooks to guidance-only.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

The old detector checked whether the target string appeared in
eval/Function source text. This false-positived on safely quoted
occurrences (e.g. Handlebars' lookupProperty(d, "jaz_zer")).

Install a canary function on globalThis that fires only when
attacker-controlled code actually executes. The before-hooks now
provide fuzzing guidance only (guideTowardsContainment). The canary
is propagated to the Jest vm.Context sandbox, following the same
pattern as the prototype-pollution detector.
@oetr oetr requested a review from kyakdan April 1, 2026 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants