Tighten canonical-repo match in queryTargetsDependingOnModules#338
Open
rdark wants to merge 2 commits intoTinder:masterfrom
Open
Tighten canonical-repo match in queryTargetsDependingOnModules#338rdark wants to merge 2 commits intoTinder:masterfrom
rdark wants to merge 2 commits intoTinder:masterfrom
Conversation
On MODULE.bazel-touching commits the impacted-targets path fans out one `bazel query rdeps(//..., @@<repo>//...)` subprocess per canonical repo whose label contains the changed module's name as a substring. A module named "cpp" pulled in every `abseil-cpp+//...` label; in workspaces with many bzlmod module extensions the fan-out reached thousands of serial subprocesses per `get-impacted-targets` call. Replace the `contains` scan with a two-tier predicate that matches on the parsed canonical repo name: 1. Tier A: consult `bazel mod dump_repo_mapping ""` (already implemented in BazelQueryService.discoverRepoMapping; made public for reuse). If the mapping resolves `module.apparentName` to canonical C, any repo equal to C or starting with "C+"/"C~" belongs to that module. 2. Tier B: prefix-match `module.name + "+"` / `module.name + "~"` for transitive modules that root's repo mapping does not include and when discoverRepoMapping fails. Extension-created forms (`name++ext+repo`, `name~~ext~repo`) are subsumed by these prefixes. Canonical repos to query are collected across all changed modules before running rdeps, which avoids duplicate queries when findChangedModules reports a version bump as both an add and a remove (two Module objects sharing the same canonical). Also fix a pre-existing bug in the same function: the trailing `computeSimpleImpactedTargets(emptyMap(), allTargets)` call returned `allTargets.keys`, silently unioning every target back into the rdeps result. Thread `from` through the call chain so the union uses the real hash diff. Without this, the rdeps filtering added here would be cost-only — the impacted set would still equal the full workspace. Tests: new CalculateImpactedTargetsInteractorModuleQueryTest covers canonical-plus, canonical-tilde, extension-created-via-mapping, extension-created-via-name-fallback, the "cpp must not match abseil-cpp" regression, transitive-only-via-name-fallback, ghost-module (Tier C skip), multiple-disjoint-modules, discoverRepoMapping failure, and the executeWithDistances module-query path. Output assertions guard against regression to the `emptyMap()` union bug.
2ceb1c8 to
a61697d
Compare
The per-repo rdeps query-failure fallback filtered out every `@@` label on the assumption that the hash set was mostly workspace-local `//...` targets. That assumption does not hold on bzlmod-only workspaces where `generate-hashes` emits almost entirely `@@canonical` labels plus a small number of `//external:<apparent>` bzlmod-synthetic bridges, with zero workspace-local targets. On such workspaces, a single failed `rdeps(//..., @@<repo>//...)` query (for example, one caused by an unrelated loading error elsewhere in the dep graph) left the filter with only the `//external:*` bridges, which the downstream `excludeExternalTargets=true` default (Tinder#334) then strips entirely — producing silently-empty impacted output on what should be a "conservatively rebuild everything" signal. Replace the unconditional `!startsWith("@@")` filter with a shape-aware fallback: - Compute a "buildable workspace" set that excludes BOTH `@@canonical` transitives AND `//external:<apparent>` bridges. - If that set is non-empty (mixed WORKSPACE + bzlmod, or WORKSPACE-only shapes), emit it — preserves pre-existing granularity, avoids leaking tens of thousands of `@@` transitives into the impacted set on a single flaky `bazel query`. - If it is empty (bzlmod-only shape), fall through to `allTargets.keys` so the downstream filter has something to keep — rebuild-everything signal rather than a directly-buildable label list. Adds a mixed-shape regression test asserting that `@@` transitives and `//external:*` bridges do NOT leak into the fallback when buildable workspace labels are present, complementing the existing bzlmod-only test. An unconditional `allTargets.keys` fallback fails the new test; the pre-commit `!startsWith("@@")` filter fails the existing bzlmod-only one. Only the shape-aware fallback passes both.
e031d63 to
1f737d4
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Partial fix for #335
On MODULE.bazel-touching commits the impacted-targets path fans out one
bazel query rdeps(//..., @@<repo>//...)subprocess per canonical repo whose label contains the changed module's name as a substring. A module named "cpp" pulled in everyabseil-cpp+//...label; in workspaces with many bzlmod module extensions the fan-out reached thousands of serial subprocesses perget-impacted-targetscall.Replace the
containsscan with a two-tier predicate that matches on the parsed canonical repo name:bazel mod dump_repo_mapping ""(already implemented in BazelQueryService.discoverRepoMapping; made public for reuse). If the mapping resolvesmodule.apparentNameto canonical C, any repo equal to C or starting with "C+"/"C~" belongs to that module.module.name + "+"/module.name + "~"for transitive modules that root's repo mapping does not include and when discoverRepoMapping fails. Extension-created forms (name++ext+repo,name~~ext~repo) are subsumed by these prefixes.Canonical repos to query are collected across all changed modules before running rdeps, which avoids duplicate queries when findChangedModules reports a version bump as both an add and a remove (two Module objects sharing the same canonical).
Also fix a pre-existing bug in the same function: the trailing
computeSimpleImpactedTargets(emptyMap(), allTargets)call returnedallTargets.keys, silently unioning every target back into the rdeps result. Threadfromthrough the call chain so the union uses the real hash diff. Without this, the rdeps filtering added here would be cost-only — the impacted set would still equal the full workspace.Tests: new CalculateImpactedTargetsInteractorModuleQueryTest covers canonical-plus, canonical-tilde, extension-created-via-mapping, extension-created-via-name-fallback, the "cpp must not match abseil-cpp" regression, transitive-only-via-name-fallback, ghost-module (Tier C skip), multiple-disjoint-modules, discoverRepoMapping failure, and the executeWithDistances module-query path. Output assertions guard against regression to the
emptyMap()union bug.