fix: ORDER BY/window partition preserves prior boundaries on NULL & const keys #24259
Open
aunjgr wants to merge 1 commit into matrixorigin:main from
…st keys

Fixes matrixorigin#24248.

partition.Partition is called once per sort key in pkg/sql/colexec/order/order.go::sortAndSend (and per partition spec in window.go), reusing the same diffs slice across calls. Each call must *OR* new boundaries onto diffs and never overwrite an existing true. Two paths violated this:

1. genericPartition / bytesPartition: when both adjacent rows were NULL under the current key, the code did diffs[i] = false, erasing a boundary set by a prior key. This made multi-key ORDER BY over a FULL OUTER JOIN produce rows out of order: with the primary key NULL on the right-padded side, the secondary key was never sub-sorted within the t1.s=NULL partition (e.g. 14 emitted before 13).
2. Const branch: when vec.IsConst() was true, the code returned only [0], and for IsConstNull() it even cleared all of diffs. This collapsed previously found partitions when a const key appeared mid-list.

Fix: in both functions, treat all 'rows are equal under this key' outcomes as a no-op on diffs (the both-NULL case and the entire const branch). Always build partitions from the OR-accumulated diffs at the end.

Tests:
- pkg/partition/partition_test.go gets TestPartitionAccumulatesDiffs covering both fixed-width and bytes paths for: both-NULL preservation, const non-null preservation, const-NULL preservation, and the union-of-boundaries case.
- test/distributed/cases/join/fullouterjoin.sql gets the issue's repro (the 13/14 ordering case) and a NULLS-LAST secondary-key variant.
- mo-tester: 45/45 SUCCESS.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Fixes multi-key ORDER BY / window partition boundary handling when partition.Partition is called repeatedly with a reused diffs slice, ensuring previously-discovered boundaries are preserved for NULL-vs-NULL and const-vector keys (regression #24248).
Changes:
- Update `genericPartition` and `bytesPartition` to never overwrite an existing `diffs[i] == true` with `false` for “both NULL” and const-vector paths, and always rebuild `partitions` from the accumulated `diffs`.
- Add unit coverage to verify the “accumulate diffs across successive Partition calls” contract.
- Add a distributed regression case for FULL OUTER JOIN + multi-key ORDER BY with NULLs.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/partition/partition.go | Preserves prior boundaries in diffs for NULL/const keys and rebuilds partitions from the accumulated diffs. |
| pkg/partition/partition_test.go | Adds regression/unit tests covering successive Partition calls that reuse the same diffs/partitions slices. |
| test/distributed/cases/join/fullouterjoin.sql | Adds an end-to-end FULL OUTER JOIN + ORDER BY repro for #24248 (including a DESC variant). |
| test/distributed/cases/join/fullouterjoin.result | Updates expected output for the new distributed regression queries. |
iamlinjunhong
approved these changes
Apr 30, 2026
heni02
approved these changes
Apr 30, 2026
What type of PR is this?
Which issue(s) this PR fixes:
issue #24248
What this PR does / why we need it:
`partition.Partition` is invoked once per sort key from `pkg/sql/colexec/order/order.go::sortAndSend` (and per partition spec from `pkg/sql/colexec/window/window.go`), reusing the same `diffs` slice across calls. The contract is that each call must OR new boundaries onto `diffs` and never overwrite an existing `true`.

Two code paths in `pkg/partition/partition.go` violated that contract:

1. Both-NULL overwrite. In `genericPartition` / `bytesPartition`, when both adjacent rows were NULL under the current key the code did `diffs[i] = false`, erasing a boundary set by a prior key. This made multi-key `ORDER BY` over a FULL OUTER JOIN produce rows out of order: with the primary key NULL on the right-padded side, the secondary key was never sub-sorted within the `t1.s = NULL` partition. In the repro from the issue, `(NULL,NULL,14,NULL)` was emitted before `(NULL,NULL,13,'c')` prior to this PR. After: 13 before 14, as expected.

2. Const-vector boundary collapse. When `vec.IsConst()` was true, the code returned `partitions = [0]` and, for `IsConstNull()`, even cleared all of `diffs`. A const key appearing mid-list therefore collapsed all previously found partitions in the same `Partition` call series.

Fix
In both `genericPartition` and `bytesPartition`, treat every "rows are equal under this key" outcome as a no-op on `diffs` (the both-NULL case and the entire const branch). Always rebuild `partitions` from the OR-accumulated `diffs` at the end. `diffs[0] = true` is set up front, so the const-vector path now naturally yields `[0]` via the final scan rather than via a special branch.

This is the same semantics the non-const non-null path already had (`diffs[i] = diffs[i] || (v != w)`).

Tests
`pkg/partition/partition_test.go::TestPartitionAccumulatesDiffs` exercises the reuse contract (multiple successive `Partition` calls on the same `diffs`/`partitions` slices) across all four problematic shapes:

- both-NULL preservation
- const non-null preservation
- const-NULL preservation
- union of boundaries
`test/distributed/cases/join/fullouterjoin.sql` gets the issue's exact repro and a `NULLS LAST` (DESC) secondary-key variant.

`mo-tester -m run` on the test file: 45/45 SUCCESS, 0 FAILED.

`pkg/sql/colexec/order/...`, `pkg/sql/colexec/window/...`, and `pkg/partition/...` all pass.

PR Type and Checklists
Standard Checklist:
Checklist for BUG PR Type: