fix(parquet): align dictionary fallback with parquet-mr#786

Merged
zeroshade merged 5 commits into apache:main from twuebi:tp/parquet-dict-fallback-parity
Apr 30, 2026
Contributor

@twuebi twuebi commented Apr 28, 2026

Rationale for this change

On dictionary overflow, arrow-go always flushed the dictionary page and any buffered dict-encoded data pages before switching to PLAIN, even when no dict-encoded data page had been cut. On mid-cardinality columns the result was a 4-encoding chunk layout (PLAIN_DICTIONARY, PLAIN, RLE, PLAIN) that bloated output by 20-30% versus parquet-mr.

This was noticed while testing iceberg-go's recently added compaction feature, where some tables with certain high-cardinality columns grew by about 30% after compaction.

What changes are included in this PR?

Mirror parquet-mr's FallbackValuesWriter:

  • Discard the dictionary and re-encode buffered indices as PLAIN when no dict-encoded data page has been flushed yet; only emit the dictionary page once a dict-encoded page is committed.
  • Before the first dict-encoded page, fall back to PLAIN if dict + indices >= raw input bytes.
  • Size dict-encoded pages by raw input bytes (not the RLE indices' encoded size) so the page cadence matches PLAIN.

Adds DictEncoder.FallBackTo / ObservedRawSize and exposes BinaryMemoTable.Value for the fallback translation.
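
For readers unfamiliar with the fallback rules listed above, here is a minimal sketch of the three decisions in isolation. It is not the arrow-go implementation: `writerState`, `onDictOverflow`, `shouldFallBackEarly`, and `pageIsFull` are hypothetical names chosen for illustration, and the sizes are assumed to be non-negative byte counts.

```go
package main

import "fmt"

// writerState is a hypothetical, simplified view of a dictionary-encoding
// column writer. None of these names come from arrow-go.
type writerState struct {
	dictPagesFlushed int   // dict-encoded data pages already committed
	rawBytes         int64 // raw (PLAIN-equivalent) size of the buffered input
	dictBytes        int64 // encoded size of the dictionary
	indexBytes       int64 // encoded size of the buffered indices
}

// overflowAction describes what to do once the dictionary exceeds its limit.
type overflowAction string

const (
	// No dict-encoded page exists yet, so the dictionary can be dropped and
	// the buffered indices re-encoded as PLAIN.
	discardAndReencode overflowAction = "discard dictionary, re-encode as PLAIN"
	// A dict-encoded page was already committed, so the dictionary page must
	// be emitted before continuing with PLAIN.
	emitDictThenPlain overflowAction = "emit dictionary page, continue as PLAIN"
)

// onDictOverflow models the first rule: the dictionary page is only emitted
// once at least one dict-encoded data page has been committed.
func onDictOverflow(s writerState) overflowAction {
	if s.dictPagesFlushed == 0 {
		return discardAndReencode
	}
	return emitDictThenPlain
}

// shouldFallBackEarly models the second rule: before the first dict-encoded
// page, fall back to PLAIN if dict + indices >= raw input bytes.
// The rawBytes == 0 guard is an assumption of this sketch.
func shouldFallBackEarly(s writerState) bool {
	if s.dictPagesFlushed > 0 || s.rawBytes == 0 {
		return false
	}
	return s.dictBytes+s.indexBytes >= s.rawBytes
}

// pageIsFull models the third rule: a dict-encoded page is sized by the raw
// input bytes it represents, not by the encoded size of its indices.
func pageIsFull(s writerState, pageSizeLimit int64) bool {
	return s.rawBytes >= pageSizeLimit
}

func main() {
	s := writerState{rawBytes: 1000, dictBytes: 900, indexBytes: 200}
	fmt.Println(onDictOverflow(s))
	fmt.Println(shouldFallBackEarly(s))
	fmt.Println(pageIsFull(s, 1<<20))
}
```

Keeping the three checks as separate pure functions is only a presentational choice for this sketch; the PR wires the equivalent logic into the column writer and the dictionary encoder.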

Are these changes tested?

Yes, by tests included in this PR and also end-to-end while testing compaction in iceberg-go.

Are there any user-facing changes?

No public API changes; the only observable difference should be that the double encoding is gone.

@twuebi twuebi requested a review from zeroshade as a code owner April 28, 2026 13:18
Comment thread parquet/file/column_writer.go Outdated
rawSize := dictEnc.ObservedRawSize()
encodedSize := dictEnc.EstimatedDataEncodedSize()
dictSize := int64(dictEnc.DictEncodedSize())
if rawSize > 0 && dictSize+encodedSize >= rawSize {
Member

do we actually need the rawSize > 0 check?

Comment on lines -434 to -435
// To keep pages in consistent state,
// remove the pages that will be released using above defer call.
Member

why remove the comment?

Comment thread parquet/file/column_writer.go Outdated
Comment on lines +545 to +547
if err == nil {
w.dictPageWritten = true
}
Member

Suggested change
- if err == nil {
- 	w.dictPageWritten = true
- }
+ w.dictPageWritten = err == nil

Comment thread parquet/file/column_writer.go Outdated
Comment on lines +144 to +147
// fallbackFn is set by each typed column writer at construction to its
// own FallbackToPlain. It lets the base FlushCurrentPage trigger
// fallback without needing to know the concrete value type.
fallbackFn func()
Member

FallbackToPlain is already part of the ColumnChunkWriter interface, could we just modify logic in checkDictionarySizeLimit etc. instead of needing to pass the function callback like this?

Comment thread parquet/internal/encoding/memo_table.go Outdated
Comment on lines +314 to +316
func (m *binaryMemoTableImpl) Value(i int) []byte {
return m.builder.Value(i)
}
Member

this is the legacy map-based implementation. Luckily this function already exists in internal/hashing/xxh3_memo_table.go for the binary memo table that is actually being used.

@zeroshade zeroshade merged commit 2b2aa6b into apache:main Apr 30, 2026
23 checks passed
2 participants