Add dict support for frames and blocks #218

Open
paddor wants to merge 4 commits into PSeitz:main from paddor:frame-dict-support

paddor commented Apr 24, 2026

Completes dictionary support for frames and blocks, and adds primitives that let a loaded dictionary be reused efficiently across multiple compressions.

paddor added 4 commits April 13, 2026 11:28
FrameEncoder::with_dictionary and FrameDecoder::with_dictionary wire the
LZ4 frame Dict_ID feature through to the block layer. Encoder forces
independent block mode and compresses each block against the external
dictionary. Decoder verifies the frame's Dict_ID matches the supplied
dictionary and decompresses blocks with the dict as initial history.
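
As a rough illustration of the Dict_ID check described above, the sketch below reads the optional Dict_ID field from an LZ4 frame header and compares it with the ID of the supplied dictionary. It follows the layout in the public LZ4 frame format description (magic number, FLG, BD, optional content size, optional Dict_ID). The names `check_dict_id` and `DictCheck` are hypothetical and this is not the code in this PR; the sketch also skips the version bits and the header checksum.

```rust
/// LZ4 frame magic number, stored little-endian at the start of a frame.
const LZ4F_MAGIC: u32 = 0x184D_2204;

/// Outcome of comparing a frame's Dict_ID with the caller's dictionary ID.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum DictCheck {
    /// The frame header carries no Dict_ID field.
    NoDictId,
    /// The frame's Dict_ID equals the expected ID.
    Match,
    /// The frame's Dict_ID differs from the expected ID.
    Mismatch { found: u32 },
    /// The header is truncated or does not start with the LZ4 magic number.
    Malformed,
}

/// Inspect the start of an LZ4 frame and compare its Dict_ID with `expected`.
fn check_dict_id(frame: &[u8], expected: u32) -> DictCheck {
    // Need at least: 4 bytes magic + FLG + BD.
    if frame.len() < 6 {
        return DictCheck::Malformed;
    }
    let magic = u32::from_le_bytes([frame[0], frame[1], frame[2], frame[3]]);
    if magic != LZ4F_MAGIC {
        return DictCheck::Malformed;
    }

    let flg = frame[4];
    let has_content_size = flg & 0b0000_1000 != 0; // FLG bit 3
    let has_dict_id = flg & 0b0000_0001 != 0; // FLG bit 0
    if !has_dict_id {
        return DictCheck::NoDictId;
    }

    // Dict_ID follows FLG, BD and the optional 8-byte content size.
    let offset = 6 + if has_content_size { 8 } else { 0 };
    match frame.get(offset..offset + 4) {
        Some(b) => {
            let found = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
            if found == expected {
                DictCheck::Match
            } else {
                DictCheck::Mismatch { found }
            }
        }
        None => DictCheck::Malformed,
    }
}
```

A decoder built along these lines would refuse to decompress on `Mismatch`; what it should do on `NoDictId` is a policy decision that this sketch leaves open.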

Adds the primitives rlz4's RLZ4::BlockCodec needs to amortise dictionary
initialisation across many compress calls:

- `CompressTable::load_dict(dict)` — clear + hash dict positions into the
  table (same work `compress_into_with_table_and_dict` does today, but
  extracted so callers can do it once).
- `CompressTable::copy_from(other)` — memcpy another table's entries
  into this one. Reuses the existing allocation; no heap traffic.
  Delegates to new `HashTable4K::copy_from` / `HashTable4KU16::copy_from`
  methods.
- `compress_into_with_loaded_table_and_dict(input, output, table, dict)`
  — variant of `compress_into_with_table_and_dict` that skips the clear
  and the init_dict pass, trusting the caller to have populated the
  table via `load_dict` and restored it via `copy_from` before each call.

The typical per-call sequence for a dict-compressing codec becomes:

    scratch_table.copy_from(&pristine_table);  // ~30-100 ns memcpy
    compress_into_with_loaded_table_and_dict(input, output, &mut scratch_table, dict);

vs. the ~3-5 µs `init_dict` loop on every call with the original API.
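
The copy-then-compress pattern above can be illustrated with a toy model. The sketch below assumes a simplified hash table, `ToyTable`, which is a hypothetical stand-in and not lz4_flex's `CompressTable`; its hash function, slot encoding, and the `record` helper are invented for illustration. It shows only the idea: hash the dictionary once into a pristine table, then restore a scratch table from it with a plain slice copy before each compression.

```rust
/// Toy stand-in for a compression hash table (hypothetical; NOT lz4_flex's type).
struct ToyTable {
    /// 4096 slots; 0 means "empty", otherwise position + 1.
    slots: Vec<u32>,
}

impl ToyTable {
    fn new() -> Self {
        ToyTable { slots: vec![0; 4096] }
    }

    /// Map a 4-byte sequence to one of 4096 slots (multiplicative hash, top 12 bits).
    fn hash(seq: u32) -> usize {
        (seq.wrapping_mul(2_654_435_761) >> 20) as usize
    }

    /// Clear the table, then hash every 4-byte window of the dictionary into it.
    /// This is the comparatively expensive step that should run only once.
    fn load_dict(&mut self, dict: &[u8]) {
        self.slots.fill(0);
        for (pos, w) in dict.windows(4).enumerate() {
            let seq = u32::from_le_bytes([w[0], w[1], w[2], w[3]]);
            self.slots[Self::hash(seq)] = pos as u32 + 1;
        }
    }

    /// Overwrite this table's entries with another table's entries.
    /// Reuses the existing allocation; it is a single slice copy.
    fn copy_from(&mut self, other: &ToyTable) {
        self.slots.copy_from_slice(&other.slots);
    }

    /// Stand-in for what a compressor does to the table while it runs:
    /// record that `seq` was seen at `pos`.
    fn record(&mut self, seq: u32, pos: u32) {
        self.slots[Self::hash(seq)] = pos + 1;
    }
}
```

In this model a codec would keep one pristine table per dictionary and one scratch table per worker: `scratch.copy_from(&pristine)` before each call undoes whatever the previous compression wrote, without rehashing the dictionary.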

No change to existing public functions.

Covers the primitives added in 0d69f8f:

- `test_loaded_table_round_trip` — load_dict → copy_from →
  compress_into_with_loaded_table_and_dict → decompress_into_with_dict.
- `test_loaded_table_matches_with_dict` — byte-for-byte equivalence
  with the clear+init-dict variant `compress_into_with_table_and_dict`.
- `test_copy_from_variant_mismatch_panics` — Small vs Large copy_from
  panics as documented.
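
For readers unfamiliar with the variant-mismatch case, here is a minimal sketch of how a two-variant table might delegate `copy_from` and panic when the variants differ. `ToyCompressTable` and its constructors are hypothetical; the element types and sizes are assumptions and do not reflect the actual `HashTable4K` / `HashTable4KU16` layouts.

```rust
/// Hypothetical two-variant table: a compact one with 16-bit entries
/// and a larger one with 32-bit entries.
enum ToyCompressTable {
    Small(Vec<u16>),
    Large(Vec<u32>),
}

impl ToyCompressTable {
    fn small() -> Self {
        ToyCompressTable::Small(vec![0; 4096])
    }

    fn large() -> Self {
        ToyCompressTable::Large(vec![0; 4096])
    }

    /// Copy `other`'s entries into `self`, reusing `self`'s allocation.
    ///
    /// # Panics
    /// Panics if `self` and `other` are not the same variant.
    fn copy_from(&mut self, other: &ToyCompressTable) {
        match (self, other) {
            (ToyCompressTable::Small(dst), ToyCompressTable::Small(src)) => {
                dst.copy_from_slice(src)
            }
            (ToyCompressTable::Large(dst), ToyCompressTable::Large(src)) => {
                dst.copy_from_slice(src)
            }
            _ => panic!("copy_from: table variants do not match"),
        }
    }
}
```

Panicking on a mismatch keeps the hot path free of error handling; a caller that builds both tables for the same input size class should never hit it.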