Add zstd compression and Compression Dictionary Transport (CDT) support for static web assets#53836

Draft: javiercn wants to merge 24 commits into main from javiercn/zstd-support

javiercn (Member) commented Apr 13, 2026

Zstd Compression and CDT Support for Static Web Assets

Closes #53855

Summary

This PR adds zstd compression and Compression Dictionary Transport (CDT, RFC 9842) support to the Static Web Assets SDK. See the spec issue for full motivation, scenarios, and detailed design.

Changes overview

Phase 1: Generalized compression framework

  • Refactored ApplyCompressionNegotiation from hardcoded gzip/brotli to a generic EndpointGroup<CompressionGroupState> model
  • All compression formats driven by CompressionFormat MSBuild items with FileExtension, ContentEncoding, and UsesDictionary metadata
  • Group-based quality ranking with fused walks for better performance
  • C# tasks are fully format-agnostic; MSBuild targets handle format-specific dispatch

Phase 2: Zstd compression task

  • New ZstdCompress MSBuild task using the SWA CLI tool with .NET's ZstandardStream API
  • Compression at max quality (22) with optional dictionary support
  • Integrated into publish pipeline alongside gzip and brotli — .zst files produced automatically

Phase 3: Compression Dictionary Transport (CDT)

  • GeneratePublishAssetPack: Creates a ZIP containing the manifest and all assets, keyed by assets/{BasePath}/{RelativePath}
  • ResolveDictionaryCandidates: Matches current assets against a previous pack by non-fingerprinted route patterns, producing dictionary candidates for changed files
  • ZstdCompress with dictionary: Produces .dcz files (zstd with external dictionary) for matched assets
  • ApplyCompressionNegotiation: Emits CDT endpoint metadata per RFC 9842:
    • Use-As-Dictionary response header on all content-negotiated responses that serve the full resource (identity, gzip, br, zstd), but not on dcz endpoints
    • Dictionary-Hash endpoint property with SHA-256 as Structured Fields Byte Sequence (:base64:)
    • Content-Encoding: dcz selector on delta-compressed endpoints
    • Vary: Accept-Encoding, Available-Dictionary for correct caching
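
The Dictionary-Hash value in the list above is a SHA-256 digest written as a Structured Fields Byte Sequence, that is, base64 text wrapped in colons. A minimal Python sketch of that serialization, independent of the SDK's C# code; the function name `dictionary_hash` is hypothetical and only illustrates the encoding:

```python
import base64
import hashlib


def dictionary_hash(dictionary_bytes: bytes) -> str:
    """Return the SHA-256 of the dictionary as a :base64: byte sequence."""
    digest = hashlib.sha256(dictionary_bytes).digest()
    return ":" + base64.b64encode(digest).decode("ascii") + ":"


if __name__ == "__main__":
    # The previous version of an asset acts as the dictionary.
    print(dictionary_hash(b"previous version of app.js"))
```

The hash is computed over the raw (uncompressed) bytes of the previous asset, which is what a client would have stored as the dictionary.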

MSBuild interface

<PropertyGroup>
  <StaticWebAssetDictionaryCompression>true</StaticWebAssetDictionaryCompression>
  <StaticWebAssetPreviousAssetPack>path/to/previous/staticwebassets.pack.zip</StaticWebAssetPreviousAssetPack>
</PropertyGroup>
| Property | Default | Description |
| --- | --- | --- |
| StaticWebAssetDictionaryCompression | false | Enable CDT support and the dcz format |
| StaticWebAssetPreviousAssetPack | (empty) | Path to previous version's asset pack |
| StaticWebAssetPublishAssetPackPath | $(OutputPath)staticwebassets.pack.zip | Output pack path |

Performance

| Scenario | vs Brotli | Notes |
| --- | --- | --- |
| Standalone zstd | +7.9% | Expected: zstd trades size for CDT capability |
| CDT delta (app update) | -21.7% | Clear win for update payloads |
| Best individual file | -98.6% | System.Collections: 8,674 → 118 bytes |
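
The "Best individual file" figure can be checked from the two byte counts quoted for System.Collections. A small Python sketch of that arithmetic; `percent_change` is a hypothetical helper, and negative values mean the new payload is smaller:

```python
def percent_change(before: int, after: int) -> float:
    """Relative size change in percent; negative means smaller."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # 8,674 bytes (Brotli) versus 118 bytes (CDT delta), as quoted above.
    print(f"{percent_change(8674, 118):.1f}%")  # prints -98.6%
```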

Testing

  • Unit tests for all new tasks (ZstdCompress, GeneratePublishAssetPack, ResolveDictionaryCandidates)
  • Unit tests for generalized ApplyCompressionNegotiation with CDT headers and dictionary hash properties
  • Unit tests for ComputeMatchPattern literal preservation in path patterns
  • Integration tests for zstd compression in build and publish pipelines
  • End-to-end CDT integration test: Publish_WithPreviousPack_GeneratesDczForModifiedAssets
  • Baseline tests updated for .zst/.dcz format registration across all app types

Open items

  • Embed dictionary hash in .dcz file for multi-dictionary support
  • MSBuild error when dictionary compression enabled but pack file missing
  • Introduce named types to replace tuples in task internals

@javiercn javiercn force-pushed the javiercn/zstd-support branch from 16cb3ee to 7bdd52d Compare April 13, 2026 19:20
@github-actions github-actions Bot added the Area-AspNetCore RazorSDK, BlazorWebAssemblySDK, StaticWebAssetsSDK label Apr 13, 2026
dotnet-policy-service (Contributor) commented:

Thanks for your PR, @javiercn.
To learn about the PR process and branching schedule of this repo, please take a look at the SDK PR Guide.

Comment on lines +559 to +577
<Target Name="GeneratePublishAssetPack"
        AfterTargets="GenerateStaticWebAssetsPublishManifest"
        Condition="'$(StaticWebAssetProjectMode)' == 'Root'">

  <ItemGroup>
    <_AssetPackPublishAssets Include="@(StaticWebAsset)" Condition="'%(AssetKind)' != 'Build'" />
  </ItemGroup>

  <GeneratePublishAssetPack
      ManifestPath="$(StaticWebAssetPublishManifestPath)"
      Assets="@(_AssetPackPublishAssets)"
      PackOutputPath="$(StaticWebAssetPublishAssetPackPath)">
    <Output TaskParameter="GeneratedPackPath" PropertyName="_GeneratedAssetPackPath" />
  </GeneratePublishAssetPack>

  <ItemGroup>
    <FileWrites Include="$(_GeneratedAssetPackPath)" />
  </ItemGroup>
</Target>

javiercn (Member, Author) commented Apr 15, 2026

We can probably limit this to primary assets only. We only match primary asset endpoints against older primary assets, so we can filter out all the alternative (compressed) representations.

javiercn and others added 14 commits April 18, 2026 10:23
…p-based quality ranking, fused walks

- Make StaticWebAssetEndpointGroup generic with TState, Modified, LinkedGroups
- Non-generic version inherits from StaticWebAssetEndpointGroup<object> for compat
- Rewrite ApplyCompressionNegotiation.Execute() with 4 fused walks:
  Walk 1: Parse endpoints into route groups + endpointsByAsset (single pass)
  Walk 2: Parse + sort assets, backward walk for quality computation
  Walk 3: Process compressed assets across all their endpoints
  Walk 4: Collect results from modified route groups
- Replace 1/(FileLength+1) quality with tiered series (1.0, 0.9...0.1, 0.09...)
- Add CompressionFormats ITaskItem[] for format-priority-aware ranking
- Add 3 new route-collision tests + 5 new negotiation tests + 8 quality theory cases
- Generalize compression tasks to accept ITaskItem[] CompressionFormats from MSBuild
- All 25 ApplyCompressionNegotiation tests pass

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
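
The tiered quality series mentioned in the commit above (1.0, 0.9...0.1, 0.09...) can be sketched as follows. This is a Python illustration, not the task's C# code. It assumes the series continues with nine values per later tier (0.09 to 0.01, then 0.009 to 0.001, and so on); the function name `tiered_quality` is hypothetical.

```python
from decimal import Decimal


def tiered_quality(rank: int) -> Decimal:
    """Quality for the variant at `rank`, where 0 is the most preferred."""
    if rank < 0:
        raise ValueError("rank must be non-negative")
    if rank < 10:
        # First tier: 1.0, 0.9, ..., 0.1
        return Decimal(10 - rank).scaleb(-1)
    # Assumed continuation: nine values per tier (0.09..0.01, 0.009..0.001, ...)
    tier, offset = divmod(rank - 10, 9)
    return Decimal(9 - offset).scaleb(-(tier + 2))


if __name__ == "__main__":
    print([str(tiered_quality(rank)) for rank in range(12)])
```

Unlike 1/(FileLength+1), the values depend only on the rank of a variant within its group, so they stay short and strictly decreasing regardless of file size.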
- Create ZstdCompress.cs ToolTask (mirrors BrotliCompress, uses 'zstd' command)
- Add 'zstd' command to Tool/Program.cs using ZstandardStream + ZstandardCompressionOptions
- Wire zstd into Compression.targets:
  - Add ZstdCompress UsingTask and zstd CompressionFormat item (.zst extension, zstd content-encoding)
  - Add *.zst to compression exclusion patterns
  - Add zstd to PublishCompressionFormats (publish-only, not build)
  - Add ZstdCompressionLevel property (default: 19, max compression)
  - Add _ZstdCompressed* item groups and task invocations in build and publish targets
- Update StaticWebAssetEndpointsIntegrationTest:
  - Update regex patterns and match helpers to include .zst
  - Update endpoint count assertions for zstd (45->63 endpoints in publish)
  - Add zstd endpoint assertions (selectors, response headers, compressed files)
  - Add .zst to VerifyEndpointsCollection compressed file check
- Regenerate baselines for publish manifests and file lists

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ets (Phase 3)

Implement zstd dictionary compression (dcz format) using previous asset
versions as dictionaries per RFC 9842, enabling delta compression that
achieves 10-100x better compression for incremental updates.

New tasks:
- ResolveDictionaryCandidates: extracts previous asset pack, matches
  current assets by RelativePath, reads integrity hashes from manifest
- GeneratePublishAssetPack: creates zip with publish manifest +
  uncompressed original assets for future dictionary use

Extended tasks:
- ResolveCompressedAssets: dcz candidates gated on dictionary availability,
  propagates DictionaryPath/DictionaryHash metadata
- ZstdCompress + tool: -d flag for dictionary path, creates
  ZstandardDictionary when provided
- ApplyCompressionNegotiation: dual selectors (Content-Encoding +
  Available-Dictionary) on dcz endpoints, Use-As-Dictionary header on
  originals, Vary: Available-Dictionary headers

Targets wiring:
- dcz CompressionFormat item (conditional on StaticWebAssetDictionaryCompression)
- ResolveDictionaryCandidates before publish compression configuration
- DictionaryCandidates passed to ResolveCompressedAssets and
  ApplyCompressionNegotiation
- GeneratePublishAssetPack runs after publish manifest generation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
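
The asset pack produced by GeneratePublishAssetPack is a ZIP holding the publish manifest plus the uncompressed assets. A rough Python sketch of that layout, using the assets/{BasePath}/{RelativePath} entry key from the PR description; the manifest entry name, the dictionary-shaped asset records, and the function name are assumptions made for illustration:

```python
import io
import zipfile


def build_asset_pack(manifest_json: str, assets: list[dict]) -> bytes:
    """Build an in-memory ZIP with the manifest and one entry per asset."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as pack:
        # Entry name for the manifest is an assumption, not taken from the PR.
        pack.writestr("manifest.json", manifest_json)
        for asset in assets:
            parts = [asset["base_path"].strip("/"), asset["relative_path"].strip("/")]
            # Skip an empty or "/" base path so keys never contain "//".
            key = "/".join(["assets"] + [part for part in parts if part])
            pack.writestr(key, asset["content"])
    return buffer.getvalue()


if __name__ == "__main__":
    data = build_asset_pack("{}", [
        {"base_path": "/", "relative_path": "app.js", "content": b"console.log(1)"},
        {"base_path": "_content/Lib", "relative_path": "lib.js", "content": b"//lib"},
    ])
    print(zipfile.ZipFile(io.BytesIO(data)).namelist())
```

Including the base path in the key keeps assets from different libraries apart when they share a relative path.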
…erns

- Add ComputeMatchPattern() to StaticWebAssetPathPattern that converts
  fingerprint token expressions to * wildcards for Use-As-Dictionary: match=
- Refactor ResolveDictionaryCandidates to match by endpoint Route (with
  RelativePath fallback), output dictionary-centric items with Identity=
  extracted bytes path, Hash, TargetAsset, MatchPattern metadata
- Update ResolveCompressedAssets to key dictionary lookup by TargetAsset
- Update ApplyCompressionNegotiation to use Hash/MatchPattern from candidates
- Add 5 new route-based matching tests and fingerprint pattern test
- Add ComputeMatchPattern theory tests to StaticWebAssetPathPatternTest
- Update targets to pass CurrentEndpoints to ResolveDictionaryCandidates

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
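
The idea behind ComputeMatchPattern, turning fingerprint token expressions into * wildcards for Use-As-Dictionary: match=, can be sketched in Python. The sketch assumes token expressions have the form #[...{token}...] with an optional ? or ! suffix, keeps literal text inside the brackets, and prefixes the base path with a single leading slash. The function name and the exact grammar are assumptions, not the SDK's implementation.

```python
import re

# Assumed pattern syntax: "#[" literal and {token} parts "]" followed by optional "?" or "!".
TOKEN_EXPRESSION = re.compile(r"#\[([^\]]*)\][?!]?")
TOKEN = re.compile(r"\{[^}]*\}")


def compute_match_pattern(relative_path_pattern: str, base_path: str = "") -> str:
    """Replace each {token} with '*', keeping literals such as the leading dot."""
    def expand(match: re.Match) -> str:
        return TOKEN.sub("*", match.group(1))

    pattern = TOKEN_EXPRESSION.sub(expand, relative_path_pattern).lstrip("/")
    prefix = base_path.strip("/")
    # Join with exactly one slash so a rooted input never yields "//".
    return "/" + (f"{prefix}/{pattern}" if prefix else pattern)


if __name__ == "__main__":
    print(compute_match_pattern("css/app#[.{fingerprint}]?.css"))
    print(compute_match_pattern("/app#[.{fingerprint}]!.js", "/_content/Lib/"))
```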
- Add integration test Publish_WithPreviousPack_GeneratesDczForModifiedAssets
  that verifies the full CDT pipeline: publish, save pack, modify file,
  republish with previous pack, assert dcz endpoints for modified files only
- Skip same-hash dictionary candidates in ResolveDictionaryCandidates
  (when old integrity == new integrity, delta compression is pointless)
- Include dictionary hash in compressed asset path generation to avoid
  filename collisions between regular zstd and dictionary-zstd variants
- Add unit test SkipsAssetWhenIntegrityMatchesCurrent

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
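
The candidate selection described in the commit above, matching by route and skipping assets whose integrity is unchanged, can be sketched as below. This is a simplified Python model with hypothetical dictionary-shaped records, not the ResolveDictionaryCandidates task itself.

```python
def resolve_dictionary_candidates(previous: dict[str, dict], current: dict[str, dict]) -> list[dict]:
    """Pair each changed asset with its previous version, keyed by route."""
    candidates = []
    for route, asset in current.items():
        old = previous.get(route)
        if old is None:
            continue  # New file: there is no previous version to use as a dictionary.
        if old["integrity"] == asset["integrity"]:
            continue  # Unchanged file: delta compression would be pointless.
        candidates.append({
            "route": route,
            "dictionary_path": old["path"],
            "hash": old["integrity"],
            "target_asset": asset["path"],
        })
    return candidates


if __name__ == "__main__":
    v1 = {"/app.js": {"integrity": "aaa", "path": "prev/app.js"},
          "/site.css": {"integrity": "ccc", "path": "prev/site.css"}}
    v2 = {"/app.js": {"integrity": "bbb", "path": "wwwroot/app.js"},
          "/site.css": {"integrity": "ccc", "path": "wwwroot/site.css"},
          "/new.js": {"integrity": "ddd", "path": "wwwroot/new.js"}}
    print(resolve_dictionary_candidates(v1, v2))
```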
Verify the complete set of 9 endpoints for the modified file:
- 5 content-negotiated (identity, gzip, br, zstd, dcz)
- 4 direct-access (.gz, .br, .zst, .dcz)

For each endpoint, assert exact headers, selectors, and properties:
- dcz endpoint: Available-Dictionary selector with v1 hash, Vary: Available-Dictionary
- identity endpoint: Use-As-Dictionary with match=, Vary: Available-Dictionary
- All compressed: Content-Encoding, ETag, Content-Length, Content-Type
- Non-dcz: no dictionary-related headers

Also validates: v1 has 7 endpoints, v2 has 9, Available-Dictionary hash
matches the v1 asset integrity, unchanged files have no dcz endpoints.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Per RFC 9842, the Use-As-Dictionary response header should be on ALL
content-negotiated responses for a resource (identity, gzip, br, zstd),
not just the identity endpoint. The client decompresses the response
before storing the raw body as a dictionary, so it doesn't matter which
encoding delivered the response.

Only dcz endpoints (which serve delta-compressed content, not the full
resource) should NOT get Use-As-Dictionary.

Implementation: store the dictionary match pattern on the route group
state in Walk 3, then apply Use-As-Dictionary and Vary: Available-Dictionary
to all non-dcz endpoints in Walk 4.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
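
The header rule from the commit above can be modeled in a few lines of Python: every endpoint in a route group except the dcz one receives Use-As-Dictionary and has Available-Dictionary added to Vary. The endpoint records and the function name are hypothetical, and the Vary header of the dcz endpoint itself is assumed to be set elsewhere.

```python
def apply_dictionary_headers(endpoints: list[dict], match_pattern: str) -> None:
    """Add Use-As-Dictionary and Vary: Available-Dictionary to non-dcz endpoints."""
    for endpoint in endpoints:
        if endpoint.get("content_encoding") == "dcz":
            continue  # Delta responses are not the full resource, so they are skipped.
        headers = endpoint.setdefault("headers", {})
        headers["Use-As-Dictionary"] = f'match="{match_pattern}"'
        vary = [value.strip() for value in headers.get("Vary", "").split(",") if value.strip()]
        if "Available-Dictionary" not in vary:
            vary.append("Available-Dictionary")
        headers["Vary"] = ", ".join(vary)


if __name__ == "__main__":
    group = [
        {"content_encoding": None, "headers": {}},
        {"content_encoding": "gzip", "headers": {"Vary": "Accept-Encoding"}},
        {"content_encoding": "dcz", "headers": {"Vary": "Accept-Encoding, Available-Dictionary"}},
    ]
    apply_dictionary_headers(group, "/app.*.js")
    for endpoint in group:
        print(endpoint)
```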
- Fix error propagation in zstd/brotli tool (Parallel.For errors now fail the process)
- Add arg count validation and descriptive variable names in Program.cs
- Fix dictionary mtime check in ZstdCompress incremental skip logic
- Use ResolveFile() instead of Identity for file resolution in GeneratePublishAssetPack
- Fix null vs empty array bug in ContentTypeProvider compressed extensions
- Move Modified property from generic EndpointGroup to CompressionGroupState
- Remove unused LinkedGroups from EndpointGroup
- Use CreateEndpointGroups in ApplyCompressionNegotiation
- Remove dead primaryAssetsWithDictVariants HashSet
- Add Root-project guard on CDT target condition
- Replace XML docs with // comments on internal MSBuild tasks
- Update stale comment mentioning only .gz/.br formats
- Fix net472 compatibility (ContainsKey pattern instead of TryAdd)
- Add .zst/.dcz to baseline factory fingerprint regexes and KnownExtensions
- Regenerate all SWA test baselines for zstd/dcz support

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use Path.Combine and Path.GetTempPath() instead of hardcoded C:\prev\wwwroot
for cross-platform test compatibility.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ertions, add cleanup

- Refactor ApplyCompressionNegotiation.Execute into phase helpers:
  BuildEndpointsByAsset, BuildDictionaryLookups, ComputeQualityRankings,
  SortVariantsByEfficiency, ProcessCompressedVariants, UpdateCompressedEndpointHeaders,
  CreateSyntheticEndpoints, CollectModifiedEndpoints
- Strengthen Use-As-Dictionary header assertions to check exact match= value
- Add IDisposable cleanup for temp directories in ResolveDictionaryCandidatesTest
  and GeneratePublishAssetPackTest
- Fix match pattern assertions to include BasePath prefix and leading slash
- Rename test methods for consistency (RouteCollisions_, MultipleCompressedAssets_)
- Fix double-if nesting bugs in BuildFormatPriority, BuildFormatUsesDictionary,
  and ResolveDictionaryCandidates fallback path
- Add ValidateCompressionFormats() input validation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
string.StartsWith(char) is not available on net472. Use the
string overload with StringComparison.Ordinal instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… .zst

- Changed Dictionary-Hash from endpoint selector to endpoint property per
  ASP.NET Core routing contract (ContentEncodingNegotiationMatcherPolicy
  reads it from resource.Properties, not selectors)
- Added .zst to ServiceWorkerAssert.IsCompressedFile() to prevent
  KeyNotFoundException when .zst files appear in publish output
- Excluded .zst files from Publish_CompressesAllFrameworkFiles companion
  check (zst files don't need re-compression to .zst.gz/.zst.br)
- Regenerated BlazorWebAssembly baselines for BackCompatibilityPublish
  and HostedApp_ReferencingNetStandardLibrary tests (new .zst entries)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… visibility narrowing

- Fix format trimming in ResolveCompressedAssets (trim whitespace after Split)
- Add early return when Formats is empty
- Fix tool exit codes: switch from void SetAction + Environment.ExitCode to int-returning
- Add compression level validation (1-22) in zstd command
- Fix ZstdCompress ToolName to use ToolExe/ToolPath from targets
- Fix ComputeMatchPattern to preserve literal parts within token segments
- Include BasePath in GeneratePublishAssetPack entry keys for multi-library apps
- Include BasePath in ResolveDictionaryCandidates pack entry lookup
- Filter Build-only assets in ResolveDictionaryCandidates
- Add duplicate extension validation in DiscoverPrecompressedAssets
- Remove dead CompressedAssets from CompressionGroupState
- Narrow BuildFormatPriority/BuildFormatUsesDictionary to private
- Narrow ContentTypeProvider constructor to internal
- Fix _framework path duplication in StaticWebAssetsBaselineFactory.ToTemplate
- Update tests for new match pattern behavior and BasePath-aware pack keys

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move dcz CompressionFormat definition outside EnableDefaultCompressionFormats
  group so dictionary compression works when defaults are disabled
- Write compression output to temp file and rename on success to prevent
  stale partial files when compression fails mid-stream (both brotli/zstd)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
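
The temp-file-then-rename approach from the commit above is a general pattern for avoiding stale partial output. A minimal Python sketch of the idea, not the tool's C# code; `write_atomically` and its `compress` callback are hypothetical names:

```python
import os
import tempfile


def write_atomically(target_path: str, compress) -> None:
    """Run `compress(file)` against a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Create the temp file next to the target so the rename stays on one volume.
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as temp_file:
            compress(temp_file)
        os.replace(temp_path, target_path)  # Only reached when compression succeeded.
    except BaseException:
        if os.path.exists(temp_path):
            os.remove(temp_path)  # Drop the partial output and keep any old target.
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as output_dir:
        target = os.path.join(output_dir, "app.js.zst")
        write_atomically(target, lambda f: f.write(b"compressed bytes"))
        print(os.listdir(output_dir))
```

If compression fails partway, the target either keeps its previous content or does not exist at all, so an incremental build never mistakes a truncated file for a finished one.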
javiercn and others added 10 commits April 18, 2026 10:23
- Fix double-slash in Use-As-Dictionary match pattern when already rooted
- Add **/*.dcz to CompressionExcludePatterns to prevent recompression
- Add .dcz to ServiceWorkerAssert.IsCompressedFile
- Remove unused GenerateStaticWebAssetsPublishManifestDependsOn_AssetPack
- Use OrdinalIgnoreCase for existingFormats HashSets
- Replace manual Split+Trim with SplitPattern helper

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…lidation

- Replace (string, string, bool) tuple with CompressionFormatInfo readonly struct
- Replace (Endpoint, Asset) tuple with PreviousRouteMatch readonly struct
- Embed old file fingerprint in dcz RelativePath: name.{fingerprint}.dcz
- Propagate OldFileFingerprint metadata from ResolveDictionaryCandidates
- Add MSBuild Error when StaticWebAssetPreviousAssetPack is set but missing
- Update tests to verify fingerprint in dcz path

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…erty

- Rename single-letter option variable in Program.cs (zstdCompressionLevelOption → zstdLevelOption)
- Rename _messages → _logMessages in ResolveDictionaryCandidatesTest
- Remove restating comments in ResolveDictionaryCandidates and ResolveCompressedAssets
- Fix CDT integration test for dcz naming with old file fingerprint
- Fix CDT integration test: Dictionary-Hash is endpoint property, not selector
- Handle dcz direct route Content-Type check (fingerprint changes route format)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… nesting, fix bugs

- Split ResolveDictionaryCandidates.ExecuteCore into focused sub-methods
- Remove RelativePath fallback: matching is route-based only
- Rename ProcessCompressedVariants -> ApplyNegotiationToCompressedVariants
- Flatten nesting in ZstdCompress and DiscoverPrecompressedAssets
- Fix pack-only segment skip in ComputeMatchPattern
- Fix dictionary gating in DiscoverPrecompressedAssets
- Make CompressionFormatInfo and PreviousRouteMatch private
- Remove unused test fields and XML doc comments
- Pin exact match pattern values in CDT integration test
- Update all unit tests to use route-based matching with endpoints
- Fix documentation: selector -> property, rewrite hybrid strategy

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…hen tests

- Remove unused System.IO.Compression.FileSystem reference from Tasks.csproj
- Remove unreachable if (!fileInfo.Exists) branch in GeneratePublishAssetPack
  (ResolveFile() already throws on missing files)
- Remove unused errorMessages capture in DiscoverPrecompressedAssetsTest
- Add .zst assertion in Publish_CompressesAllFrameworkFiles (WasmCompressionTests)
- Fix BaselineFactory.EndsWithPathSegment to be segment-aware
  (prevents false matches like 'my_framework' matching '_framework')
- Strengthen CDT integration test unchanged-file control assertion
  (verify unchanged route still has identity endpoint, not just no dcz)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@javiercn javiercn force-pushed the javiercn/zstd-support branch from 81890b9 to e5b6423 Compare April 18, 2026 08:23
Labels

Area-AspNetCore RazorSDK, BlazorWebAssemblySDK, StaticWebAssetsSDK

Development

Successfully merging this pull request may close these issues.

[StaticWebAssets] Zstd compression and Compression Dictionary Transport (CDT) support