
Record the source URI of imported images at /.enroot/source #267

Open
alec-flowers wants to merge 1 commit into NVIDIA:main from alec-flowers:feat/record-import-source

alec-flowers commented Apr 17, 2026

Summary

enroot import docker://... and enroot load now drop a tiny provenance
file inside the imported image at /.enroot/source:

uri=docker://nvcr.io#nvidia/pytorch:25.06-py3
digest=sha256:<manifest-digest>

This answers the recurring question "where did this .sqsh come from?"
— a question that's currently unanswerable once an imported image has
been copied between hosts or handed off to someone else, since enroot
keeps no trace of the source URI in the image itself.
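
As a rough illustration of consuming the two-line format shown above, the sketch below pulls a single field out of a key=value stream. The function name source_field is hypothetical and not part of enroot or this PR. It reads the record line by line rather than sourcing it, so nothing in the file is ever executed.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of enroot): print the value of one key
# from a key=value stream on stdin, e.g. the contents of /.enroot/source.
# Returns 1 when the key is absent.
source_field() {
    local key="$1" k v
    # '|| [[ -n "${k}" ]]' also handles a final line without a trailing newline.
    while IFS='=' read -r k v || [[ -n "${k}" ]]; do
        if [[ "${k}" == "${key}" ]]; then
            printf '%s\n' "${v}"
            return 0
        fi
    done
    return 1
}

# Example usage (assumes unsquashfs is installed and image.sqsh exists):
#   unsquashfs -cat image.sqsh .enroot/source | source_field uri
#   unsquashfs -cat image.sqsh .enroot/source | source_field digest
```

A missing digest= line (as for daemon imports) shows up as a non-zero return status, which a caller can test directly.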

Motivation

For reproducibility work around benchmark pipelines, we want to be able
to look at a squashfs on disk and know which registry image produced it.
enroot digest (added in dba4f81) helps before import, but once
imported there's no link back to the source. External sidecar metadata
works until the file gets renamed or moved.

This puts the provenance inside the image itself, so it travels with
the .sqsh.

Design notes

  • Path /.enroot/source: uses the existing enroot-owned namespace
    (runtime.sh:25 bundle_dir="/.enroot"), which is already bind-mounted
    read-only at runtime so a container can't tamper with its own
    provenance record. No collision risk with guest OS files in /etc/.
  • File format: key=value lines, shell-parseable, trivial to read
    with unsquashfs -cat image.sqsh .enroot/source or from inside a
    running container.
  • Credential stripping: USER@ is removed using the user already
    parsed by docker::_parse_uri, so the IMAGE@DIGEST shorthand
    (e.g. docker://ubuntu@sha256:...) isn't mis-handled.
  • Manifest digest: fetched via a lightweight HEAD on the manifest
    URL, same pattern as docker::digest. The digest= key is omitted
    for dockerd:// and podman:// since there is no registry manifest.
  • enroot export intentionally strips /.enroot/, which is correct
    here: a rootfs that was created, modified, and re-exported is no
    longer the image at the original URI, so dropping the field avoids
    lying. No changes to runtime.sh needed.
  • No new CLI: unsquashfs -cat is sufficient; if demand warrants,
    a wrapper command can be added later.
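
The credential-stripping and file-format notes above can be sketched roughly as follows. This is not the code in the PR: sanitize_uri and write_source_record are hypothetical names, and where the PR reuses the user already parsed by docker::_parse_uri, this sketch swaps in a simpler heuristic (an '@' followed by something shaped like ALGO:HEX with at least 32 hex digits is treated as IMAGE@DIGEST, anything else as USER@).

```bash
#!/usr/bin/env bash
# Hypothetical sketch, NOT the PR's docker::_sanitize_uri.
# Strips a leading USER@ from docker:// URIs while leaving the
# IMAGE@DIGEST shorthand intact. Other schemes pass through unchanged.
sanitize_uri() {
    local uri="$1" rest tail
    local digest_re='^[a-z0-9]+:[0-9a-f]{32,}$'   # assumed digest shape

    if [[ "${uri}" != docker://* ]]; then
        printf '%s\n' "${uri}"                    # dockerd://, podman://, ...
        return 0
    fi
    rest="${uri#docker://}"
    if [[ "${rest}" == *@* ]]; then
        tail="${rest#*@}"                         # text after the first '@'
        # If the tail is a bare digest, the '@' belongs to IMAGE@DIGEST.
        if [[ ! "${tail}" =~ ${digest_re} ]]; then
            rest="${tail}"                        # drop USER@
        fi
    fi
    printf 'docker://%s\n' "${rest}"
}

# Hypothetical sketch of writing the record into an unpacked rootfs.
#   $1: rootfs directory   $2: URI as given   $3: manifest digest (optional)
write_source_record() {
    local rootfs="$1" uri="$2" digest="${3-}"
    mkdir -p "${rootfs}/.enroot"
    {
        printf 'uri=%s\n' "$(sanitize_uri "${uri}")"
        # Omit the key entirely when no registry digest is available.
        if [[ -n "${digest}" ]]; then
            printf 'digest=%s\n' "${digest}"
        fi
    } > "${rootfs}/.enroot/source"
}
```

The heuristic is only there to keep the sketch self-contained. Reusing the already-parsed user, as the PR describes, avoids guessing what a digest looks like.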

Diff size

50 insertions / 12 deletions across src/docker.sh and
doc/image-format.md.

Test plan

  • bash -n src/docker.sh
  • Unit-tested docker::_sanitize_uri across the URI shapes
    accepted by docker::_parse_uri (with and without USER@,
    with and without @DIGEST, IMAGE@DIGEST shorthand,
    dockerd://, podman://).
  • enroot import docker://ubuntu:22.04 →
    unsquashfs -cat ubuntu+22.04.sqsh .enroot/source shows URI
    and digest.
  • enroot import docker://user@registry/image:tag with credentials
    in .credentials → recorded URI has no user@.
  • enroot import dockerd://local-image:tag → URI-only record,
    no digest= line.

End-to-end validation against enroot 4.1.2 on x86_64 — see comment
below for the verbatim output.

Happy to iterate on the file location, key names, or scope (e.g. add
arch=, imported_at=) if preferred.

enroot import docker://... and enroot load now write a small provenance
file inside the image rootfs recording the URI and manifest digest. The
URI is captured as provided to enroot, with any USER@ credential
component stripped. dockerd:// and podman:// imports record the URI
only (no registry digest available).

The file can be read with unsquashfs -cat image.sqsh .enroot/source
or, once the image is unpacked, from inside a running container.

No new CLI, no runtime.sh changes: enroot export already strips
/.enroot/ which is correct behavior here, since a rootfs modified and
re-exported is no longer the image at the original URI.

Signed-off-by: Alec Flowers <aflowers@nvidia.com>

alec-flowers commented Apr 17, 2026

End-to-end test output

Tested against enroot 4.1.2 (enroot_4.1.2-1_amd64.deb) on x86_64,
with src/docker.sh, src/common.sh, src/runtime.sh, and src/bundle.sh
loaded from the patched tree via ENROOT_LIBRARY_PATH.

1. enroot import docker://ubuntu:22.04

$ enroot import -o ubuntu-22.04.sqsh docker://ubuntu:22.04
[INFO] Fetching image manifest list
[INFO] Fetching image manifest
[INFO] Downloading 1 missing layers...
[INFO] Extracting image layers...
[INFO] Converting whiteouts...
...

$ unsquashfs -cat ubuntu-22.04.sqsh .enroot/source
uri=docker://ubuntu:22.04
digest=sha256:14be402d3f1eeeb5e7da73d3260322c68e7b51c88388f53e88eb21d6450bd520

Digest matches enroot digest docker://ubuntu:22.04 output
(sha256:14be402d3f1eeeb5e7da73d3260322c68e7b51c88388f53e88eb21d6450bd520)
— confirming we're capturing the same value the existing digest
command returns.

2. enroot import dockerd://alpine:3.19

$ docker pull alpine:3.19
$ enroot import -o alpine.sqsh dockerd://alpine:3.19
...
$ unsquashfs -cat alpine.sqsh .enroot/source
uri=dockerd://alpine:3.19

URI only, no digest= line — as documented (no registry manifest for
daemon imports).
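
The two checks above could be folded into one small script. The sketch below is illustrative only: check_provenance is a hypothetical name, and it assumes enroot digest prints just the digest string, as the output quoted in step 1 suggests.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify a provenance record against a digest
# fetched just now for the recorded URI.
#   $1: contents of /.enroot/source (key=value lines)
#   $2: digest currently reported for that URI (may be empty)
# Prints one of: match, changed, no-digest
check_provenance() {
    local record="$1" current="$2" recorded="" line
    while IFS= read -r line; do
        case "${line}" in
            digest=*) recorded="${line#digest=}" ;;
        esac
    done <<< "${record}"

    if [[ -z "${recorded}" ]]; then
        echo "no-digest"    # e.g. dockerd:// or podman:// imports
    elif [[ "${recorded}" == "${current}" ]]; then
        echo "match"
    else
        echo "changed"      # the URI now resolves to a different manifest
    fi
}

# Example usage (assumes unsquashfs and enroot are installed):
#   record="$(unsquashfs -cat ubuntu-22.04.sqsh .enroot/source)"
#   check_provenance "${record}" "$(enroot digest docker://ubuntu:22.04)"
```

A "changed" result does not mean the .sqsh is wrong, only that the tag has moved since the import.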
