
[Bug]: k8s-cc-manager image pull fails with "Access Denied" when logged in with NGC API key #2295

@dimisik

Describe the bug
In the recent GPU Operator v26.3.0 release, ccManager was changed to be enabled by default in the Helm values, with the version bumped to v0.3.0.

Previous (e.g., v25.10.1):

ccManager:
  enabled: false
  defaultMode: "off"
  repository: nvcr.io/nvidia/cloud-native
  image: k8s-cc-manager
  version: v0.1.1

New (v26.3.0):

ccManager:
  enabled: true
  defaultMode: "on"
  repository: nvcr.io/nvidia/cloud-native
  image: k8s-cc-manager
  version: v0.3.0

However, the newly enabled nvcr.io/nvidia/cloud-native/k8s-cc-manager:v0.3.0 OSS image has an authentication issue when an NGC API key is used for image pulls. The image pulls successfully without authentication, but the pull fails with an Access Denied error once Docker is logged in with an NGC API key.

The k8s-cc-manager pull gets rejected when Docker is authenticated with an NGC API key scoped to another project (like vgpu), breaking deployments that rely on imagePullSecrets or node-level Docker logins for proprietary drivers.
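
For clusters that do not need confidential computing, one possible stopgap is to restore the previous default by overriding `ccManager.enabled`. This is only a sketch and does not address the registry behavior itself. The function name, the `gpu-operator` namespace default, and the release and chart names in the usage line are hypothetical; only the `ccManager.enabled` key comes from the values shown above.

```shell
# disable_cc_manager RELEASE CHART [NAMESPACE] [EXTRA HELM ARGS...]
# Installs or upgrades the release with ccManager switched off, so the
# operator no longer needs to pull the k8s-cc-manager image.
# Assumption: the namespace defaults to "gpu-operator" when not given.
disable_cc_manager() {
  local release="$1" chart="$2" namespace="${3:-gpu-operator}"
  shift 2
  [ "$#" -gt 0 ] && shift   # drop the namespace argument if it was passed

  # Any remaining arguments (other --set or -f overrides) are passed through.
  helm upgrade --install "$release" "$chart" \
    --namespace "$namespace" \
    --set ccManager.enabled=false \
    "$@"
}
```

Usage, with placeholder names: `disable_cc_manager gpu-operator nvidia/gpu-operator`. Any other overrides the deployment already relies on would need to be passed again as extra arguments.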

To Reproduce

  1. Pulling without login (Succeeds)
$ docker pull nvcr.io/nvidia/cloud-native/k8s-cc-manager:v0.3.0
v0.3.0: Pulling from nvidia/cloud-native/k8s-cc-manager
...
Status: Downloaded newer image for nvcr.io/nvidia/cloud-native/k8s-cc-manager:v0.3.0
nvcr.io/nvidia/cloud-native/k8s-cc-manager:v0.3.0
  2. Logging in with NGC API Key
$ docker login nvcr.io/nvidia/vgpu
Username: $oauthtoken
Password: <YOUR_NGC_API_KEY>
...
Login Succeeded
  3. Pulling after login (Fails)
$ docker pull nvcr.io/nvidia/cloud-native/k8s-cc-manager:v0.3.0
Error response from daemon: Head "https://nvcr.io/v2/nvidia/cloud-native/k8s-cc-manager/manifests/v0.3.0": denied: {"errors": [{"code": "DENIED", "message": "Access Denied"}]}
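
The three steps above can be sketched as a single script that keeps the anonymous and logged-in states in separate Docker client config directories, so the user's real `~/.docker/config.json` is not touched. This is a minimal sketch, not a verified reproducer: the function names and the `NGC_API_KEY` environment variable are hypothetical, and it assumes no system-wide credential helper supplies `nvcr.io` credentials behind the scenes.

```shell
#!/usr/bin/env bash
# Sketch: compare pull results for one image with and without a registry login.

IMAGE="nvcr.io/nvidia/cloud-native/k8s-cc-manager:v0.3.0"

# pull_status CONFIG_DIR IMAGE
# Prints "pulls" if `docker pull` exits 0 using that client config, else "fails".
pull_status() {
  local config_dir="$1" image="$2"
  if docker --config "$config_dir" pull "$image" >/dev/null 2>&1; then
    echo "pulls"
  else
    echo "fails"
  fi
}

# compare_login_states IMAGE
# Reads the API key from NGC_API_KEY (hypothetical variable name).
compare_login_states() {
  local image="$1" anon_dir auth_dir
  anon_dir="$(mktemp -d)"   # empty config: no credentials
  auth_dir="$(mktemp -d)"   # config that will hold the NGC login

  # Same login target as in the report; $oauthtoken is the literal username.
  printf '%s' "$NGC_API_KEY" |
    docker --config "$auth_dir" login \
      --username '$oauthtoken' --password-stdin \
      nvcr.io/nvidia/vgpu >/dev/null 2>&1 || return 1

  echo "without login: $(pull_status "$anon_dir" "$image")"
  echo "with login:    $(pull_status "$auth_dir" "$image")"
  rm -rf "$anon_dir" "$auth_dir"
}
```

Usage: `NGC_API_KEY=<YOUR_NGC_API_KEY> compare_login_states "$IMAGE"`. If the behavior matches this report, the first line would show `pulls` and the second `fails`.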

Expected behavior

We expect k8s-cc-manager (Scenario 3 below) to have the same authentication behavior as k8s-driver-manager (Scenario 1). Because it is an OSS image, it should pull successfully regardless of whether Docker is currently logged in with an NGC API key.

Here is a breakdown of the observed vs. expected behavior:

| Scenario | Image example | Without login | With login |
|---|---|---|---|
| 1 | nvcr.io/nvidia/cloud-native/k8s-driver-manager (OSS) | pulls | pulls |
| 2 | nvcr.io/nvidia/vgpu/vgpu-guest-driver-7:580.126.09-ubuntu24.04 (proprietary) | fails (auth error) | pulls |
| 3 | nvcr.io/nvidia/cloud-native/k8s-cc-manager:v0.3.0 (OSS) | pulls | fails (auth error) |

Labels: bug, needs-triage
