Fix is_managed reporting for pool-allocated managed memory by cpcloud · Pull Request #1924 · NVIDIA/cuda-python

cpcloud · 2026-04-16T09:51:55Z

Summary

Buffer.is_managed now returns True when either the driver pointer attribute says so or the owning memory resource advertises managed allocations. The driver signal takes precedence; the resource signal is only a fallback.
Expose is_managed on the MemoryResource base (default False); ManagedMemoryResource overrides it to True. Other subclasses inherit False.
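The precedence rule described above can be sketched in plain Python. This is a minimal illustration, not the cuda.core implementation: the classes are stand-ins that only borrow the names from this PR, and the `driver_reports_managed` field is a hypothetical placeholder for the driver's IS_MANAGED pointer attribute.

```python
from dataclasses import dataclass
from typing import Optional


class MemoryResource:
    """Stand-in base class: a resource is not managed unless it opts in."""

    @property
    def is_managed(self) -> bool:
        return False


class ManagedMemoryResource(MemoryResource):
    """Stand-in for a resource whose pool hands out managed allocations."""

    @property
    def is_managed(self) -> bool:
        return True


@dataclass
class Buffer:
    # Hypothetical field standing in for the driver's IS_MANAGED attribute.
    driver_reports_managed: bool
    memory_resource: Optional[MemoryResource] = None

    @property
    def is_managed(self) -> bool:
        # The driver signal takes precedence.
        if self.driver_reports_managed:
            return True
        # Fallback: trust the owning resource, if one is attached.
        mr = self.memory_resource
        return mr is not None and mr.is_managed
```

With these stand-ins, a buffer whose driver flag is False but whose resource is a `ManagedMemoryResource` still reports `is_managed`, while a buffer from a plain `MemoryResource` does not.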

Why

ManagedMemoryResource allocates via cuMemAllocFromPoolAsync from a pool created with CU_MEM_ALLOCATION_TYPE_MANAGED. On some CUDA driver / hardware combinations, cuPointerGetAttributes on those allocations returns IS_MANAGED=0 and MEMORY_TYPE=CU_MEMORYTYPE_HOST. _query_memory_attrs therefore set is_device_accessible=True, is_host_accessible=True, is_managed=False, and classify_dl_device returned kDLCUDAHost (3).
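To make the misclassification concrete, here is a hedged sketch of how the three flags could map to a DLPack device type. The enum values are the ones defined in the DLPack header; the function names `classify_device` and `tma_accepts`, the branch order, and the host-only fallback to kDLCPU are assumptions for illustration, not the actual classify_dl_device or CCCL code.

```python
from enum import IntEnum


class DLDeviceType(IntEnum):
    # Values as defined in dlpack.h.
    kDLCPU = 1
    kDLCUDA = 2
    kDLCUDAHost = 3
    kDLCUDAManaged = 13


def classify_device(is_device_accessible: bool,
                    is_host_accessible: bool,
                    is_managed: bool) -> DLDeviceType:
    """Hypothetical classifier: managed wins, then pinned host, then device."""
    if is_managed:
        return DLDeviceType.kDLCUDAManaged
    if is_device_accessible and is_host_accessible:
        return DLDeviceType.kDLCUDAHost
    if is_device_accessible:
        return DLDeviceType.kDLCUDA
    if is_host_accessible:
        return DLDeviceType.kDLCPU  # assumption: host-only memory maps to CPU
    raise ValueError("memory is accessible from neither host nor device")


def tma_accepts(device_type: DLDeviceType) -> bool:
    """Mirrors the stated rule: only kDLCUDA or kDLCUDAManaged are accepted."""
    return device_type in (DLDeviceType.kDLCUDA, DLDeviceType.kDLCUDAManaged)


# Attributes as reported before the fix on the affected configuration.
before = classify_device(True, True, is_managed=False)
# Same allocation once is_managed falls back to the memory resource.
after = classify_device(True, True, is_managed=True)
```

Under these assumptions `before` is kDLCUDAHost, which `tma_accepts` rejects, and `after` is kDLCUDAManaged, which it accepts.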

CCCL's make_tma_descriptor (libcudacxx/include/cuda/__tma/make_tma_descriptor.h) accepts only kDLCUDA or kDLCUDAManaged, so StridedMemoryView.as_tensor_map() failed on a ManagedMemoryResource buffer with:

ValueError: Failed to build TMA descriptor via CCCL: Device type must be kDLCUDA or kDLCUDAManaged

Surfaced in TestTensorMapMultiDeviceValidation::test_from_tiled_accepts_managed_buffer_on_nonzero_device on NVIDIA B300 SXM6 AC.

Caveat on the driver behavior

An attempt to reproduce the pre-fix cuPointerGetAttributes values on RTX 5070 Ti / driver 13.2.0 shows IS_MANAGED=1 and MEMORY_TYPE=DEVICE for both cuMemAllocManaged and cuMemAllocFromPoolAsync from a managed pool, i.e. this configuration does not hit the bug. The fix is still sound: it is a no-op when the driver reports the attributes correctly, and it closes the gap when it does not, without relying on driver-side quirks. The precise driver / CTK / hardware combination that triggers the kDLCUDAHost classification on B300 is not reproduced in this PR; the failing test named in the description comes from the reporter's B300 environment.

🤖 Generated with Claude Code

Pool-allocated managed memory via cuMemAllocFromPoolAsync (from a pool created with CU_MEM_ALLOCATION_TYPE_MANAGED) does not set CU_POINTER_ATTRIBUTE_IS_MANAGED=1. _query_memory_attrs therefore classified the allocation as pinned host memory, causing classify_dl_device to return kDLCUDAHost instead of kDLCUDAManaged. CCCL's make_tma_descriptor only accepts kDLCUDA or kDLCUDAManaged, so as_tensor_map() failed with "Device type must be kDLCUDA or kDLCUDAManaged" on managed buffers.

Buffer.is_device_accessible / is_host_accessible already delegate to the memory resource when one is attached. Apply the same pattern to is_managed, and expose is_managed on the MemoryResource base (defaulting to False) with ManagedMemoryResource overriding it to True.

Also ignore .claude/settings.local.json in .gitignore.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The existing test_managed_buffer_dlpack_roundtrip_device_type uses a DummyUnifiedMemoryResource backed by cuMemAllocManaged, which sets CU_POINTER_ATTRIBUTE_IS_MANAGED and so never exercised the pool-allocated path that surfaced the bug.

Add two targeted tests:

- test_managed_memory_resource_buffer_dlpack_device_type: allocates from ManagedMemoryResource (cuMemAllocFromPoolAsync on a managed pool) and asserts is_managed and kDLCUDAManaged through Buffer and view.
- test_non_managed_resources_report_not_managed: parametrized smoke test ensuring DeviceMemoryResource and PinnedMemoryResource still report is_managed=False so the new MemoryResource.is_managed default does not silently misclassify non-managed resources.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Previous fix unconditionally delegated Buffer.is_managed to _memory_resource.is_managed, which returns False for any MemoryResource subclass that does not opt in. That broke DummyUnifiedMemoryResource (and any user-defined MR wrapping cuMemAllocManaged) where the driver pointer attribute correctly reports IS_MANAGED=1 but the resource does not override is_managed.

Query the driver first; only fall back to the memory resource when the driver does not report IS_MANAGED (the pool-allocated managed memory path). This keeps both old-style cuMemAllocManaged buffers and ManagedMemoryResource pool allocations correctly classified.

Also rework the regression test parametrization to skip the pinned case when PinnedMemoryResource is unavailable (CUDA < 13.0), and pick up the ruff-format reflow of the helper call site.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
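The difference between the two strategies in this commit message can be laid out as a small comparison. This is an illustrative sketch only: `delegate_only`, `driver_first`, and the `SCENARIOS` table are hypothetical names, and each scenario is reduced to two booleans (what the driver reports, what the resource advertises).

```python
def delegate_only(driver_managed: bool, resource_managed: bool) -> bool:
    """First attempt: always defer to the attached memory resource."""
    return resource_managed


def driver_first(driver_managed: bool, resource_managed: bool) -> bool:
    """Final rule: trust the driver, fall back to the resource otherwise."""
    return driver_managed or resource_managed


# (driver reports IS_MANAGED, resource advertises is_managed)
SCENARIOS = {
    "cuMemAllocManaged behind a resource with no override": (True, False),
    "managed pool where the driver reports IS_MANAGED=0": (False, True),
    "device or pinned resource": (False, False),
}

for name, (driver, resource) in SCENARIOS.items():
    print(f"{name}: delegate_only={delegate_only(driver, resource)}, "
          f"driver_first={driver_first(driver, resource)}")
```

The two rules disagree only in the first scenario, which is the DummyUnifiedMemoryResource case the unconditional delegation broke.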

Pick up cuda-nvrtc 13.2.78, libcufile 1.17.1.22, and other transitive package updates from conda-forge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-16T17:28:10Z

Doc Preview CI
Preview removed because the pull request was closed or merged.

cpcloud added the bug and cuda.core labels Apr 16, 2026

cpcloud self-assigned this Apr 16, 2026

cpcloud added this to the cuda.core v1.0.0 milestone Apr 16, 2026


cpcloud and others added 3 commits April 16, 2026 06:08

chore: refresh pixi.lock for upstream package updates

7c7c755


github-actions Bot added the Needs-Restricted-Paths-Review label Apr 16, 2026

cpcloud enabled auto-merge (squash) April 16, 2026 12:32

cpcloud requested a review from rparolin April 16, 2026 16:48

rparolin approved these changes Apr 16, 2026


cpcloud merged commit cb3c132 into NVIDIA:main Apr 16, 2026
173 of 177 checks passed

cpcloud deleted the worktree-linked-splashing-bonbon branch April 16, 2026 17:25

rwgk removed the Needs-Restricted-Paths-Review label Apr 16, 2026

cpcloud mentioned this pull request Apr 23, 2026

Add managed-memory advise, prefetch, and discard-prefetch free functions #1775

Draft

cpcloud merged 4 commits into NVIDIA:main from cpcloud:worktree-linked-splashing-bonbon