
Fix/tweak pinned memory accounting #13221

Merged
comfyanonymous merged 2 commits into Comfy-Org:master from rattus128:prs/pin-balances
Mar 29, 2026

Conversation

@rattus128
Contributor

A bugfix and a tweak to pinned memory accounting, from a shared debug session.

I never reproduced this personally.

The pinned memory accounting was clearly under-reporting the shared-GPU-memory usage, possibly due to rounding fragmentation. Lowering the pin ceiling resolved the crash.
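The idea behind the fix can be sketched in a few lines: cap total pinned memory at a fraction of system RAM, and refuse requests past the cap. This is a hypothetical illustration, not ComfyUI's actual API; the names `pin_budget` and `can_pin` are invented, and only the multiplier values (0.45 to 0.40 on Windows, 0.95 to 0.90 elsewhere) come from the PR itself.

```python
# Hypothetical sketch of a pinned-memory ceiling check; the real logic
# lives in comfy/model_management.py and these names are illustrative.

def pin_budget(total_ram_bytes: int, on_windows: bool) -> int:
    # The PR lowers the Windows multiplier from 0.45 to 0.40 and the
    # non-Windows multiplier from 0.95 to 0.90, leaving more headroom
    # for shared-GPU-memory use that the accounting does not see.
    multiplier = 0.40 if on_windows else 0.90
    return int(total_ram_bytes * multiplier)

def can_pin(total_pinned: int, request: int, budget: int) -> bool:
    # Refuse to pin past the budget; the caller falls back to pageable
    # memory instead of risking an out-of-memory error during a copy.
    return total_pinned + request <= budget

budget = pin_budget(32 * 1024**3, on_windows=True)  # 32GB RAM, as in the report
print(can_pin(total_pinned=8 * 1024**3, request=2 * 1024**3, budget=budget))
```

With the lower 0.40 multiplier the budget on a 32GB machine is 12.8GiB, so a 2GiB pin on top of 8GiB already pinned is allowed, while the same request on top of 12GiB pinned would be refused.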

Example test conditions (from user report and live debug session):

Windows, RTX5080 laptop, 32GB RAM
LTX2.3 720Px12s (with latent upscaler)

Before:

...
  File "C:\Users\xxx\Documents\comfyui-portable\ComfyUI\comfy\lora.py", line 428, in calculate_weight
    output = v.calculate_weight(weight, key, strength, strength_model, offset, function, intermediate_dtype, original_weights)
  File "C:\Users\xxx\Documents\comfyui-portable\ComfyUI\comfy\weight_adapter\lora.py", line 236, in calculate_weight
    mat1 = comfy.model_management.cast_to_device(
        v[0], weight.device, intermediate_dtype
    )
  File "C:\Users\xxx\Documents\comfyui-portable\ComfyUI\comfy\model_management.py", line 1319, in cast_to_device
    return cast_to(tensor, dtype=dtype, device=device, non_blocking=non_blocking, copy=copy)
  File "C:\Users\xxx\Documents\comfyui-portable\ComfyUI\comfy\model_management.py", line 1314, in cast_to
    r.copy_(weight, non_blocking=non_blocking)
    ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: CUDA error: out of memory
Search for `cudaErrorMemoryAllocation' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
...

After:

Workflow runs ✅

Regression perf test:

Windows 5060, 64GB, LTX2 FP16.

Before (30.9GB shared mem usage):

Prompt executed in 230.68 seconds

After (27.2GB shared mem usage):

Prompt executed in 231.95 seconds

Windows 5060, 64GB, wan 2.2 14Bx2 FP16:

Before (29.8GB shared mem usage):

Prompt executed in 77.16 seconds

After:

Prompt executed in 77.44 seconds

Some workflows have more extraneous use of shared GPU memory than is
accounted for in the 5% pin headroom. Lower this for safety.
TOTAL_PINNED_MEMORY is shared between the legacy and aimdo pinning
systems, however this catch-all assumes only the legacy system exists.
Remove the catch-all, as the PINNED_MEMORY buffer is already coherent.
@coderabbitai

coderabbitai bot commented Mar 29, 2026

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between a500f1e and c4e6511.

📒 Files selected for processing (1)
  • comfy/model_management.py

📝 Walkthrough

The change modifies pinned-memory budget allocation in the model management system. Specifically, it adjusts the maximum pinned-memory multipliers for NVIDIA/AMD GPUs on both platforms: the Windows multiplier decreased from 0.45 to 0.40, and the non-Windows multiplier decreased from 0.95 to 0.90. Additionally, the unpin_memory() function no longer explicitly sets TOTAL_PINNED_MEMORY to 0 after unpinning the final pinned tensor.

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 warning

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)
  • Title check ✅ Passed: the title accurately describes the main changes, adjusting pinned memory budget multipliers and removing explicit zeroing logic in unpin_memory.
  • Description check ✅ Passed: the description is directly related to the changeset, explaining the bugfix context, providing reproduction details, and including performance regression test results.




@comfyanonymous comfyanonymous merged commit 8d723d2 into Comfy-Org:master Mar 29, 2026
14 checks passed
comfyanonymous pushed a commit that referenced this pull request Apr 3, 2026
* mm: Lower windows pin threshold

Some workflows have more extraneous use of shared GPU memory than is
accounted for in the 5% pin headroom. Lower this for safety.

* mm: Remove pin count clearing threshold.

TOTAL_PINNED_MEMORY is shared between the legacy and aimdo pinning
systems, however this catch-all assumes only the legacy system exists.
Remove the catch-all, as the PINNED_MEMORY buffer is already coherent.
