Fix/tweak pinned memory accounting #13221
Conversation
Some workflows make more extraneous use of shared GPU memory than the 5% pin headroom accounts for. Lower the threshold for safety.
TOTAL_PINNED_MEMORY is shared between the legacy and aimdo pinning systems; however, this catch-all assumes only the legacy system exists. Remove the catch-all, as the PINNED_MEMORY buffer is already coherent.
* mm: Lower windows pin threshold
* mm: Remove pin count clearing threshold
A bugfix and a tweak to pinned memory accounting from a shared debug session.
I never reproduced this personally.
The pinned memory accounting was clearly under-reporting shared-GPU-memory usage, possibly due to rounding fragmentation. Lowering the pin ceiling avoided the crash.
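To illustrate the rounding-fragmentation theory: if the allocator pins whole pages but the counter accumulates raw request sizes, the counter drifts below real usage. This is a hypothetical toy model, not the actual allocator behavior:

```python
# Hypothetical illustration of rounding fragmentation: the allocator pins
# page-aligned blocks, but the accounting sums raw request sizes, so the
# counter under-reports what is actually pinned.

PAGE = 4096


def page_aligned(n: int) -> int:
    """Round n up to the next multiple of the page size."""
    return (n + PAGE - 1) // PAGE * PAGE


requests = [4097, 100, 8193]                      # raw tensor sizes in bytes
counted = sum(requests)                           # what the counter sees: 12390
actual = sum(page_aligned(n) for n in requests)   # what is really pinned: 24576
```

Here the counter is short by roughly 12 KB over three allocations; across thousands of small pins the gap can silently consume the headroom, which is why lowering the ceiling helps even without fixing the root cause.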
Example test conditions (from user report and live debug session):
Windows, RTX5080 laptop, 32GB RAM
LTX2.3 720Px12s (with latent upscaler)
Before:
After:
Workflow runs ✅
Regression perf test:
Windows 5060, 64GB, LTX2 FP16.
Before (30.9GB shared mem usage):
After (27.2 GB shared mem usage):
Windows 5060, 64GB, wan 2.2 14Bx2 FP16:
Before (29.8GB shared mem usage):
After: