JitKernels is a JIT-first kernel loader with provider support.
The first provider is Hugging Face kernels 🤗, but the project is being shaped to support other kernel ecosystems later, including FlashAttention, FlashInfer, and similar provider-specific loaders.
The upstream kernels package is useful, but its default loading model has a few sharp edges:
- It routes many kernels through prebuilt wheel-like payloads downloaded from the Hub, which can break on ABI mismatches such as `nogil` Python builds.
- Native kernels are often compiled against an older CUDA/`nvcc` matrix than the local machine can use.
- Triton and other pure-Python kernels still go through the same artifact loader even though they can already JIT themselves on first use.
JitKernels changes the default 🚀:
- Pure Python / Triton / noarch kernels are imported directly from Hub source or Hub build payloads without treating them like binary wheels.
- Sourceful native kernels can be compiled locally on first run instead of relying on the published binary payload.
- Legacy native kernels that still do not publish enough source metadata fail loudly instead of silently falling back to incompatible binaries.
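The policy above can be sketched as a pure function over kernel metadata. This is an illustrative sketch, not the actual implementation; the `Strategy` enum, `choose_strategy` function, and metadata field names (`noarch`, `language`, `has_source`) are hypothetical.

```python
# Hedged sketch of the loading policy described above (hypothetical
# names and metadata fields, not JitKernels' real internals).
from enum import Enum


class Strategy(Enum):
    DIRECT_IMPORT = "direct_import"  # pure Python / Triton / noarch
    LOCAL_BUILD = "local_build"      # sourceful native kernel, compile on first run
    FAIL_LOUDLY = "fail_loudly"      # legacy native kernel without source metadata


def choose_strategy(metadata: dict) -> Strategy:
    """Pick a loading strategy from (hypothetical) Hub kernel metadata."""
    if metadata.get("noarch") or metadata.get("language") == "triton":
        return Strategy.DIRECT_IMPORT
    if metadata.get("has_source"):
        return Strategy.LOCAL_BUILD
    return Strategy.FAIL_LOUDLY
```

The key design point is the last branch: a kernel that publishes neither a noarch payload nor buildable source is rejected rather than silently served an incompatible binary.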
This repository is the bootstrap implementation.
Current support:
- Direct import for pure Python Hub kernels, including Triton-style kernels that JIT naturally at first call.
- Detection of sourceful native kernels from Hub metadata.
- Local native build path for sourceful `torch`-only CUDA/CPU kernels.
- Strict errors for native kernels that still do not expose enough source or require unsupported build dependencies.
Not implemented yet:
- Full dependency resolution for non-`torch` native build inputs such as CUTLASS.
- ROCm/XPU/Metal native compilation.
- End-to-end coverage for every historical `kernels-community/*` repo variant.
```bash
pip install -e .
```

Load directly with the provider-oriented API:

```python
import jitkernels
from jitkernels import KernelProvider

kernel = jitkernels.load(KernelProvider.HF_KERNELS, "triton-layer-norm")
```

Check whether a kernel is available through the active provider integration:
```python
import jitkernels

if jitkernels.has("hf", "triton-layer-norm"):
    print("kernel is supported")
```

Inspect what JitKernels plans to do before loading:
```python
import jitkernels
from jitkernels import KernelProvider

inspection = jitkernels.inspect(KernelProvider.HF_KERNELS, "cv_utils")
print(inspection.strategy)
print(inspection.expected_extension_name)
```

Patch the upstream HF kernels package so existing call sites keep working:
```python
import jitkernels

jitkernels.patch()

from kernels import get_kernel

kernel = get_kernel("kernels-community/triton-layer-norm")
```

Undo the patch when you want upstream behavior back:
```python
import jitkernels

jitkernels.unpatch()
```

CLI examples:
```bash
jitkernels inspect triton-layer-norm
jitkernels inspect cv_utils --provider hf_kernels
jitkernels warmup triton-layer-norm
```

JitKernels keeps a local cache under `~/.cache/jitkernels` by default. The location can be overridden with `JITKERNELS_CACHE`.
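The cache-resolution rule is simple enough to state as code. This is a sketch of the documented behavior only; the `cache_dir` helper name is hypothetical and not part of the JitKernels API.

```python
# Hedged sketch of the documented cache rule: honor JITKERNELS_CACHE
# when set, otherwise fall back to ~/.cache/jitkernels.
import os
from pathlib import Path


def cache_dir() -> Path:
    """Resolve the JitKernels cache directory (illustrative helper)."""
    override = os.environ.get("JITKERNELS_CACHE")
    if override:
        return Path(override)
    return Path.home() / ".cache" / "jitkernels"
```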
The current codebase is HF-Kernels-first, not HF-only. The patching API follows the same provider-oriented shape as loading: `patch(provider=...)` and `unpatch(provider=...)`.
The main user-facing load path is also provider-specific: `load(provider, kernel_name)` takes a provider-local kernel name rather than exposing Hugging Face repo ids in the default API surface.
For native kernels, the first milestone intentionally only accepts sourceful kernels whose `build.toml` dependencies are a subset of:

- `torch`
Anything else raises a clear error so unsupported dependency resolution does not get silently hidden behind a binary fallback.
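The subset check described above amounts to a few lines. This is an illustrative sketch under the stated milestone constraint; `check_build_deps` and `SUPPORTED_BUILD_DEPS` are hypothetical names, not the project's real internals.

```python
# Hedged sketch of the milestone's dependency gate: a kernel's build.toml
# dependencies must be a subset of {"torch"}, otherwise fail loudly.
SUPPORTED_BUILD_DEPS = {"torch"}


def check_build_deps(deps: list[str]) -> None:
    """Raise a clear error if any build dependency is unsupported."""
    unsupported = set(deps) - SUPPORTED_BUILD_DEPS
    if unsupported:
        raise ValueError(
            f"unsupported native build dependencies: {sorted(unsupported)}"
        )
```

Raising here, rather than falling back to a published binary, is what keeps unsupported dependency resolution from being silently hidden.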