
JitKernels ⚡

JitKernels is a JIT-first kernel loader with provider support.

The first provider is Hugging Face kernels 🤗, but the project is being shaped to support other kernel ecosystems later, including FlashAttention, FlashInfer, and similar provider-specific loaders.

The upstream kernels package is useful, but its default loading model has a few sharp edges:

  1. It routes many kernels through prebuilt wheel-like payloads downloaded from the Hub, which can break on ABI mismatches such as free-threaded (nogil) Python builds.
  2. Native kernels are often compiled against an older CUDA / nvcc matrix than the local machine can use.
  3. Triton and other pure-Python kernels still go through the same artifact loader even though they can already JIT themselves on first use.

JitKernels changes the default 🚀:

  • Pure Python / Triton / noarch kernels are imported directly from Hub source or Hub build payloads without treating them like binary wheels.
  • Sourceful native kernels can be compiled locally on first run instead of relying on the published binary payload.
  • Legacy native kernels that still do not publish enough source metadata fail loudly instead of silently falling back to incompatible binaries.
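The three rules above can be sketched as a small dispatch function. This is a hypothetical illustration of the decision order, not the actual JitKernels internals; the metadata field names are assumptions:

```python
def choose_strategy(kernel_meta: dict) -> str:
    """Pick a load strategy from Hub metadata (illustrative field names)."""
    if kernel_meta.get("noarch"):
        # Pure Python / Triton kernels: import the source directly,
        # letting Triton JIT on first call.
        return "direct_import"
    if kernel_meta.get("has_source"):
        # Sourceful native kernel: compile locally on first run.
        return "local_build"
    # Legacy native kernel without usable source metadata: fail loudly
    # instead of silently falling back to a possibly incompatible binary.
    raise RuntimeError("no source metadata published; refusing binary fallback")
```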

Status 🧪

This repository contains the bootstrap implementation.

Current support:

  • Direct import for pure Python Hub kernels, including Triton-style kernels that JIT naturally at first call.
  • Detection of sourceful native kernels from Hub metadata.
  • Local native build path for sourceful torch-only CUDA/CPU kernels.
  • Strict errors for native kernels that still do not expose enough source or require unsupported build dependencies.

Not implemented yet:

  • Full dependency resolution for non-torch native build inputs such as CUTLASS.
  • ROCm/XPU/Metal native compilation.
  • End-to-end coverage for every historical kernels-community/* repo variant.

Install 📦

pip install -e .

Usage 🛠️

Load directly with the provider-oriented API:

import jitkernels
from jitkernels import KernelProvider

kernel = jitkernels.load(KernelProvider.HF_KERNELS, "triton-layer-norm")

Check whether a kernel is available through the active provider integration:

import jitkernels

if jitkernels.has("hf", "triton-layer-norm"):
    print("kernel is supported")

Inspect what JitKernels plans to do before loading:

import jitkernels
from jitkernels import KernelProvider

inspection = jitkernels.inspect(KernelProvider.HF_KERNELS, "cv_utils")
print(inspection.strategy)
print(inspection.expected_extension_name)

Patch the upstream HF kernels package so existing call sites keep working:

import jitkernels

jitkernels.patch()

from kernels import get_kernel

kernel = get_kernel("kernels-community/triton-layer-norm")

Undo the patch when you want upstream behavior back:

import jitkernels

jitkernels.unpatch()

CLI examples:

jitkernels inspect triton-layer-norm
jitkernels inspect cv_utils --provider hf_kernels
jitkernels warmup triton-layer-norm

Design Notes 🧠

JitKernels keeps a local cache under ~/.cache/jitkernels by default. The location can be overridden with JITKERNELS_CACHE.
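The lookup can be mimicked with a few lines (a minimal sketch of the documented behavior; the real resolution logic may differ):

```python
import os
from pathlib import Path

def jitkernels_cache_dir() -> Path:
    """Resolve the cache directory: JITKERNELS_CACHE wins, else the default."""
    override = os.environ.get("JITKERNELS_CACHE")
    if override:
        return Path(override)
    return Path.home() / ".cache" / "jitkernels"
```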

The current codebase is HF-Kernels-first, not HF-only. The patching API follows the same provider-oriented shape as loading: patch(provider=...) and unpatch(provider=...).

The main user-facing load path is also provider-specific: load(provider, kernel_name) takes a provider-local kernel name rather than exposing Hugging Face repo ids in the default API surface.

For native kernels, the first milestone intentionally only accepts sourceful kernels whose build.toml dependencies are a subset of:

  • torch

Anything else raises a clear error so unsupported dependency resolution does not get silently hidden behind a binary fallback.
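That gate amounts to a subset check over the declared build dependencies. The sketch below is illustrative only; the function name and error wording are assumptions:

```python
# Native build dependencies the first milestone accepts.
SUPPORTED_BUILD_DEPS = {"torch"}

def check_build_deps(deps) -> None:
    """Raise if build.toml declares dependencies outside the supported set."""
    unsupported = set(deps) - SUPPORTED_BUILD_DEPS
    if unsupported:
        raise RuntimeError(
            f"unsupported native build dependencies: {sorted(unsupported)}"
        )
```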
