Popular repositories
-
vllm-gaudi (Public, forked from vllm-project/vllm-gaudi)
Community maintained hardware plugin for vLLM on Intel Gaudi
Python
-
neural-compressor (Public, forked from intel/neural-compressor)
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime
Python
-
vllm (Public, forked from vllm-project/vllm)
A high-throughput and memory-efficient inference and serving engine for LLMs
Python
-
lm-evaluation-harness (Public, forked from EleutherAI/lm-evaluation-harness)
A framework for few-shot evaluation of language models.
Python
-
ray (Public, forked from ray-project/ray)
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Python
