Community benchmark database for running LLMs on Apple Silicon Macs
Updated Apr 9, 2026 - Shell
Claude Code skill that pits Claude, ChatGPT, and Gemini against each other, then lets them cross-judge each other blind
🧠 Benchmarks Claude Haiku 4.5 and MiniMax M2.1 on agentic tasks, highlighting each model's strengths in design thinking and operational skill across multi-turn workflows.
Benchmark Ollama models on your own prompts, on your own hardware.
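Tools like this typically drive Ollama's local HTTP API (`POST /api/generate` on port 11434) and derive tokens-per-second from the `eval_count` and `eval_duration` fields in the response. A minimal sketch, assuming a locally running Ollama server; the model name `llama3` and the helper names are illustrative, not taken from any specific repo:

```python
import json
import urllib.request

def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Convert Ollama's response counters to a throughput figure.

    Ollama reports eval_duration in nanoseconds.
    """
    return eval_count / (eval_duration_ns / 1e9)

def benchmark_prompt(model: str, prompt: str,
                     host: str = "http://localhost:11434") -> float:
    """Run one prompt through a local Ollama server and return tokens/sec."""
    body = json.dumps({"model": model, "prompt": prompt,
                       "stream": False}).encode()
    req = urllib.request.Request(f"{host}/api/generate", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return tokens_per_second(data["eval_count"], data["eval_duration"])

# Sanity check on the pure helper (no server needed):
# 100 tokens generated in 2 s of eval time -> 50 tok/s
print(tokens_per_second(100, 2_000_000_000))
```

With a server running and a model pulled, `benchmark_prompt("llama3", "Why is the sky blue?")` would return a throughput figure measured on your own hardware.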
Reproducible benchmark framework for testing hypotheses about AI coding agents
Systematic benchmark comparing Claude Haiku 4.5 vs MiniMax M2.1 on agentic coding tasks. Includes full audit trails, LLM-as-judge evaluation, and path divergence analysis.