14-stage Fusion Pipeline for LLM token compression — reversible compression, AST-aware code analysis, intelligent content routing. Zero LLM inference cost. MIT licensed.
Updated Apr 1, 2026 - Python
[TMLR 2026] Survey: https://arxiv.org/pdf/2507.20198
📚 Collection of token-level model compression resources.
Token-Oriented Object Notation - A compact data format for reducing token consumption when sending structured data to LLMs (PHP implementation)
The official code for the paper: LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs
Official repository of the paper "A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models"
You say it. AutoCode builds it. 38 professional skills, persistent memory, 60%+ dev cost savings. Zero dependencies. Free forever.
[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
[ICLR 2026 Oral] FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging
[ICLR 2026] Official code repository for "⚡️VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration"
Open-source AI gateway written in Rust, with token compression for Claude Code, Codex... and any other LLM client.
[ICLR 2026] MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding
😎 Awesome papers on token redundancy reduction
⚡ Compress Claude Code context by 60-90%. Six noise filters RTK doesn't have.
This repo integrates DyCoke's token compression method with VLMs such as Gemma3 and InternVL3
The browser engine for agents. HTML in, Semantic Object Model out. 10x token compression, V8 JS rendering, CDP compatible. Apache-2.0.
[ICLR 2026] Official code of PPE: Positional Preservation Embedding for Token Compression in Multimodal Large Language Models.
Rust Local Token Compression Proxy for coding agents, built solo for GenAI Genesis 2026. 🏆 1st Google Sustainability Hack
Token compression + context memory for Claude Code etc. Runs automatically. No configuration required.
Official implementation of TCSVT 2025 paper: DiViCo: Disentangled Visual Token Compression For Efficient Large Vision-Language Model