Local-only AI coding agent. No API keys. No network. Just code.
Quick Start • Air-Gapped Deploy • Tools • Architecture • Config
zipcode is a Rust-based coding agent that runs entirely on your machine using local LLM inference. Powered by Gemma 4 through the candle ML framework, it provides Claude Code-like functionality — file editing, shell execution, code search — without ever touching the network.
Ship it on a USB stick. Run it in a SCIF. It just works.
```
$ zipcode
zipcode v0.1.0 -- local AI coding agent
Type /help for commands, Ctrl+D to exit

> read main.rs and add error handling to the database connection

Reading src/main.rs...
> edit_file {"file_path": "src/main.rs", ...}
< Successfully edited src/main.rs

I've wrapped the database connection in a proper error handler with
retry logic and connection pooling...
```
| Problem | zipcode Solution |
|---|---|
| API keys expire, leak, or get rate-limited | No API. Model runs locally. |
| Corporate networks block LLM endpoints | No network needed after setup. |
| Sensitive code can't leave the machine | Everything stays on disk. |
| Cloud LLMs add latency | GPU inference on your hardware. |
| Setup requires npm, pip, Docker... | Single static binary. |
```sh
git clone https://github.com/devswha/zipcode.git
cd zipcode
cargo build --release
```

```sh
# Automated (requires internet + huggingface-cli)
./scripts/download_model.sh

# Or manual: download Gemma 4 27B GGUF from HuggingFace
# Place in ~/.zipcode/models/
```

```
$ ./target/release/zipcode doctor

zipcode doctor
--------------
Binary:  zipcode v0.1.0 (linux-x86_64)
CUDA:    Available
Model:   gemma-4-27b-it-Q8_0.gguf (28.3 GB)
Ready:   All checks passed
```
```sh
# Interactive REPL
zipcode

# One-shot
zipcode prompt "explain this codebase"

# Custom model
zipcode --model ./my-model.gguf

# Read-only mode (safe exploration)
zipcode --permission-mode read-only
```

zipcode was designed from the ground up for isolated networks. Package everything into a ZIP, move it on a USB drive, and run.
```sh
./scripts/package.sh
# Creates: dist/zipcode-v0.1.0-linux-x86_64-cuda.zip
```

```
zipcode-v0.1.0-linux-x86_64-cuda.zip
|- zipcode              # single binary (~30 MB)
|- libcudart.so.12      # CUDA runtime (optional)
|- install.sh           # one-command setup
|- download_model.sh    # model download helper
|- README.md
'- models/
   '- PLACE_MODEL_HERE.txt
```
```
Internet Machine                     Air-Gapped Machine
================                     ==================
1. Download zipcode.zip
2. Download gemma-4-27b.gguf
3. Copy to USB
                   ---- USB ---->
                                     4. Unzip
                                     5. ./install.sh
                                     6. cp *.gguf ~/.zipcode/models/
                                     7. zipcode doctor
                                     8. zipcode
```
zipcode comes with 10 built-in tools that the model can invoke autonomously:
| Tool | Description | Permission |
|---|---|---|
| Bash | Execute shell commands via `bash -c` | Needs approval in workspace-write |
| ReadFile | Read file contents with line numbers, offset/limit | Always allowed |
| WriteFile | Create or overwrite files, auto-creates parent dirs | Blocked in read-only |
| EditFile | Targeted string replacement in existing files | Blocked in read-only |
| GlobSearch | Find files by glob pattern (`**/*.rs`, `src/*.py`) | Always allowed |
| GrepSearch | Search file contents with regex, returns `file:line:` | Always allowed |
| TodoWrite | Structured todo list persistence (`.zipcode-todos.json`) | Blocked in read-only |
| REPL | Execute Python/Node.js code snippets | Blocked in read-only |
| Agent | Spawn sub-agent for delegated tasks | Stub in v0.1 |
| ToolSearch | Search available tools by keyword | Always allowed |
All tool output is automatically truncated at 8 KB with a byte-count summary.
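The truncation rule can be sketched as follows. This is a minimal illustration; the `Tool` trait and `truncate_output` names are hypothetical, not zipcode's actual API:

```rust
// Illustrative sketch of a tool interface plus the 8 KB output cap;
// `Tool` and `truncate_output` are hypothetical names, not zipcode's API.
const MAX_OUTPUT_BYTES: usize = 8 * 1024;

trait Tool {
    fn name(&self) -> &'static str;
    fn execute(&self, args: &str) -> Result<String, String>;
}

/// Cap tool output at 8 KB, appending a byte-count summary.
fn truncate_output(raw: &str) -> String {
    if raw.len() <= MAX_OUTPUT_BYTES {
        return raw.to_string();
    }
    // Back up to a char boundary so the slice stays valid UTF-8.
    let mut cut = MAX_OUTPUT_BYTES;
    while !raw.is_char_boundary(cut) {
        cut -= 1;
    }
    format!("{}\n[truncated, {} bytes total]", &raw[..cut], raw.len())
}

struct Echo;
impl Tool for Echo {
    fn name(&self) -> &'static str { "Echo" }
    fn execute(&self, args: &str) -> Result<String, String> {
        Ok(args.to_string())
    }
}

fn main() {
    let out = truncate_output(&Echo.execute(&"x".repeat(10_000)).unwrap());
    assert!(out.contains("10000 bytes total"));
    assert_eq!(Echo.name(), "Echo");
}
```

Truncating on a char boundary matters because tool output (grep hits, file contents) can contain multi-byte UTF-8.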
| Mode | Bash | Writes | Reads | Use case |
|---|---|---|---|---|
| read-only | Denied | Denied | Allowed | Safe exploration |
| workspace-write | Approval required | Allowed | Allowed | Default |
| full-access | Allowed | Allowed | Allowed | Trusted automation |
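The matrix above reduces to a couple of small decision functions. A minimal sketch with illustrative type names (not zipcode's real ones):

```rust
// Hypothetical sketch of the permission matrix; not zipcode's actual types.
#[derive(Clone, Copy, PartialEq)]
enum PermissionMode { ReadOnly, WorkspaceWrite, FullAccess }

#[derive(Clone, Copy, PartialEq, Debug)]
enum Decision { Allowed, NeedsApproval, Denied }

fn check_bash(mode: PermissionMode) -> Decision {
    match mode {
        PermissionMode::ReadOnly => Decision::Denied,
        PermissionMode::WorkspaceWrite => Decision::NeedsApproval,
        PermissionMode::FullAccess => Decision::Allowed,
    }
}

fn check_write(mode: PermissionMode) -> Decision {
    match mode {
        PermissionMode::ReadOnly => Decision::Denied,
        _ => Decision::Allowed, // writes pass in workspace-write and full-access
    }
}

fn check_read(_mode: PermissionMode) -> Decision {
    Decision::Allowed // reads are allowed in every mode
}

fn main() {
    assert_eq!(check_bash(PermissionMode::WorkspaceWrite), Decision::NeedsApproval);
    assert_eq!(check_write(PermissionMode::ReadOnly), Decision::Denied);
    assert_eq!(check_read(PermissionMode::ReadOnly), Decision::Allowed);
}
```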
```
+-------------------------------------------------+
|                   zipcode CLI                   |
|            REPL / one-shot / doctor             |
+-------------------------------------------------+
|                 Agentic Runtime                 |
|  +--------+  +----------+  +---------+          |
|  |Session |  |Permission|  | Config  |          |
|  |Manager |  |  Policy  |  | Loader  |          |
|  +--------+  +----------+  +---------+          |
|        +-----------------+                      |
|        |  Conversation   |<--- tool results     |
|        |      Loop       |                      |
|        +-------+---------+                      |
|                | tool calls                     |
|        +-------v---------+                      |
|        |   Tool Router   |                      |
|        +--+---------+----+                      |
|           |         |                           |
|       +-----+   +---------+                     |
|       |Bash |   | FileOps |  ...10 tools        |
|       +-----+   +---------+                     |
+-------------------------------------------------+
|                Inference Engine                 |
|      candle GGUF loader + CUDA acceleration     |
|  Tokenizer | KV Cache | Streaming Generation    |
+-------------------------------------------------+
|                   Model Store                   |
|            ~/.zipcode/models/*.gguf             |
+-------------------------------------------------+
```
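The Conversation Loop and Tool Router interaction can be sketched as a simple run loop: the model proposes tool calls, the router executes them, and results are fed back into the context until the model produces a final answer. This is a toy model with a stub in place of local inference; every name here is hypothetical:

```rust
// Toy sketch of the agentic loop. A stub stands in for the local LLM;
// names like `ModelStep` and `run_loop` are illustrative, not zipcode's API.
use std::collections::HashMap;

type ToolFn = fn(&str) -> String;

enum ModelStep {
    ToolCall { tool: String, args: String },
    FinalAnswer(String),
}

// Stand-in for local inference: request a file once, then answer.
fn fake_model(history: &[String]) -> ModelStep {
    if history.iter().any(|m| m.starts_with("tool:")) {
        ModelStep::FinalAnswer("done".to_string())
    } else {
        ModelStep::ToolCall { tool: "ReadFile".into(), args: "src/main.rs".into() }
    }
}

fn run_loop(router: &HashMap<String, ToolFn>, prompt: &str) -> String {
    let mut history = vec![format!("user: {prompt}")];
    loop {
        match fake_model(&history) {
            ModelStep::ToolCall { tool, args } => {
                let result = router[&tool](&args);
                history.push(format!("tool: {result}")); // feed result back
            }
            ModelStep::FinalAnswer(text) => return text,
        }
    }
}

fn main() {
    let mut router: HashMap<String, ToolFn> = HashMap::new();
    router.insert("ReadFile".into(), |path: &str| format!("contents of {path}"));
    assert_eq!(run_loop(&router, "explain main.rs"), "done");
}
```

The real loop additionally streams tokens, parses tool-call syntax out of model output, and applies the permission policy before routing.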
```
zipcode/
|- crates/
|  |- inference/    Candle GGUF engine, Gemma 4 chat template, sampler
|  |- tools/        10 tool implementations + Tool trait + registry
|  |- runtime/      Agentic loop, config, permissions, sessions
|  '- cli/          REPL, one-shot, doctor, slash commands
|- scripts/         install.sh, download_model.sh, package.sh
|- models/          .gguf model files (gitignored)
'- docs/            Design specs + implementation plans
```
```
cli --> runtime --> inference
          |
          '--> tools
```
| Crate | Responsibility |
|---|---|
| inference | GGUF loading, tokenization, KV cache, streaming generation. Pure computation. |
| tools | 10 tool implementations. Tool trait, ToolRegistry, ToolResult truncation. |
| runtime | Agentic conversation loop. Config hierarchy, permission policy, session persistence. |
| cli | Binary entry point. REPL (rustyline), ANSI rendering (termimad), clap argument parsing. |
| Command | Description |
|---|---|
| `zipcode` | Start interactive REPL |
| `zipcode prompt "..."` | One-shot prompt, then exit |
| `zipcode doctor` | Check CUDA, model, binary version |
| Flag | Default | Description |
|---|---|---|
| `--model <PATH>` | `~/.zipcode/models/` | Path to .gguf file |
| `--permission-mode` | `workspace-write` | read-only / workspace-write / full-access |
| Command | Action |
|---|---|
| `/help` | Show commands |
| `/status` | Session ID, message count, cwd |
| `/clear` | Reset conversation |
| `/quit` | Exit |
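A slash-command handler of this shape is little more than a match on the trimmed input line. A toy sketch (the real REPL reads input through rustyline; `dispatch` is a hypothetical name):

```rust
// Toy slash-command dispatcher mirroring the table above.
// `dispatch` is illustrative; zipcode's REPL wiring may differ.
fn dispatch(line: &str) -> &'static str {
    match line.trim() {
        "/help" => "Show commands",
        "/status" => "Session ID, message count, cwd",
        "/clear" => "Reset conversation",
        "/quit" => "Exit",
        _ => "Unknown command (type /help)",
    }
}

fn main() {
    assert_eq!(dispatch("/help"), "Show commands");
    assert_eq!(dispatch("  /quit "), "Exit");
    assert_eq!(dispatch("/oops"), "Unknown command (type /help)");
}
```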
Loaded from working directory. Overrides global ~/.zipcode/config.json.
```json
{
  "permission_mode": "workspace-write",
  "model_dir": "~/.zipcode/models",
  "generation": {
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 4096
  }
}
```

Place in any project root. Contents are injected into the system prompt.
```markdown
# My Project
- Language: Rust 2021
- Build: cargo build --release
- Test: cargo test --workspace
- Never modify files under vendor/
```

```
~/.zipcode/
|- config.json    # global config
|- models/        # GGUF model files
'- sessions/      # conversation history
   '- <uuid>.json
```
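The precedence rule (project config over global ~/.zipcode/config.json) amounts to a shallow merge. A sketch with `HashMap` standing in for the parsed JSON; `merge_config` is a hypothetical helper, and the real loader presumably deserializes via serde:

```rust
// Sketch of config precedence: project-level keys override global ones.
// HashMap stands in for parsed JSON; key names mirror the example above.
use std::collections::HashMap;

fn merge_config(
    global: &HashMap<String, String>,
    project: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut merged = global.clone();
    for (k, v) in project {
        merged.insert(k.clone(), v.clone()); // project wins on conflicts
    }
    merged
}

fn main() {
    let mut global = HashMap::new();
    global.insert("permission_mode".to_string(), "workspace-write".to_string());
    global.insert("model_dir".to_string(), "~/.zipcode/models".to_string());

    let mut project = HashMap::new();
    project.insert("permission_mode".to_string(), "read-only".to_string());

    let cfg = merge_config(&global, &project);
    assert_eq!(cfg["permission_mode"], "read-only");   // overridden
    assert_eq!(cfg["model_dir"], "~/.zipcode/models"); // inherited
}
```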
| Component | Minimum |
|---|---|
| Model | Gemma 4 27B IT (GGUF format) |
| VRAM | 24 GB for Q8_0 quantization |
| CUDA | 12.0+ (optional, CPU fallback available) |
| Disk | ~28 GB for Q8_0 model file |
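The ~28 GB disk figure is consistent with GGUF's Q8_0 layout, which stores each block of 32 weights as 32 one-byte quants plus a 2-byte f16 scale (34 bytes per 32 weights). A back-of-envelope check, assuming roughly 27B parameters:

```rust
// Rough size estimate for a Q8_0 model file (assumes ~27B weights;
// ignores metadata and any tensors kept at higher precision).
fn main() {
    let params: f64 = 27e9;
    let bytes_per_weight = 34.0 / 32.0; // Q8_0: 32 quants + 2-byte scale per block
    let gb = params * bytes_per_weight / 1e9;
    assert!((28.0..30.0).contains(&gb)); // lands near the ~28 GB in the table
}
```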
```sh
# Option 1: helper script
./scripts/download_model.sh

# Option 2: manual
# Download from: huggingface.co/google/gemma-4-27b-it-GGUF
# Place in: ~/.zipcode/models/
```

CPU inference works but is significantly slower. zipcode auto-detects CUDA availability at startup.
| Crate | Purpose |
|---|---|
| candle | Rust-native ML framework for GGUF inference |
| tokenizers | HuggingFace tokenizer |
| clap | CLI argument parsing |
| rustyline | REPL line editing |
| termimad | Markdown to ANSI rendering |
| tokio | Async runtime |
Contributions welcome. Please open an issue first for major changes.
```sh
# Development workflow
cargo test --workspace              # run all tests
cargo clippy --workspace            # lint
cargo fmt --all                     # format
cargo build --release -p zipcode    # build binary
```

Inspired by claw-code by @instructkr.
Built with candle by Hugging Face.