Cross-platform desktop transcription app built with Tauri + Leptos. Transcribe audio files or record live — using local Whisper models or any OpenAI-compatible API — then post-process results with AI prompt templates. One Rust codebase, all desktop platforms.
| Recording | Settings | Post-processing |
|---|---|---|
| ![]() | ![]() | ![]() |
- Local & API transcription — run Whisper locally (tiny, base, small, large-v3-turbo) or hit any OpenAI-compatible endpoint. Automatic model download and caching for local mode.
- File import — drag in WAV, MP3, FLAC, OGG, or M4A files. Large files are auto-compressed to MP3 before API upload.
- Live recording — push-to-talk or toggle mode with a configurable global hotkey. Works even when the app is in the background.
- Meeting capture — dual-stream mode records microphone and system audio simultaneously, then mixes them into one file for transcription.
- Real-time streaming — transcription segments appear as they're produced, with progress updates.
- Post-processing pipeline — send transcripts through AI prompts. Ships with built-in templates (cleanup, meeting notes, summary) and supports custom user templates. Run post-processing locally with a bundled llama-server sidecar (no API key needed) or via any OpenAI-compatible API.
- Secure API key storage — credentials are stored in the system keyring, not in config files.
- Persistent settings — provider, model, device, hotkey, and API config auto-save with debouncing.
- Cross-platform — macOS, Windows, and Linux from one codebase.
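The dual-stream meeting capture above ends with a mixdown step. As a minimal illustration, here is a sketch of mixing two mono `f32` sample buffers by averaging; the app's real mixer lives in `live_recording/` and also has to handle resampling and channel layouts, which this deliberately ignores:

```rust
/// Mix two mono f32 sample buffers into one by averaging.
/// Where one stream is shorter, the tail of the longer stream
/// passes through unchanged. Illustrative sketch only — not the
/// app's actual mixing code.
fn mix_streams(mic: &[f32], system: &[f32]) -> Vec<f32> {
    let len = mic.len().max(system.len());
    (0..len)
        .map(|i| {
            let a = mic.get(i).copied().unwrap_or(0.0);
            let b = system.get(i).copied().unwrap_or(0.0);
            if i < mic.len() && i < system.len() {
                (a + b) / 2.0
            } else {
                a + b
            }
        })
        .collect()
}

fn main() {
    let mic = [0.5_f32, 0.5, 0.5];
    let sys = [0.1_f32, 0.3];
    println!("{:?}", mix_streams(&mic, &sys)); // [0.3, 0.4, 0.5]
}
```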
| Layer | Technology |
|---|---|
| Frontend | Leptos (Rust → WASM), bundled with Trunk |
| Desktop runtime | Tauri 2 |
| Local transcription | whisper-rs (whisper.cpp bindings) |
| Local LLM | llama-server sidecar (llama.cpp, OpenAI-compatible HTTP API) |
| Audio I/O | CPAL for capture, Symphonia for decoding |
| API transport | reqwest with multipart streaming |
| Settings | JSON config + system keyring for secrets |
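Because remote providers and the bundled llama-server expose the same OpenAI-compatible HTTP surface, one request shape serves both. As a rough sketch, this builds a chat-completion body as a raw JSON string using only the standard library (the app presumably serializes with a proper JSON library; the field names follow the OpenAI chat API, which llama-server also accepts):

```rust
/// Build an OpenAI-compatible /v1/chat/completions request body.
/// Illustrative only: a real implementation would use a JSON
/// serializer instead of string formatting.
fn chat_body(model: &str, system: &str, user: &str) -> String {
    // Naive escaping of backslashes and quotes only.
    let esc = |s: &str| s.replace('\\', "\\\\").replace('"', "\\\"");
    format!(
        r#"{{"model":"{}","messages":[{{"role":"system","content":"{}"}},{{"role":"user","content":"{}"}}]}}"#,
        esc(model),
        esc(system),
        esc(user)
    )
}

fn main() {
    // The model name here is a placeholder, not a real identifier.
    let body = chat_body("local-model", "Clean up this transcript.", "um, so, hello");
    println!("{}", body);
}
```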
```
frontend/src/          Leptos UI components and state
src-tauri/src/
  commands.rs          Tauri IPC command handlers
  transcription.rs     Transcription orchestration
  llm_engine.rs        llama-server sidecar lifecycle and chat completion
  live_recording/      Audio capture and mixing
  providers/           Backend adapters (Whisper, OpenAI, local LLM)
  settings.rs          Persistent config management
```
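`llm_engine.rs` manages the llama-server sidecar's lifecycle. A minimal sketch of constructing the spawn command with `std::process` (the binary name, flags, and port here are assumptions for illustration, not the app's actual configuration, and the real code also handles readiness polling and shutdown):

```rust
use std::process::Command;

/// Build the command that would launch a llama-server sidecar.
/// Paths and flags are illustrative assumptions.
fn llama_server_command(model_path: &str, port: u16) -> Command {
    let mut cmd = Command::new("llama-server");
    cmd.args(["-m", model_path, "--port", &port.to_string()]);
    cmd
}

fn main() {
    let cmd = llama_server_command("models/example.gguf", 8080);
    // Inspect the arguments without spawning anything.
    let args: Vec<String> = cmd
        .get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect();
    println!("{:?}", args);
    // To actually launch the sidecar: let child = cmd.spawn()?;
}
```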
Prerequisites:
- Rust stable
- `wasm32-unknown-unknown` target
- `trunk`, `tauri-cli`, `cargo-make`
- `cmake` (required to build whisper.cpp from source)
- A C/C++ compiler (Xcode Command Line Tools on macOS, MSVC on Windows, `gcc`/`g++` on Linux)
- Platform dependencies required by Tauri
Commands:

```sh
cargo install cargo-make --locked
cargo make setup
./scripts/download-llama-server.sh  # download llama-server sidecar binary
cargo make dev
```

Production build:

```sh
cargo make build
```

Useful task shortcuts:
- `cargo make setup`: install the WASM target plus required CLI tools
- `cargo make dev`: run the Tauri desktop app in development mode
- `cargo make build`: build production desktop bundles
- `cargo make build-frontend`: build the Leptos frontend only
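The shortcuts above are cargo-make tasks defined in a `Makefile.toml`. A hypothetical fragment showing how such tasks are typically wired together (the task names match the list above, but the bodies are assumptions, not this repo's actual configuration):

```toml
# Hypothetical Makefile.toml sketch — see the repo for the real tasks.
[tasks.setup]
script = [
    "rustup target add wasm32-unknown-unknown",
    "cargo install trunk tauri-cli --locked",
]

[tasks.build-frontend]
cwd = "frontend"
command = "trunk"
args = ["build", "--release"]

[tasks.dev]
command = "cargo"
args = ["tauri", "dev"]

[tasks.build]
dependencies = ["build-frontend"]
command = "cargo"
args = ["tauri", "build"]
```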
This repo is intentionally Rust-first to reduce future JavaScript ecosystem maintenance. The UI is simple enough that Leptos and Trunk are a good fit, while native-heavy concerns such as audio capture, hotkeys, settings persistence, and local model orchestration still live in the Tauri backend.


