My Translator is a real-time speech translation desktop app built with Tauri. It captures audio directly from your system or microphone, transcribes it, and displays translations in a minimal overlay — with no intermediary server involved.
📖 Installation guides: macOS (EN) · macOS (VI) · Windows (EN) · Windows (VI)

    System Audio / Mic → 16kHz PCM → Soniox API (STT + Translation) → Overlay UI
                                                 ↓ (optional)
                                     TTS (Edge/Google/ElevenLabs) → 🔊

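The "16kHz PCM" stage of the pipeline can be illustrated with a minimal sketch. This is not the app's actual code: the function name `to_16k_mono_pcm` is hypothetical, and it assumes the capture layer delivers interleaved `f32` samples in the range -1.0 to 1.0. It downmixes to mono, resamples with plain linear interpolation, and converts to 16-bit PCM. A production resampler would normally low-pass filter before downsampling to avoid aliasing.

```rust
/// Hypothetical helper: convert interleaved f32 audio to 16 kHz mono 16-bit PCM.
/// Assumes samples are in [-1.0, 1.0]; out-of-range values are clamped.
pub fn to_16k_mono_pcm(input: &[f32], channels: usize, src_rate: u32) -> Vec<i16> {
    const TARGET_RATE: u32 = 16_000;
    if channels == 0 || src_rate == 0 {
        return Vec::new();
    }

    // 1. Downmix: average all channels of each frame into one mono sample.
    let mono: Vec<f32> = input
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect();
    if mono.is_empty() {
        return Vec::new();
    }

    // 2. Resample: walk the source at a fractional step and interpolate linearly.
    let out_len = (mono.len() as u64 * TARGET_RATE as u64 / src_rate as u64) as usize;
    let step = src_rate as f64 / TARGET_RATE as f64;
    let mut out = Vec::with_capacity(out_len);
    for i in 0..out_len {
        let pos = i as f64 * step;
        let idx = pos.floor() as usize;
        let frac = (pos - idx as f64) as f32;
        let a = mono[idx];
        let b = mono[(idx + 1).min(mono.len() - 1)];
        let sample = a + (b - a) * frac;

        // 3. Quantize: scale to the i16 range.
        out.push((sample.clamp(-1.0, 1.0) * i16::MAX as f32).round() as i16);
    }
    out
}
```

Under these assumptions, 480 mono samples captured at 48 kHz (10 ms of audio) come out as 160 samples at 16 kHz.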
| Feature | Detail |
|---|---|
| Latency | ~2–3s |
| Languages | 70+ (source) → any target, one-way & two-way |
| Cost | ~$0.12/hr (Soniox API) |
| TTS | 3 providers (Edge free, Google, ElevenLabs) |
| Platform | macOS (ARM + Intel) · Windows |
| Signed | ✅ macOS signed & notarized |
| Auto-Update | ✅ Built-in, check & install from Settings |
Two display modes:
- Single (default) — Translation text only, clean and focused
- Dual — Source | Translation side-by-side, each panel scrolls independently
Toggle with the panel button (bottom-right on hover).
The view auto-scrolls only when you're already at the bottom, so you can scroll up to read earlier content without being yanked back down.
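The auto-scroll rule above can be sketched as a small piece of state logic. This is an illustration only: the `ScrollState` struct, its field names, and the 4 px tolerance are assumptions, and the app's real logic most likely lives in the WebView frontend rather than in Rust.

```rust
/// Hypothetical scroll metrics of a transcript panel, in pixels.
#[derive(Debug, Clone, Copy)]
pub struct ScrollState {
    /// Distance scrolled from the top of the content.
    pub scroll_top: f64,
    /// Height of the visible area.
    pub viewport_height: f64,
    /// Total height of the content.
    pub content_height: f64,
}

/// Assumed tolerance so sub-pixel rounding does not count as "scrolled up".
const BOTTOM_TOLERANCE_PX: f64 = 4.0;

impl ScrollState {
    /// True when the viewport is (almost) flush with the end of the content.
    pub fn is_at_bottom(&self) -> bool {
        let distance_to_bottom =
            self.content_height - (self.scroll_top + self.viewport_height);
        distance_to_bottom <= BOTTOM_TOLERANCE_PX
    }

    /// Returns the scroll position to use after new text grows the content.
    /// Follows the new bottom only if the reader was already at the bottom.
    pub fn on_content_appended(&self, new_content_height: f64) -> f64 {
        if self.is_at_bottom() {
            (new_content_height - self.viewport_height).max(0.0)
        } else {
            self.scroll_top
        }
    }
}
```

The key design point is that the "at bottom" check runs against the state before the new text arrives; checking afterwards would always report that the reader has fallen behind.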
A- / A+ floating controls (bottom-right on hover). Font size adjustable up to 140px — great for presentations.
Translate conversations between two languages simultaneously — ideal for bilingual meetings.
- One-way: Source language → Target language (e.g., Japanese → Vietnamese)
- Two-way: Language A ↔ Language B (e.g., Vietnamese ↔ Japanese) — the app detects who is speaking and translates to the other language automatically
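The two-way rule can be expressed as a tiny lookup: whichever of the two configured languages is detected, translate into the other. The sketch below is illustrative only. The function name `target_language` and the use of plain language codes such as `"vi"` and `"ja"` are assumptions, and in practice this decision may be made by the speech API itself rather than by the app.

```rust
/// Hypothetical helper: pick the translation target for one utterance
/// in two-way mode. Returns `None` if the detected language is neither
/// Language A nor Language B, in which case no translation is chosen.
pub fn target_language<'a>(
    detected: &str,
    lang_a: &'a str,
    lang_b: &'a str,
) -> Option<&'a str> {
    if detected.eq_ignore_ascii_case(lang_a) {
        Some(lang_b) // speaker used A, so translate into B
    } else if detected.eq_ignore_ascii_case(lang_b) {
        Some(lang_a) // speaker used B, so translate into A
    } else {
        None
    }
}
```

With Language A set to `"vi"` and Language B set to `"ja"`, a Japanese utterance maps to Vietnamese and a Vietnamese utterance maps to Japanese.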
Setup for video calls (Zoom, Google Meet, MS Teams):
- Audio Source: Both (System + Mic)
- Translation Type: Two-way
- Set Language A and Language B
Note: TTS narration is automatically disabled in two-way mode to prevent audio feedback loops (TTS output → mic recapture → re-translation).
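The rule in the note above amounts to a simple gate: the user's TTS toggle only takes effect in one-way mode. A minimal sketch, with the enum `TranslationMode` and the function `tts_active` being hypothetical names rather than the app's actual identifiers:

```rust
/// Hypothetical representation of the Translation Type setting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TranslationMode {
    OneWay,
    TwoWay,
}

/// Decide whether translations should be spoken aloud.
/// In two-way mode narration is always off, because spoken output could be
/// recaptured by the microphone and translated again.
pub fn tts_active(mode: TranslationMode, user_toggle_on: bool) -> bool {
    match mode {
        TranslationMode::OneWay => user_toggle_on,
        TranslationMode::TwoWay => false,
    }
}
```

Keeping the user's toggle as a separate input means the preference survives a temporary switch to two-way mode and applies again when one-way mode is restored.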
Read translations aloud in one-way mode — 3 providers:
| | Edge TTS ⭐ | Google Chirp 3 HD | ElevenLabs |
|---|---|---|---|
| Cost | Free | Free 1M chars/mo | ~$5/mo+ |
| Quality | ★★★★☆ Neural | ★★★★★ Near-human | ★★★★★ Premium |
| Vietnamese | ✅ 2 voices | ✅ 6 voices | ✅ Yes |
| Setup | None | Google Cloud API key | API key |
| Speed control | ✅ | ✅ 0.5x–2.0x | ❌ |
TTS is OFF by default — toggle with the TTS button or ⌘ T.
📖 TTS guide: English · Tiếng Việt
Define how domain-specific words should be translated:
Original sin = Tội nguyên tổ
Christ = Kitô
Pneumonia = Viêm phổi
Add terms in Settings → Translation → Translation terms. Great for religious, medical, or technical content.
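For illustration, term lines in the `source = target` form shown above could be parsed as follows. This is a sketch under assumptions: the function name `parse_terms` is hypothetical, each term is assumed to sit on its own line, and the split happens at the first `=`. How the app actually stores terms or passes them to the translation API is not shown here.

```rust
/// Hypothetical parser for glossary lines of the form `source = target`.
/// Blank lines, lines without `=`, and lines with an empty side are skipped.
pub fn parse_terms(input: &str) -> Vec<(String, String)> {
    input
        .lines()
        .filter_map(|line| {
            // Split at the first '=' only, so targets may contain '=' themselves.
            let (source, target) = line.split_once('=')?;
            let (source, target) = (source.trim(), target.trim());
            if source.is_empty() || target.is_empty() {
                None
            } else {
                Some((source.to_string(), target.to_string()))
            }
        })
        .collect()
}
```

Trimming both sides means `Christ = Kitô` and `Christ=Kitô` produce the same pair.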
Experimental offline mode using MLX + Whisper + Gemma — runs 100% on-device. JA/EN/ZH/KO → VI/EN.
Your audio never touches our servers — because there are none.
- App connects directly to APIs you configure — no relay, no middleman
- You own your API keys — stored locally, never transmitted elsewhere
- No account, no telemetry, no analytics — zero tracking
- Transcripts saved as `.md` files locally, per session
- Tauri 2 — Rust backend + WebView frontend
- ScreenCaptureKit — macOS system audio
- WASAPI — Windows system audio
- cpal — Cross-platform microphone
- Soniox — Real-time STT + translation
- Edge TTS — Free neural TTS (default)
- Google Cloud TTS — Chirp 3 HD (near-human quality)
- ElevenLabs — Premium TTS
git clone https://github.com/phuc-nt/my-translator.git
cd my-translator
npm install
npm run tauri build

Requires: Rust (stable), Node.js 18+, macOS 13+ or Windows 10+.
MIT
