___      ___  ________     ___    ___ ________  ___  ___  _______   ________  ________  ________     
|\  \    /  /||\   __  \   |\  \  /  /|\   ____\|\  \|\  \|\  ___ \ |\   __  \|\   __  \|\   __  \    
\ \  \  /  / /\ \  \|\  \  \ \  \/  / | \  \___|\ \  \\\  \ \   __/|\ \  \|\  \ \  \|\  \ \  \|\  \   
 \ \  \/  / /  \ \  \\\  \  \ \    / / \ \_____  \ \   __  \ \  \_|/_\ \   _  _\ \   ____\ \   __  \  
  \ \    / /    \ \  \\\  \  /     \/   \|____|\  \ \  \ \  \ \  \_|\ \ \  \\  \\ \  \___|\ \  \ \  \ 
   \ \__/ /      \ \_______\/  /\   \     ____\_\  \ \__\ \__\ \_______\ \__\\ _\\ \__\    \ \__\ \__\
    \|__|/        \|_______/__/ /\ __\   |\_________\|__|\|__|\|_______|\|__|\|__|\|__|     \|__|\|__|
                           |__|/ \|__|   \|_________|                                                   
  

Offline Neural TTS · Android · No Cloud · No Limits

Download APK v1.0-beta · GitHub · GPL v3.0 · Android 11+ · F-Droid (coming soon)

VoxSherpa TTS runs studio-quality neural text-to-speech entirely on your Android device.

Powered by Sherpa-ONNX — supports Kokoro-82M, Piper, and VITS engines.
Hindi, English, Japanese, Chinese and more: 50+ languages, with no internet required once models are downloaded.

See it in action

Generate
Models
Library
Settings

Everything you need.
Nothing you don't.

🔒
100% Offline & Private
All processing on your device. No internet after model download. No account, no telemetry. Your text never leaves your phone.
🌐
50+ Languages
Hindi, English (including British English), Japanese, Chinese, French, Spanish and more. First offline Android TTS with serious Hindi support.
📦
Flexible Model Import
Download models inside the app or import your own .onnx files from local storage. Multiple models simultaneously.
🎧
Audio Controls
Real-time waveform visualization, adjustable speed & pitch, play/pause/replay, export as WAV with correct sample rate per model.
📚
Speech Library
Save all generated audio locally. Favorites system, generation history with timestamps, voice model attribution per recording.
⚙️
Emotion Tags
Smart Punctuation for natural pauses. Express emotion with [whisper] [angry] [happy] tags. 100+ Kokoro voices.
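
The Audio Controls feature above mentions WAV export with the correct sample rate per model. As a rough illustration of why the rate matters, here is a minimal Java sketch that wraps 16-bit mono PCM samples in a standard 44-byte RIFF/WAVE header. The class name `WavExportSketch`, the method `toWav`, and the sample rates used in `main` are assumptions made for illustration; this is not VoxSherpa's actual export code, which may be structured differently.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class WavExportSketch {

    /**
     * Wraps 16-bit mono PCM samples in a 44-byte RIFF/WAVE header.
     * The sample rate is a parameter so each model's own rate can be written.
     */
    public static byte[] toWav(short[] pcm, int sampleRate) {
        final int channels = 1;
        final int bitsPerSample = 16;
        final int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        final int byteRate = sampleRate * blockAlign;        // bytes per second
        final int dataSize = pcm.length * blockAlign;

        // WAV fields are little-endian.
        ByteBuffer buf = ByteBuffer.allocate(44 + dataSize).order(ByteOrder.LITTLE_ENDIAN);
        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + dataSize);                 // file size minus the first 8 bytes
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                            // size of the fmt chunk for PCM
        buf.putShort((short) 1);                   // audio format 1 = uncompressed PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRate);
        buf.putInt(byteRate);
        buf.putShort((short) blockAlign);
        buf.putShort((short) bitsPerSample);
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(dataSize);
        for (short sample : pcm) {
            buf.putShort(sample);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        // Example rates only (assumed); read the real value from each model's config.
        short[] oneSecondAt24k = new short[24000];
        short[] oneSecondAt22k = new short[22050];
        System.out.println(toWav(oneSecondAt24k, 24000).length + " bytes");
        System.out.println(toWav(oneSecondAt22k, 22050).length + " bytes");
    }
}
```

If the rate written to the header does not match the rate the model actually produced, players render the clip too fast or too slow, which is why the sketch takes the rate as an argument instead of hard-coding it.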

Two engines.
One app.

⚡ Fast

Piper / VITS

Lightweight and fast. Generates natural speech in seconds on any Android device — perfect for daily use.

Fast on budget hardware
Natural, clear output
Best for quick synthesis
Low memory footprint

Speak every language
offline.

🇮🇳 Hindi 🇬🇧 English 🇬🇧 British English 🇯🇵 Japanese 🇨🇳 Chinese 🇫🇷 French 🇪🇸 Spanish 🇩🇪 German 🇵🇹 Portuguese 🇮🇹 Italian 🇰🇷 Korean 🇷🇺 Russian 🇵🇱 Polish + 37 more

What's inside the build?

tech-stack.json
[$] cat tech-stack.json
 
{
  "language": "Java",
  "platform": "Android 11+",
  "built_with": "Sketchware Pro",
  "inference": "Sherpa-ONNX · ONNX Runtime",
  "tts_engines": ["Kokoro-82M", "Piper", "VITS"],
  "audio_api": "Android AudioTrack (PCM)",
  "theme": "Dark Navy #0B1220 + #1D61FF",
  "license": "GNU GPL v3.0"
}
 

Honest numbers.
Real hardware.

Device Tier              Kokoro-82M                  Piper / VITS
Flagship (SD 8 Gen 3)    ~20–40 sec / min audio      ~5 sec / min audio
Mid-range (8-core)       ~60–90 sec / min audio      ~10 sec / min audio
Budget (6-core)          ~2–3 min / min audio        ~20 sec / min audio
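
The figures above are processing time per minute of generated audio, which is easier to compare as a real-time factor (RTF): compute seconds divided by audio seconds, where anything below 1.0 is faster than real time. The following Java sketch does that arithmetic using the table's own numbers. The class and method names are hypothetical and exist only for this illustration.

```java
public class RealTimeFactorSketch {

    /** Real-time factor: seconds of compute needed per second of audio produced. */
    public static double realTimeFactor(double computeSeconds, double audioSeconds) {
        if (audioSeconds <= 0) {
            throw new IllegalArgumentException("audioSeconds must be positive");
        }
        return computeSeconds / audioSeconds;
    }

    /** Rough wait estimate, in seconds, for a clip of the given length. */
    public static double estimateWaitSeconds(double rtf, double audioSeconds) {
        return rtf * audioSeconds;
    }

    public static void main(String[] args) {
        // Values taken from the table: compute seconds per 60 seconds of audio.
        double flagshipKokoroBest = realTimeFactor(20, 60);
        double flagshipKokoroWorst = realTimeFactor(40, 60);
        double budgetKokoroWorst = realTimeFactor(180, 60);
        double budgetPiper = realTimeFactor(20, 60);

        System.out.println("Flagship Kokoro-82M RTF: " + flagshipKokoroBest
                + " to " + flagshipKokoroWorst);
        System.out.println("Budget Kokoro-82M worst-case RTF: " + budgetKokoroWorst);
        System.out.println("Budget Piper / VITS RTF: " + budgetPiper);

        // Estimated wait for a 30-second clip on a budget device with Kokoro-82M.
        System.out.println("30 s clip, budget Kokoro-82M worst case: "
                + estimateWaitSeconds(budgetKokoroWorst, 30) + " s");
    }
}
```

Read this way, Kokoro-82M on a flagship lands at roughly 0.33 to 0.67, while on a budget device it sits at 2.0 to 3.0, so it runs slower than real time there. Piper / VITS stays below 1.0 on every tier listed.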

Open Source —
contributions welcome

1. Fork the repo
2. git checkout -b feature/YourFeature
3. git commit -m "Add YourFeature"
4. git push origin feature/YourFeature
5. Open a Pull Request

Found a bug? → Open an Issue

GNU GPL v3.0

VoxSherpa TTS — Copyright (C) 2025 CodeBySonu95

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See the full LICENSE file for details.