On-device AI · whisper.cpp

On-device AI transcription. No cloud required.

Run whisper.cpp on Android with one Gradle line. 99 languages, on-device — your audio never leaves the phone. No NDK setup, no source build, no API key. Start free, go Pro for real-time streaming.

Choose your build → Free on GitHub ↗

// on-device AI · arm64-v8a · API 24+ · Android 15 ready · 16 KB aligned

MainActivity.kt

// load a model (downloaded separately, ~75–466 MB)
val model = Whisper.loadModel(context, "ggml-base.bin")

// transcribe: 99 languages, auto-detect
val result = Whisper.transcribe(model, "audio.wav")

// segments with timestamps
result.segments.forEach { seg ->
    Log.d("Whisper", "[${seg.startMs}→${seg.endMs}] ${seg.text}")
}

✓ On-device · no cloud · no API key TRANSCRIBED
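
Timestamped segments map naturally onto subtitle cues. The sketch below turns a list of segments into SRT text. It is only an illustration: the `Segment` class is a hypothetical stand-in for whatever type the library returns, and the `Long` millisecond fields are an assumption based on the `startMs` / `endMs` / `text` names in the sample.

```kotlin
import java.util.Locale

// Hypothetical stand-in for the library's segment type; only the field names
// startMs, endMs and text come from the sample, the types are assumed.
data class Segment(val startMs: Long, val endMs: Long, val text: String)

// Format milliseconds as an SRT timestamp: HH:MM:SS,mmm
fun srtTime(ms: Long): String {
    val h = ms / 3_600_000
    val m = (ms / 60_000) % 60
    val s = (ms / 1_000) % 60
    val milli = ms % 1_000
    // Locale.ROOT keeps the digits ASCII regardless of the device locale.
    return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d", h, m, s, milli)
}

// Render segments as numbered SRT cues separated by blank lines.
fun toSrt(segments: List<Segment>): String =
    segments.mapIndexed { i, seg ->
        "${i + 1}\n${srtTime(seg.startMs)} --> ${srtTime(seg.endMs)}\n${seg.text.trim()}"
    }.joinToString("\n\n")
```

In an app, the library's own segments would first be mapped into this shape, for example `result.segments.map { Segment(it.startMs, it.endMs, it.text) }`, if the field types line up.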

Why on-device

Your audio stays on the phone.

Cloud APIs send your audio to a server. Whisper Android runs the AI model locally: nothing uploaded, nothing stored, nothing billed per minute. Private by design, works offline, and no network round trip to add latency.

On-device AI · OpenAI Whisper model
99 languages · auto-detect
Private · audio never leaves the device
Offline · no internet required
No API key · no per-minute billing
Prebuilt .aar · no NDK, no source build
arm64-v8a · NEON optimized · 16 KB aligned
Maven Central · dev.ffmpegkit-maintained

Free & Pro

Transcribe files for free. Stream in real time with Pro.

The free tier gives you full whisper.cpp transcription in one Gradle line. Pro adds real-time streaming from the microphone with AI-powered voice detection (Silero VAD), so you can build dictation, live subtitles and voice assistants.

Free MIT

$0 · Maven / JitPack / GitHub

Full whisper.cpp transcription (file → text)
99 languages · auto-detect · translate to English
Segments with timestamps · processing time
Clean Kotlin coroutine API
arm64-v8a · CPU NEON optimized

Get free on GitHub ↗

Dual AI

Pro bundled MIT

$24 · $62 team (5)

Everything in Free
Real-time streaming (mic → text, VAD-segmented)
Silero VAD bundled — AI voice detection, skip silence
Dual AI pipeline: Silero detects speech, Whisper transcribes
arm64-v8a + x86_64 (emulators, Chromebooks)
Quantized model support (q4_0, q5_1, q8_0)

Get Pro — code applied
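
To picture what "VAD-segmented" means, here is a rough, library-independent sketch. It is not the Pro API. It assumes the voice detector emits one speech probability per fixed-length audio frame and groups those frames into speech spans that could then be handed to the transcriber. The frame length, threshold and silence hangover are illustrative values, and `SpeechSpan` and `segmentSpeech` are hypothetical names.

```kotlin
// One detected stretch of speech, in milliseconds from the start of the stream.
data class SpeechSpan(val startMs: Long, val endMs: Long)

/**
 * Group per-frame speech probabilities into spans.
 * A span opens at the first frame at or above [threshold] and closes once
 * [minSilenceFrames] consecutive frames have fallen below it.
 */
fun segmentSpeech(
    probs: List<Float>,
    frameMs: Long = 32,        // illustrative frame length
    threshold: Float = 0.5f,   // illustrative speech threshold
    minSilenceFrames: Int = 3  // illustrative hangover before a span closes
): List<SpeechSpan> {
    val spans = mutableListOf<SpeechSpan>()
    var start = -1       // index of the first frame of the open span, -1 if none
    var lastSpeech = -1  // index of the most recent speech frame in the open span
    for ((i, p) in probs.withIndex()) {
        if (p >= threshold) {
            if (start < 0) start = i
            lastSpeech = i
        } else if (start >= 0 && i - lastSpeech >= minSilenceFrames) {
            // Enough silence: close the span at the end of the last speech frame.
            spans += SpeechSpan(start * frameMs, (lastSpeech + 1) * frameMs)
            start = -1
        }
    }
    // Close a span that is still open when the input ends.
    if (start >= 0) spans += SpeechSpan(start * frameMs, (lastSpeech + 1) * frameMs)
    return spans
}
```

Only the audio inside each span would need to reach Whisper, which is how skipping silence saves work.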

Choose a model

Downloaded separately — pick what fits your app.

Models are not bundled in the AAR (too large). Download the one that matches your speed/quality tradeoff and ship it with your app or download it at first launch. All models support 99 languages.

Model guide

tiny · ~75 MB · fastest, good for real-time on mid-range phones
base · ~142 MB · balanced, best speed/quality tradeoff
small · ~466 MB · slower, best quality, needs a flagship phone

// download from Hugging Face (ggml format)

→ huggingface.co/ggerganov/whisper.cpp
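
If you download at first launch rather than shipping the model in the APK, something along these lines could work. It is a minimal sketch, not part of the library: `ensureModel` is a hypothetical helper, and the base URL assumes the repository above serves files under Hugging Face's usual `resolve/main/` path.

```kotlin
import java.io.File
import java.io.InputStream
import java.net.URL

// Assumed download location (Hugging Face's usual resolve/main file layout).
const val MODEL_BASE_URL = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/"

/**
 * Return the model file in [dir], fetching it first if it is missing.
 * [open] supplies the bytes for a model name; by default it opens an HTTPS stream.
 */
fun ensureModel(
    dir: File,
    name: String,
    open: (String) -> InputStream = { URL(MODEL_BASE_URL + it).openStream() }
): File {
    val target = File(dir, name)
    if (target.exists() && target.length() > 0) return target

    dir.mkdirs()
    target.delete() // clear a zero-length leftover, no-op if the file is absent
    // Write to a temp file first so an interrupted download never looks complete.
    val tmp = File(dir, "$name.part")
    open(name).use { input ->
        tmp.outputStream().use { output -> input.copyTo(output) }
    }
    check(tmp.renameTo(target)) { "could not move $tmp to $target" }
    return target
}
```

On Android this would run off the main thread, for example `ensureModel(context.filesDir, "ggml-base.bin")` from a background coroutine. Where `Whisper.loadModel` expects to find the file is not stated here, so check that against the library's documentation.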

One Gradle line

No NDK. No source build. Just add the dependency.

The prebuilt AAR includes the compiled libwhisper.so and the Kotlin API. Add one line to your build file, download a model, transcribe.

app/build.gradle.kts

// Free: Maven Central
implementation("dev.ffmpegkit-maintained:whisper-android:1.0.0")

// Free: JitPack (alternative)
implementation("com.github.ffmpegkit-maintained:whisper:v1.0.0")

// Pro: real-time streaming + Silero VAD
implementation("dev.ffmpegkit-maintained:whisper-android-pro:1.0.0")
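
The JitPack coordinate only resolves if the JitPack repository is declared, which a default Android project does not do. A typical setup looks like the sketch below. It assumes your project declares repositories centrally in settings.gradle.kts; older projects may declare them in the root build file instead.

```kotlin
// settings.gradle.kts
dependencyResolutionManagement {
    repositories {
        google()
        mavenCentral()
        // Only needed when using the JitPack coordinate.
        maven { url = uri("https://jitpack.io") }
    }
}
```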

Works with FFmpegKit

Decode any audio → transcribe with Whisper.

Whisper needs 16 kHz mono PCM. Pair it with FFmpegKit to decode MP4, MKV, AAC, OGG or any other format FFmpeg can read into a 16 kHz mono WAV that Whisper can transcribe: a full media-to-text pipeline on device.
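
As a rough sketch of that conversion step, the helpers below build the FFmpeg arguments for a 16 kHz mono WAV and sanity-check the resulting header before transcription. The flags are standard FFmpeg options. The function names are hypothetical, 16-bit samples are an assumption (the text above only says 16 kHz mono PCM), and the header check assumes the canonical layout where the `fmt ` chunk directly follows the RIFF header.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// FFmpeg arguments that resample a decodable input to 16 kHz mono 16-bit PCM WAV.
fun whisperWavArgs(input: String, output: String): List<String> = listOf(
    "-y",                 // overwrite the output file if it exists
    "-i", input,
    "-vn",                // drop any video stream
    "-ar", "16000",       // 16 kHz sample rate
    "-ac", "1",           // mono
    "-c:a", "pcm_s16le",  // signed 16-bit little-endian PCM (assumed sample format)
    output
)

// Check the first 44 bytes of a WAV file for 16 kHz, mono, 16-bit PCM.
fun isWhisperReadyWav(header: ByteArray): Boolean {
    if (header.size < 44) return false
    val riff = String(header, 0, 4, Charsets.US_ASCII)
    val wave = String(header, 8, 4, Charsets.US_ASCII)
    val fmt = String(header, 12, 4, Charsets.US_ASCII)
    if (riff != "RIFF" || wave != "WAVE" || fmt != "fmt ") return false

    val buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN)
    val format = buf.getShort(20).toInt()    // 1 = uncompressed PCM
    val channels = buf.getShort(22).toInt()
    val sampleRate = buf.getInt(24)
    val bits = buf.getShort(34).toInt()
    return format == 1 && channels == 1 && sampleRate == 16_000 && bits == 16
}
```

In upstream FFmpegKit an argument list like this would typically be passed to `FFmpegKit.executeWithArguments`. Whether the dev.ffmpegkit-maintained build keeps that entry point and package name is an assumption to verify.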