On-device AI · whisper.cpp

On-device AI transcription. No cloud required.

Run whisper.cpp on Android with one Gradle line. 99 languages, on-device — your audio never leaves the phone. No NDK setup, no source build, no API key. Start free, go Pro for real-time streaming.

// on-device AI · arm64-v8a · API 24+ · Android 15 ready · 16 KB aligned

MainActivity.kt
// load a model (downloaded separately, ~75-466 MB)
val model = Whisper.loadModel(context, "ggml-base.bin")
// transcribe — 99 languages, auto-detect
val result = Whisper.transcribe(model, "audio.wav")
// segments with timestamps
result.segments.forEach { seg ->
  Log.d("Whisper", "[${seg.startMs}→${seg.endMs}] ${seg.text}")
}
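
The segment timestamps map naturally onto subtitle formats. Here is a minimal sketch of turning segments into SRT text. It assumes a hypothetical `Segment` class that mirrors the `startMs`, `endMs` and `text` fields used in the snippet; the library's actual segment type may differ.

```kotlin
import java.util.Locale

// Hypothetical stand-in for the library's segment type; only the three
// fields used in the snippet above are assumed.
data class Segment(val startMs: Long, val endMs: Long, val text: String)

// Format milliseconds as an SRT timestamp: HH:MM:SS,mmm
fun srtTimestamp(ms: Long): String {
    val hours = ms / 3_600_000
    val minutes = (ms / 60_000) % 60
    val seconds = (ms / 1_000) % 60
    val millis = ms % 1_000
    // Locale.ROOT keeps the digits ASCII regardless of the device locale.
    return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d", hours, minutes, seconds, millis)
}

// Build an SRT document: numbered cues separated by blank lines.
fun toSrt(segments: List<Segment>): String =
    segments.mapIndexed { i, seg ->
        "${i + 1}\n${srtTimestamp(seg.startMs)} --> ${srtTimestamp(seg.endMs)}\n${seg.text.trim()}\n"
    }.joinToString("\n")
```

In an app, the list would be built from `result.segments` and the returned string written to a `.srt` file next to the media.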
Why on-device

Your audio stays on the phone.

Cloud APIs send your audio to a server. Whisper Android runs the AI model locally: nothing uploaded, nothing stored, nothing billed per minute. Private by design, works offline, and no network round trip to a server.

  • On-device AI · OpenAI Whisper model
  • 99 languages · auto-detect
  • Private · audio never leaves the device
  • Offline · no internet required
  • No API key · no per-minute billing
  • Prebuilt .aar · no NDK, no source build
  • arm64-v8a · NEON optimized · 16 KB aligned
  • Maven Central · dev.ffmpegkit-maintained
Free & Pro

Transcribe files for free. Stream in real time with Pro.

The free tier gives you full whisper.cpp transcription in one Gradle line. Pro adds real-time streaming from the microphone with AI-powered voice detection (Silero VAD), so you can build dictation, live subtitles and voice assistants.

Launch offer — code JOKOBEE10 takes $10 off Pro at checkout.
Free MIT
$0 · Maven / JitPack / GitHub
  • Full whisper.cpp transcription (file → text)
  • 99 languages · auto-detect · translate to English
  • Segments with timestamps · processing time
  • Clean Kotlin coroutine API
  • arm64-v8a · CPU NEON optimized
Dual AI
Pro bundled MIT
$24 · $62 team (5)
  • Everything in Free
  • Real-time streaming (mic → text, VAD-segmented)
  • Silero VAD bundled — AI voice detection, skip silence
  • Dual AI pipeline: Silero detects speech, Whisper transcribes
  • arm64-v8a + x86_64 (emulators, Chromebooks)
  • Quantized model support (q4_0, q5_1, q8_0)
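
The dual pipeline can be pictured as a gate in front of the transcriber: a detector labels each short audio frame as speech or silence, and only contiguous speech runs are handed on. The sketch below illustrates that gating step with a simple RMS energy threshold standing in for Silero VAD, which is a neural model and not an energy detector. `segmentSpeech`, the frame size, the threshold and the hangover count are all hypothetical and not part of the library's API.

```kotlin
import kotlin.math.sqrt

// Root-mean-square energy of one frame of 16-bit PCM samples.
fun rms(frame: ShortArray): Double {
    if (frame.isEmpty()) return 0.0
    var sum = 0.0
    for (s in frame) sum += s.toDouble() * s.toDouble()
    return sqrt(sum / frame.size)
}

// Split PCM into speech runs, returned as sample index ranges.
// Energy thresholding is only a stand-in for a real VAD such as Silero.
fun segmentSpeech(
    pcm: ShortArray,
    frameSize: Int = 512,        // hypothetical frame length in samples
    threshold: Double = 500.0,   // hypothetical energy threshold
    hangoverFrames: Int = 3      // silent frames tolerated before a run is closed
): List<IntRange> {
    val runs = mutableListOf<IntRange>()
    var runStart = -1            // -1 means "not currently inside a speech run"
    var lastSpeechEnd = 0
    var silent = 0
    var pos = 0
    while (pos < pcm.size) {
        val end = minOf(pos + frameSize, pcm.size)
        val isSpeech = rms(pcm.copyOfRange(pos, end)) >= threshold
        if (isSpeech) {
            if (runStart < 0) runStart = pos
            lastSpeechEnd = end
            silent = 0
        } else if (runStart >= 0) {
            silent++
            if (silent > hangoverFrames) {
                // Enough trailing silence: close the run at the last speech frame.
                runs.add(runStart until lastSpeechEnd)
                runStart = -1
            }
        }
        pos = end
    }
    if (runStart >= 0) runs.add(runStart until lastSpeechEnd)
    return runs
}
```

Each returned range is the slice that would be passed on for transcription; the silence between ranges is skipped.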
Choose a model

Downloaded separately — pick what fits your app.

Models are too large to bundle in the AAR. Pick the one that matches your speed/quality tradeoff, then either ship it with your app or download it at first launch. All models support 99 languages.
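
Downloading at first launch usually comes down to a check-then-fetch helper. Here is a minimal sketch using only `java.io` and `java.net`. `ensureModel` and its parameters are hypothetical, not library API. Writing to a temporary file first keeps a half-finished download from being mistaken for a complete model.

```kotlin
import java.io.File
import java.io.IOException
import java.io.InputStream
import java.net.URL

// Return the model file, fetching it first if it is not already in `dir`.
// `open` supplies the byte stream; by default it opens `url` over the network.
fun ensureModel(
    dir: File,
    fileName: String,
    url: String,
    open: (String) -> InputStream = { URL(it).openStream() }
): File {
    val target = File(dir, fileName)
    if (target.isFile && target.length() > 0) return target  // already present

    dir.mkdirs()
    val tmp = File(dir, "$fileName.part")
    open(url).use { input ->
        tmp.outputStream().use { output -> input.copyTo(output) }
    }
    // Only a fully written file is moved into place.
    if (!tmp.renameTo(target)) {
        tmp.delete()
        throw IOException("Could not move $tmp to $target")
    }
    return target
}
```

On Android this would run off the main thread, for example on `Dispatchers.IO`, with `context.filesDir` as `dir`. The URL is an assumption: the ggml files are commonly fetched from the whisper.cpp repository on Hugging Face, with paths of the form `https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin`.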

Model guide
tiny     ~75 MB    fastest   · good for real-time on mid-range phones
base     ~142 MB   balanced  · best speed/quality tradeoff
small    ~466 MB   slower    · best quality, needs a flagship phone
// download from Hugging Face (ggml format)
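
The guide can also be encoded as data so an app can choose a model at runtime. This is a sketch under stated assumptions: the sizes are the approximate figures from the guide, the other file names are assumed to follow the `ggml-base.bin` pattern from the loading example, and `ModelOption`, `pickModel` and the free-space rule are hypothetical illustrations, not library API.

```kotlin
// Approximate sizes from the model guide; file names are assumed to follow
// the "ggml-<name>.bin" pattern used in the loading example.
data class ModelOption(val name: String, val approxSizeMb: Int) {
    val fileName: String get() = "ggml-$name.bin"
}

// Ordered from smallest/fastest to largest/highest quality.
val MODELS = listOf(
    ModelOption("tiny", 75),
    ModelOption("base", 142),
    ModelOption("small", 466),
)

// Pick the largest model whose download fits in the free space while
// keeping `headroomMb` spare. Returns null if not even tiny fits.
fun pickModel(freeSpaceMb: Long, headroomMb: Long = 100): ModelOption? =
    MODELS.lastOrNull { it.approxSizeMb + headroomMb <= freeSpaceMb }
```

Storage is only one input. Per the guide, an app that needs real-time results on mid-range phones may still prefer tiny even when small would fit.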
One Gradle line

No NDK. No source build. Just add the dependency.

The prebuilt AAR includes the compiled libwhisper.so and the Kotlin API. Add one line to your build file, download a model, transcribe.

app/build.gradle.kts
// Free — Maven Central
implementation("dev.ffmpegkit-maintained:whisper-android:1.0.0")
// Free — JitPack (alternative)
implementation("com.github.ffmpegkit-maintained:whisper:v1.0.0")
// Pro — real-time streaming + Silero VAD
implementation("dev.ffmpegkit-maintained:whisper-android-pro:1.0.0")
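
The JitPack coordinate resolves only if the JitPack repository is declared, while the Maven Central coordinates need just `mavenCentral()`. This sketch shows the repository block, assuming a project that declares repositories centrally in `settings.gradle.kts`; older projects may declare them in the root build file instead.

```kotlin
// settings.gradle.kts
dependencyResolutionManagement {
    repositories {
        google()
        mavenCentral()
        // Only needed when using the JitPack coordinate.
        maven { url = uri("https://jitpack.io") }
    }
}
```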
Works with FFmpegKit

Decode any audio → transcribe with Whisper.

Whisper needs 16 kHz mono PCM. Pair it with FFmpegKit to decode MP4, MKV, AAC, OGG or any other format FFmpeg supports into a 16 kHz mono WAV that Whisper can transcribe: a full media-to-text pipeline on device.
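
Here is a sketch of both halves of that conversion step, under stated assumptions. `ffmpegArgs` builds a standard FFmpeg command line: `-ar 16000` resamples, `-ac 1` downmixes to mono, and `-c:a pcm_s16le` writes 16-bit PCM (the 16-bit depth is an assumption here). With FFmpegKit the list would be handed to one of its execute entry points, for example `FFmpegKit.executeWithArguments`, assuming the standard FFmpegKit API. `isWhisperReadyWav` is a hypothetical pre-flight check that reads the canonical WAV header layout; files with extra chunks before `fmt ` would need a fuller parser.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// FFmpeg arguments for converting any decodable input to 16 kHz mono 16-bit PCM WAV.
// -y overwrites an existing output file, -vn drops any video stream.
fun ffmpegArgs(input: String, output: String): List<String> =
    listOf("-y", "-i", input, "-vn", "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", output)

// Check a canonical WAV header (RIFF, WAVE, "fmt " chunk first) for
// PCM format, 1 channel, 16 000 Hz, 16 bits per sample.
fun isWhisperReadyWav(header: ByteArray): Boolean {
    if (header.size < 36) return false
    val tag = { offset: Int -> String(header, offset, 4, Charsets.US_ASCII) }
    if (tag(0) != "RIFF" || tag(8) != "WAVE" || tag(12) != "fmt ") return false
    val buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN)
    val audioFormat = buf.getShort(20).toInt()   // 1 = uncompressed PCM
    val channels = buf.getShort(22).toInt()
    val sampleRate = buf.getInt(24)
    val bitsPerSample = buf.getShort(34).toInt()
    return audioFormat == 1 && channels == 1 && sampleRate == 16_000 && bitsPerSample == 16
}
```

Checking the first bytes of the converted file before transcription turns a silent format mismatch into an explicit error.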