MLX Audio Transcriber
On-device ML, Apple Silicon native
A personal batch audio transcriber that runs Whisper models natively on Apple Silicon through the MLX framework. Zero cloud dependency — audio never leaves the machine. Built because I wanted a local tool for meeting recordings, podcasts, and voice notes without sending files to an external API.
What it is
A command-line tool I use on my own machine to batch-transcribe audio files. It wraps MLX-Whisper, Apple's MLX port of OpenAI's Whisper, so inference runs on the M-series GPU through Metal instead of calling a hosted model. Input is a folder of audio files; output is transcripts. No queue, no web UI, no multi-user anything — just a local batch processor.
This is not a product. It is the script I reach for when I have a recorded meeting, a podcast episode, or a stack of voice memos and want text out without uploading anything.
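The core of a wrapper like this is a single call into MLX-Whisper. The following is a minimal sketch, not the tool's actual code: `transcribe_file`, `DEFAULT_MODEL`, and the `backend` parameter are hypothetical names introduced for illustration, and the model repo id is an assumption. It relies on `mlx_whisper.transcribe`, which takes an audio path plus a `path_or_hf_repo` argument and returns a dict containing a `"text"` field.

```python
from pathlib import Path
from typing import Callable, Optional

# Assumption: a Hugging Face repo id for MLX-converted Whisper weights.
DEFAULT_MODEL = "mlx-community/whisper-tiny"


def transcribe_file(
    audio_path: Path,
    model: str = DEFAULT_MODEL,
    backend: Optional[Callable[..., dict]] = None,
) -> str:
    """Return the transcript text for one audio file.

    `backend` is a hypothetical hook that lets the function be exercised
    without MLX installed; by default it is mlx_whisper.transcribe.
    """
    if backend is None:
        # Imported lazily so this module still loads on machines without MLX.
        import mlx_whisper

        backend = mlx_whisper.transcribe
    result = backend(str(audio_path), path_or_hf_repo=model)
    return result["text"].strip()
```

On an Apple Silicon Mac with `mlx-whisper` installed, `transcribe_file(Path("memo.m4a"))` would run inference locally. If `model` is a repo id rather than a local path, the weights are fetched once and then read from the local cache.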
Key features
- Fully on-device: MLX-Whisper runs Whisper inference on the Apple Silicon GPU via Metal, using the chip's unified memory. Once the model weights are on disk, there are no network calls, no API keys, and no cloud.
- Batch processing: point it at a directory and it transcribes every audio file it finds, rather than one at a time.
- Whisper model family: uses the standard Whisper model weights through MLX, so I pick the model size per job, trading speed against accuracy.
- Daily-driver reliability: I run it regularly on real meeting recordings and podcast files, so my own day-to-day use is the test bed.
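The batch and model-size features above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the tool's real implementation: the extension list, the preset names, the repo ids, the `<name>.txt` output naming, and the skip-if-already-transcribed behavior are all hypothetical choices. The transcription step is passed in as a function, so the folder logic stays independent of MLX.

```python
from pathlib import Path
from typing import Callable, List

# Assumption: which extensions count as audio; the real tool's list may differ.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".wav", ".flac", ".ogg"}

# Assumption: hypothetical preset names mapped to MLX-converted Whisper weights.
MODEL_PRESETS = {
    "fast": "mlx-community/whisper-tiny",
    "accurate": "mlx-community/whisper-large-v3-mlx",
}


def find_audio_files(folder: Path) -> List[Path]:
    """Return the audio files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def transcribe_folder(
    folder: Path, out_dir: Path, transcribe: Callable[[Path], str]
) -> List[Path]:
    """Write one transcript per audio file and return the files written.

    Output is named `<audio file name>.txt` so that `a.mp3` and `a.wav`
    do not collide. Files that already have a transcript are skipped.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for audio in find_audio_files(folder):
        target = out_dir / (audio.name + ".txt")
        if target.exists():
            continue  # already transcribed in an earlier run
        target.write_text(transcribe(audio), encoding="utf-8")
        written.append(target)
    return written


# Example wiring (requires mlx-whisper on an Apple Silicon Mac):
#   import mlx_whisper
#   model = MODEL_PRESETS["accurate"]
#   transcribe_folder(
#       Path("recordings"), Path("transcripts"),
#       lambda p: mlx_whisper.transcribe(str(p), path_or_hf_repo=model)["text"],
#   )
```

Skipping existing transcripts makes reruns over a growing folder cheap, at the cost of needing to delete a transcript to redo it with a larger model.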
Stack
| Layer | Technology |
|---|---|
| Language | Python |
| ML framework | MLX (Apple) |
| Model | Whisper (via MLX-Whisper) |
| Hardware | Apple Silicon (M-series, Metal) |
Links
- Status: Personal tool, not published.