MLX Audio Transcriber

stable · Updated 2026-04-19

A personal batch audio transcriber that runs Whisper models natively on Apple Silicon through the MLX framework. Zero cloud dependency — audio never leaves the machine. Built because I wanted a local tool for meeting recordings, podcasts, and voice notes without sending files to an external API.

What it is

A command-line tool I use on my own machine to batch-transcribe audio files. It wraps MLX-Whisper, Apple's MLX port of OpenAI's Whisper, so inference runs on the M-series GPU through Metal instead of calling a hosted model. Input is a folder of audio files; output is transcripts. No queue, no web UI, no multi-user anything — just a local batch processor.

This is not a product. It is the script I reach for when I have a recorded meeting, a podcast episode, or a stack of voice memos and want text out without uploading anything.
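The folder-in, transcripts-out flow described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the extension list, the function names (`find_audio_files`, `mlx_transcribe`, `transcribe_folder`), and the one-`.txt`-per-file output layout are all assumptions. The only library call used is `mlx_whisper.transcribe` from the `mlx-whisper` package, and the default repo id `mlx-community/whisper-tiny` should be checked against that package's documentation.

```python
from pathlib import Path
from typing import Callable

# Assumption: which extensions count as audio. The real tool may differ.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac"}


def find_audio_files(folder: Path) -> list[Path]:
    """Return audio files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def mlx_transcribe(path: Path, model: str = "mlx-community/whisper-tiny") -> str:
    """Transcribe one file locally with mlx-whisper (requires Apple Silicon)."""
    # Imported lazily so the directory logic can be exercised without MLX.
    import mlx_whisper

    result = mlx_whisper.transcribe(str(path), path_or_hf_repo=model)
    return result["text"]


def transcribe_folder(
    folder: Path,
    out_dir: Path,
    transcribe: Callable[[Path], str] = mlx_transcribe,
) -> list[Path]:
    """Write one .txt transcript per audio file and return the written paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for audio in find_audio_files(folder):
        target = out_dir / (audio.stem + ".txt")
        target.write_text(transcribe(audio).strip() + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Passing the transcription function in as a parameter keeps the directory walk testable on any machine; on an M-series Mac, `transcribe_folder(Path("recordings"), Path("transcripts"))` would use the MLX-backed default (both folder names are placeholders).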

Key features

Fully on-device: MLX-Whisper runs Whisper inference on Apple Silicon's unified memory architecture via Metal. No network calls, no API keys, no cloud.
Batch processing: point it at a directory and it transcribes every audio file it finds, rather than handling files one at a time.
Whisper model family: uses the standard Whisper weights through MLX, so I pick the model size per job, trading speed against accuracy.
Daily-driver reliability: I run it regularly on real meeting recordings and podcast files; that everyday use is the test bed.
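One way the per-job model choice could look in code is a small preset table. This is a hedged sketch: the preset names (`fast`, `balanced`, `accurate`) and the `resolve_model` helper are hypothetical, and the Hugging Face repo ids are assumptions that should be verified before use. Smaller Whisper models generally run faster and larger ones are generally more accurate.

```python
# Hypothetical preset names mapped to converted-weight repo ids.
# Assumption: these repo ids exist on Hugging Face; verify before relying on them.
MODEL_PRESETS = {
    "fast": "mlx-community/whisper-tiny",
    "balanced": "mlx-community/whisper-small-mlx",
    "accurate": "mlx-community/whisper-large-v3-mlx",
}


def resolve_model(choice: str) -> str:
    """Map a preset name to a repo id; pass any other value through unchanged."""
    return MODEL_PRESETS.get(choice, choice)
```

The pass-through means an explicit repo id or local path still works, so the presets stay a convenience rather than a restriction. The resolved string is the kind of value `mlx_whisper.transcribe` accepts as `path_or_hf_repo`.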

Stack

Layer	Technology
Language	Python
ML framework	MLX (Apple)
Model	MLX-Whisper
Hardware	Apple Silicon (M-series, Metal)
