MLX Audio Transcriber

On-device ML, Apple Silicon native

stable

A personal batch audio transcriber that runs Whisper models natively on Apple Silicon through the MLX framework. Zero cloud dependency — audio never leaves the machine. Built because I wanted a local tool for meeting recordings, podcasts, and voice notes without sending files to an external API.

What it is

A command-line tool I use on my own machine to batch-transcribe audio files. It wraps MLX-Whisper, the port of OpenAI's Whisper to Apple's MLX framework, so inference runs on the M-series GPU through Metal instead of calling a hosted model. Input is a folder of audio files; output is text transcripts. No queue, no web UI, no multi-user support: just a local batch processor.
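As a rough illustration of the wrapping described above, here is a minimal sketch of a single-file call into MLX-Whisper. It is not the tool's actual code. It assumes the `mlx-whisper` package and its `mlx_whisper.transcribe(audio, path_or_hf_repo=...)` entry point, which returns a dict with a `"text"` key. The function name `transcribe_file`, the `backend` parameter, and the default checkpoint name are illustrative assumptions. When given a Hugging Face repo name rather than a local path, `mlx-whisper` fetches the weights on first use, so a fully offline run needs the weights already on disk.

```python
from pathlib import Path
from typing import Callable, Optional

# Assumed default checkpoint; a local directory of MLX-converted weights works too.
DEFAULT_MODEL = "mlx-community/whisper-tiny"


def transcribe_file(
    audio_path: Path,
    model: str = DEFAULT_MODEL,
    backend: Optional[Callable[..., dict]] = None,
) -> str:
    """Return the plain-text transcript of one audio file.

    `backend` is a hypothetical hook with the same call shape as
    mlx_whisper.transcribe, so the function can be exercised without MLX.
    """
    if backend is None:
        # Imported lazily: mlx_whisper only installs on Apple Silicon.
        import mlx_whisper

        backend = mlx_whisper.transcribe
    result = backend(str(audio_path), path_or_hf_repo=model)
    return result["text"].strip()
```

On an Apple Silicon machine with `mlx-whisper` installed, `transcribe_file(Path("memo.m4a"))` would run inference locally and return the text.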

This is not a product. It is the script I reach for when I have a recorded meeting, a podcast episode, or a stack of voice memos and want text out without uploading anything.

Key features

  • Fully on-device: MLX-Whisper runs Whisper inference on the Apple Silicon GPU via Metal, using the unified memory the GPU shares with the CPU. No API keys, no cloud, and no network calls for inference.
  • Batch processing: point it at a directory and it transcribes every audio file it finds, rather than one file at a time.
  • Whisper model family: uses the standard Whisper checkpoints, converted for MLX, so model size is a per-job tradeoff between speed and accuracy.
  • Daily-driver reliability: I run it regularly on real meeting recordings and podcast files; that everyday use is its test bed.
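The batch and model-size points above can be sketched as follows. This is an assumption-laden illustration, not the tool's real implementation: the extension list, the `MODEL_PRESETS` names, the checkpoint identifiers, the choice to write a `.txt` next to each audio file, and the skip-if-already-transcribed rule are all hypothetical. The `transcribe` argument is any callable that takes a path and a model name and returns text.

```python
from pathlib import Path
from typing import Callable, Iterator

# Assumed extension list; the real tool may accept a different set.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".wav", ".flac", ".ogg"}

# Hypothetical presets: smaller checkpoints are faster, larger ones more accurate.
MODEL_PRESETS = {
    "fast": "mlx-community/whisper-tiny",
    "accurate": "mlx-community/whisper-large-v3-mlx",
}


def find_audio(folder: Path) -> Iterator[Path]:
    """Yield audio files directly inside `folder`, in sorted order."""
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            yield path


def transcribe_folder(
    folder: Path,
    transcribe: Callable[[Path, str], str],
    preset: str = "fast",
) -> list[Path]:
    """Write <name>.txt beside each audio file; return the files written."""
    model = MODEL_PRESETS[preset]
    written = []
    for audio in find_audio(folder):
        out = audio.with_suffix(".txt")
        if out.exists():
            # Already transcribed on an earlier run, so skip it.
            continue
        out.write_text(transcribe(audio, model), encoding="utf-8")
        written.append(out)
    return written
```

With `mlx-whisper` installed, the callable could be a small wrapper such as `lambda p, m: mlx_whisper.transcribe(str(p), path_or_hf_repo=m)["text"]`. Passing it in as an argument keeps the directory walk testable on machines without MLX.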

Stack

Layer          Technology
Language       Python
ML framework   MLX (Apple)
Model          MLX-Whisper
Hardware       Apple Silicon (M-series, Metal)

  • Status: Personal tool, not published.