---
title: VocaLIFT
description: Voice-first iOS workout tracker
section: craft
tags: [project, full-stack]
genre: reference
stability: stable
lastUpdated: 2026-04-19
url: https://fardiniqbal.com/docs/craft/projects/vocalift
---


Voice-first iOS workout tracker. Tap mic, speak, see structured data.
Native iOS app that replaces gym logging with voice. Speak naturally
("plate and a 10 each side for 8") and WhisperKit transcribes on-device
while GPT-4o-mini structures the utterance into exercise, weight, and
reps.

In Development. iOS 17+, Swift 6, Xcode 26.3. Epics 1–7 shipped (voice
pipeline, offline queue, PR detection, home, paywall, profile). Private
repo.

## What it is [#what-it-is]

Replaces manual set entry in legacy lifting apps (Strong, Hevy) with a
voice loop tuned for between-set logging under load. On-device ASR means
zero per-transcription cost, zero audio leaves the phone, and no network
dependency for the hot path. LLM structuring runs server-side through a
Supabase Edge Function with a strict JSON schema and deterministic
prompting.

## By the numbers [#by-the-numbers]

| Metric                | Value                                                |
| --------------------- | ---------------------------------------------------- |
| Transcription cost    | $0 per set (on-device)                               |
| Structuring cost      | \~$0.005 per workout                                 |
| Target gross margin   | 95%+ on paid tier                                    |
| Auto-confirm window   | under 5 seconds                                      |
| Offline resilience    | 80%+                                                 |
| Edge Function timeout | 15 seconds                                           |
| Epics shipped         | 7 (voice, offline, PR, home, paywall, profile, sync) |
| Audio uploaded        | zero bytes (text transcript only)                    |

## Architecture [#architecture]

```
[Mic tap]
   |
   v
AVAudioEngine --> AudioBufferProcessor --> WhisperKit (on-device)
                                               |
                                               v transcript
                                   VoiceTranscriptionService
                                               |
                                               v
                                VoiceLogPipelineManager
                                               |
                   +---------------------------+--------------------+
                   v                           v                    v
        ExerciseHistoryService       EdgeFunctionClient      AutoConfirmEngine
        ("last time" context)              |                        |
                                           v                        v
                        Supabase Edge Fn: structure-workout       commit
                        (GPT-4o-mini + JSON schema)                 |
                                           |                        v
                                           v                    SwiftData
                                    StructureResponse               |
                                                                    v
                                                      PRDetectionService
                                                                    |
                                                                    v
                                              SyncService / OfflineQueueProcessor
                                                                    |
                                                                    v
                                              Supabase Postgres (reconciliation)
```

### Data flow guarantees [#data-flow-guarantees]

* Local write always precedes remote write. Failed sync enqueues; queue
  replays in submission order.
* Edge Function returns `{ error, retryable, message }` with explicit
  retryability. Client distinguishes transient from terminal failure.
* Response schema is versioned (`CURRENT_SCHEMA_VERSION`) so the client
  can refuse unknown shapes.
* Idempotency via client-generated IDs carried through sync DTOs.

## Key features [#key-features]

* **On-device ASR** — WhisperKit (argmaxinc/WhisperKit 0.9+) runs Whisper
  models natively on the Neural Engine. No audio uploaded or stored.
* **GPT-4o-mini structuring** — Supabase Edge Function
  (`supabase/functions/structure-workout`) wraps OpenAI with a typed
  schema, 15s timeout, and versioned response contract.
* **Auto-confirm ring under 5s** — `AutoConfirmEngine` commits a
  structured set if the user doesn't intervene inside the window.
* **AI plate math** — Model resolves gym vernacular ("plate and a 10 each
  side") to barbell loading with correct total weight.
* **Exercise-specific "last time" context** — `ExerciseHistoryService`
  feeds the previous session's reps/weight for the current exercise into
  the prompt and UI for carry-forward prompting.
* **PR detection with shareable graphics** — `PRDetectionService` computes
  1RM and volume PRs on commit; `ShareImageRenderer` produces social-ready
  cards.
* **Progressive overload tracking** — Per-exercise charts, PR accordion,
  progression history.
* **Offline queue** — `OfflineQueueProcessor` + `SyncService` maintain an
  ordered, idempotent sync queue with conflict-safe replay when network
  returns.
* **SwiftData local-first persistence** — `Workout`, `Exercise`,
  `ExerciseSet` models are the source of truth; server sync is
  reconciliation, not dependency.
* **Audio interruption handling** — Music ducking, call/interruption
  recovery, graceful mic-session teardown.
* **RevenueCat paywall + A/B testing** — Free trial, subscription
  management, restoration, remote offering variants.
* **Design system** — Explicit tokens (color/typography/spacing/animation),
  reduced-motion aware modifiers, haptic engine.
* **Test coverage** — Unit, integration, and E2E targets covering voice
  pipeline, auth sync, offline queue, PR detection, inline editing, error
  recovery, share flow.

## What makes it stand out [#what-makes-it-stand-out]

* **Zero audio ever leaves the device.** On-device Whisper on the Neural
  Engine means microphone input stays local; only the text transcript
  crosses the wire.
* **$0 transcription cost, \~$0.005 structuring cost per workout** — a
  95%+ gross margin target without compromising privacy.
* **Local-first, not cloud-dependent** — SwiftData is the source of
  truth. Sync reconciles; it doesn't gate logging.
* **Versioned Edge Function contract** — the client refuses unknown
  response shapes, so server rollouts can't silently corrupt local state.

## Stack [#stack]

| Layer         | Technology                                           |
| ------------- | ---------------------------------------------------- |
| UI            | SwiftUI, Swift 6, iOS 17+                            |
| Persistence   | SwiftData (local-first)                              |
| ASR           | WhisperKit (on-device, Neural Engine)                |
| LLM           | GPT-4o-mini via Supabase Edge Function (Deno)        |
| Backend       | Supabase (Postgres, Auth, Edge Functions)            |
| Monetization  | RevenueCat (paywall, A/B, restoration)               |
| Analytics     | PostHog                                              |
| Audio         | AVFoundation, AVAudioEngine, ObjC exception bridging |
| Lint / format | SwiftLint (pre-build phase)                          |
| Project gen   | XcodeGen (`project.yml`)                             |
