Terminal voice-to-text. Tap Space, speak, tap Space — your words appear in the transcript and on the clipboard before you reach for the mouse.
Qwen3-ASR-1.7B runs in-process on the Apple GPU via MLX — int8, ~2.5 GB resident, warm on every take. Fully local: no cloud, no runtime network calls.
$ uv tool install automaton-tnt
No menus, no modes to learn. The model is already loaded and waiting — recording starts the instant you press the key.
Press Space to start. Hold it instead to record only while held — a live braille oscilloscope shows real mic levels as you talk.
Press Space again to stop. Audio goes straight to the resident GPU model — a short take returns in a fraction of a second.
The transcript appears in the log and is auto-copied. Paste anywhere. Click any past entry to copy it again. Press Space mid-transcribe to cancel.
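As a rough illustration of how mic levels can be drawn with braille characters, the sketch below packs two level samples into each character of the Unicode Braille Patterns block. It is not TNT's actual rendering code: the function names, the 0.0 to 1.0 level scale, and the one-row bar style are all assumptions made for the example.

```python
# Hypothetical sketch: names and scaling are illustrative, not TNT's own code.
BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block

# Dot bits for each column of a cell, ordered bottom to top
# (dots 7, 3, 2, 1 on the left and dots 8, 6, 5, 4 on the right).
LEFT_BITS = (0x40, 0x04, 0x02, 0x01)
RIGHT_BITS = (0x80, 0x20, 0x10, 0x08)


def level_to_height(level: float) -> int:
    """Map a level in [0.0, 1.0] to a bar height of 0..4 dots."""
    clamped = min(1.0, max(0.0, level))
    return round(clamped * 4)


def braille_cell(left: float, right: float) -> str:
    """Pack two adjacent levels into one braille character."""
    bits = 0
    for bit in LEFT_BITS[: level_to_height(left)]:
        bits |= bit
    for bit in RIGHT_BITS[: level_to_height(right)]:
        bits |= bit
    return chr(BRAILLE_BASE + bits)


def render_levels(levels: list[float]) -> str:
    """Render a sequence of levels as a one-row braille bar graph."""
    if len(levels) % 2:
        levels = [*levels, 0.0]  # pad so every cell has two columns
    return "".join(
        braille_cell(levels[i], levels[i + 1]) for i in range(0, len(levels), 2)
    )


if __name__ == "__main__":
    print(render_levels([0.1, 0.3, 0.6, 1.0, 0.7, 0.4, 0.2, 0.0]))
```

Because each character carries two columns of four dots, a row of N characters can show 2N samples, which is why braille suits a dense level display in a terminal.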
Pure MLX inference on the Apple GPU, a microphone that can always be reclaimed, and a TUI that reshapes itself to your terminal.
Pure MLX: no PyTorch, no CUDA, no subprocess for the model. The int8 weights (~2.5 GB, about half the size of BF16) decode faster and are loaded once during a background warmup, so every take runs against a warm model.
Real audio levels render as a braille waveform while you record, so you always know the mic is hearing you.
Everything runs on your machine. No cloud round-trips, no telemetry, nothing leaves the laptop.
Native AVFoundation capture in an isolated Swift helper process. A wedged audio stack? TNT kills the helper and macOS releases the mic.
The language is auto-detected, or you can force it with an environment variable to keep mixed zh/en speech from being translated away.
New transcriptions auto-copy; click any past entry to copy it again. The layout uses a side-rail on wide terminals and stacks on narrow ones — it fits whatever window you've got.
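The side-rail versus stacked choice described above comes down to a decision on terminal width. The sketch below shows one way to make it. The 100-column breakpoint and the function names are assumptions, since the text does not state the threshold TNT uses.

```python
import shutil

# Assumed breakpoint: the threshold TNT actually uses is not stated in the text.
WIDE_MIN_COLUMNS = 100


def choose_layout(columns: int, wide_min: int = WIDE_MIN_COLUMNS) -> str:
    """Return "side-rail" for wide terminals and "stacked" for narrow ones."""
    return "side-rail" if columns >= wide_min else "stacked"


def current_layout() -> str:
    """Pick a layout for the terminal this process is attached to."""
    # The fallback size is used when no terminal is attached (e.g. in a pipe).
    columns = shutil.get_terminal_size(fallback=(80, 24)).columns
    return choose_layout(columns)


if __name__ == "__main__":
    print(current_layout())
```

Keeping the decision in a function of the column count makes it easy to re-run on every resize event and to test without a real terminal.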
Apple Silicon, Python 3.13+, and uv. The bootstrap script pulls the int8 checkpoint and links it; first launch compiles the tiny Swift mic helper and caches it.
$ xcode-select --install
$ git clone https://github.com/appautomaton/tnt-asr.git
$ cd tnt-asr
$ uv sync
$ ./bootstrap-mlx-asr.sh # downloads + links the int8 checkpoint (~2.5 GB)
$ uv run tnt
$ uv tool install automaton-tnt
$ TNT_MLX_MODEL=/path/to/qwen3-asr-1.7b-int8-mlx tnt
# or symlink the checkpoint instead of setting the env var:
# ~/.local/share/tnt/qwen3-asr-mlx
A ready-to-use int8 build is published at appautomaton/qwen3-asr-1.7b-int8-mlx. BF16 and mxfp8 builds work too — mlx-speech reads the quant from the checkpoint config, so switching is just a relink.
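The two ways of pointing TNT at a checkpoint, the `TNT_MLX_MODEL` variable and the `~/.local/share/tnt/qwen3-asr-mlx` symlink, can be pictured as a small lookup. The sketch below is an assumption about how such a lookup might be ordered (environment variable first, then the symlink). The function name is hypothetical and this is not TNT's actual resolution code.

```python
import os
from pathlib import Path

ENV_VAR = "TNT_MLX_MODEL"
DEFAULT_LINK = Path("~/.local/share/tnt/qwen3-asr-mlx")


def resolve_model_dir(env=None, default_link=DEFAULT_LINK):
    """Return the checkpoint directory, or None if nothing is configured.

    Assumed precedence: the environment variable wins over the symlink.
    """
    env = os.environ if env is None else env
    override = env.get(ENV_VAR)
    if override:
        return Path(override).expanduser()
    link = default_link.expanduser()
    if link.exists():
        # resolve() follows the symlink, so relinking to another build
        # (int8, BF16, mxfp8) changes the result without any code change.
        return link.resolve()
    return None


if __name__ == "__main__":
    print(resolve_model_dir())
```

Under this reading, switching builds means either re-exporting the variable or repointing the symlink at a different checkpoint directory.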
Inference runs in-process on your Apple GPU. There are no network calls at runtime — not for the model, not for analytics. What you say stays on your machine.