App Automaton is an open-source workshop for engineering with coding agents: Claude Code, Codex, Gemini, and OpenCode. It ships portable SKILLs, stage-gated runtimes, and a quiet streak of pure-MLX work for Apple Silicon.
Every App Automaton skill has the same shape: YAML frontmatter the agent always sees, a workflow body loaded on invocation, and a sibling tree of references and scripts pulled only as the path narrows.
Everything public in the org sorts onto four shelves: skills that teach agents workflows, harnesses that keep them honest, models that run on the laptop's own silicon, and a creative harness that points the same method at club music.
Portable SKILL.md packs built in three layers: frontmatter the agent always sees, a workflow body it loads on invocation, and references it pulls on demand. The same folder works under Claude Code, Codex, Gemini, and OpenCode.
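The first two layers can be pictured as a split of one file. The sketch below is a minimal illustration, not the project's own loader: it assumes the frontmatter is fenced by `---` lines and holds flat `key: value` pairs, and the skill name, description, and `references/guide.md` path in the sample are hypothetical. A real loader would likely use a full YAML parser.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    frontmatter: dict[str, str]  # layer 1: what the agent always sees
    body: str                    # layer 2: loaded only on invocation


def split_skill(text: str) -> Skill:
    """Split SKILL.md text into a frontmatter mapping and a workflow body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        # No opening fence: treat the whole file as body.
        return Skill(frontmatter={}, body=text)
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("frontmatter opened with '---' but never closed")
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        # Assumption: flat "key: value" pairs only, no nested YAML.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return Skill(frontmatter=meta, body=body)


# Hypothetical sample file, used only for this sketch.
EXAMPLE = """\
---
name: example-skill
description: Hypothetical skill used only for this sketch.
---
# Workflow

1. Read references/guide.md only if the task needs it.
"""

skill = split_skill(EXAMPLE)
print(skill.frontmatter["name"])  # example-skill
```

The third layer never appears in this function: the body only names a reference file, and the agent opens it if the task gets that far.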
The skills workspace. Issue-driven workflows, an MCP tool catalog, and bridge skills that let one agent delegate to another with session continuity.
document-SKILLs: Skills for docx, xlsx, pptx, and pdf. Extraction, forms, formulas, and tracked changes, adapted from Anthropic's official set. Runs on uv with PEP 723 metadata, no virtualenv. (Python)
presentation: Consulting-quality decks, from strategy storyboarding and brand identity through to pixel-perfect PDF or fully editable PPTX. (JS)
webmaton: Web research the way a researcher reads it, with grounded citations, deterministic HTML to Markdown, and persistent sessions over Playwright, nodriver, and Chrome DevTools. (Python)
playwright-skill: Browser automation that also runs on Android, with a Termux launcher patch and headless Chromium. Installs as a git subtree. (JS)
The machinery around the skills: stage-gated workflows, multi-agent orchestration in real terminals, and the small sharp tools agents call along the way.
A stage-gated harness: frame → plan → review → execute → verify → resume. It installs into any project as plain markdown and keeps durable state under .agent/.
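One way to picture a stage gate with durable state is a small state machine that writes its position to disk. This is a sketch under stated assumptions, not the harness itself, which the text says is plain markdown: the `state.json` file name and the `gate_passed` flag are hypothetical, and "resume" is modeled here as re-reading the saved stage rather than as a stage of its own.

```python
import json
import tempfile
from pathlib import Path

# Stage order taken from the description above; "resume" is handled by load_stage.
STAGES = ["frame", "plan", "review", "execute", "verify"]


def state_path(project: Path) -> Path:
    # Hypothetical file name; the source only says state lives under .agent/.
    return project / ".agent" / "state.json"


def load_stage(project: Path) -> str:
    """Resume: return the recorded stage, or the first stage for a fresh project."""
    path = state_path(project)
    if not path.exists():
        return STAGES[0]
    return json.loads(path.read_text())["stage"]


def advance(project: Path, gate_passed: bool) -> str:
    """Move to the next stage only if the current stage's gate passed."""
    current = load_stage(project)
    if gate_passed:
        idx = STAGES.index(current)
        nxt = STAGES[min(idx + 1, len(STAGES) - 1)]  # stay on the last stage
    else:
        nxt = current  # a failed gate holds the workflow in place
    path = state_path(project)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"stage": nxt}))
    return nxt


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    advance(project, gate_passed=True)    # frame -> plan
    advance(project, gate_passed=False)   # gate failed: still at plan
    print(load_stage(project))            # plan
```

Because the stage lives in a file rather than in the agent's context, a new session can pick up where the last one stopped.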
Multi-agent orchestration in tmux or kitty. Coders, planners, and reviewers work in parallel git worktrees and coordinate through files, never by scraping the terminal. (TS)
openclaw-monorepo: A repo-local OpenClaw workspace with modular JSON5 config, plugins, and Docker sandboxes. It runs coding-agent CLIs on the desktop or in Android Termux. (JS)
markmaton: HTML to Markdown for agent pipelines, built as a fast Go engine wrapped in a Python CLI and API and published on PyPI. (Go)
Models that run on the Apple GPU with no cloud and no PyTorch. Speech, vision, video, and 3D, all MLX-native, all on the laptop.
Speech synthesis, voice cloning, dialogue, sound effects, and recognition, with models from Fish S2 Pro and VibeVoice to LongCat, MOSS, and Step-Audio. (Python)
tnt-asr: A terminal voice-to-text TUI. Qwen3-ASR runs on the Apple GPU and transcribes in about a second, fully local. (Python)
ltx-video-mlx: Text- and image-to-video with synchronized audio, built on LTX-2.3 22B. It runs 8-bit and 4-bit inference and fine-tunes LoRAs on device. (Python)
mlx-cv: MLX-native computer vision, covering object detection, segmentation, and open-vocabulary grounding with SAM 3 and LocateAnything. (Python)
mlx-spatial: 3D and spatial inference on device. SAM 3D Objects, TRELLIS.2, WorldMirror, and MapAnything turn images into meshes, Gaussian splats, and point clouds. (Python)
The same stage-gated, local-first method pointed at a creative domain. Agents prepare the material (specs, MIDI, stems, renders), and a human listening gate keeps only what earns its place.
Clone, symlink, point an agent at it. The shape stays the same under Claude Code, Codex, Gemini, and OpenCode.
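The symlink step might look like the sketch below. It assumes each agent reads skills from a directory of its own; the text does not say where those directories are, so `agent-a/skills` and `agent-b/skills` are placeholders, as are the `link_skill` helper and the sample skill.

```python
import tempfile
from pathlib import Path


def link_skill(skill_dir: Path, agent_skills_dir: Path) -> Path:
    """Symlink one cloned skill folder into an agent's skills directory."""
    agent_skills_dir.mkdir(parents=True, exist_ok=True)
    link = agent_skills_dir / skill_dir.name
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier install
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.symlink_to(skill_dir.resolve(), target_is_directory=True)
    return link


def demo(root: Path) -> list[Path]:
    # Stand-in for a cloned repository holding one skill folder.
    skill = root / "clone" / "example-skill"
    skill.mkdir(parents=True)
    (skill / "SKILL.md").write_text("---\nname: example-skill\n---\n")
    # Placeholder agent directories; real locations depend on each tool.
    return [
        link_skill(skill, root / agent / "skills")
        for agent in ("agent-a", "agent-b")
    ]


with tempfile.TemporaryDirectory() as tmp:
    for link in demo(Path(tmp)):
        print(link.name, (link / "SKILL.md").exists())
```

Both links resolve to the same folder, so an update to the clone reaches every agent without copying.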