FLAGSHIP · May–Jun 2026

demo-agent: any web app, demoed automatically

Point an agent at any live web app; get a narrated video, a guided tour, or an offline clickable sandbox.

  • 3 output modes
  • ~10 min for ~40 screens
  • $0.05–$0.15 per deep sandbox
  • 37 + 20 safety rules

Storylane and Navattic make you author a product demo screen by screen. I built an agent that points at a live web-app URL and produces three things from one crawl: a narrated MP4 with Gemini TTS and burned-in subtitles, a click-through guided tour, and a free-roam offline sandbox made of the app's real captured DOM. It installs as a Claude Code, Codex, or Cursor skill.

The hard part is clicking real buttons on someone's production SaaS without deleting data or losing the session. My coverage engine stamps a stable id on every clickable, then relocates each target after a fresh page load by position and identity, and fails closed on any mismatch. 37 destructive-label patterns and 20 auth-URL keys are never clicked; icon-only elements with no accessible name are treated as uncertifiable and skipped.
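The relocate-then-fail-closed step can be illustrated with a minimal Python sketch. It is not the project's code: the `Target` fields, the choice of tag plus accessible name as "identity", and the 8-pixel tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    """Snapshot of one clickable element (all fields are illustrative)."""
    stamp_id: str  # id stamped on the element when the snapshot was taken
    tag: str
    name: str      # accessible name
    x: int
    y: int


def relocate(recorded: Target, fresh: list[Target], tolerance: int = 8) -> Optional[Target]:
    """Find the recorded target in a fresh snapshot, or return None (fail closed)."""
    candidates = [
        t for t in fresh
        if t.tag == recorded.tag                    # identity: same element type
        and t.name == recorded.name                 # identity: same accessible name
        and abs(t.x - recorded.x) <= tolerance      # position: roughly the same place
        and abs(t.y - recorded.y) <= tolerance
    ]
    # Require exactly one match; zero or several means the click is not safe.
    return candidates[0] if len(candidates) == 1 else None
```

Returning `None` on ambiguity, rather than picking the closest candidate, is what makes the check fail closed: a caller that gets `None` skips the click instead of guessing.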

I wrote it solo: 30 commits in 21 days, about 5,100 lines of Python and 550 lines of browser JavaScript across 14 modules.

The hard part

Clicking production buttons without breaking anything

To capture a real app you have to click real buttons, but one wrong click can delete a record or log you out. I built a coverage engine that stamps a stable id on every clickable element, then re-locates each target after a page reload by both position and identity and fails closed on any mismatch. A never-click policy of 37 destructive-label patterns (delete, revoke, deactivate…) and 20 auth-URL keys is matched across the path, query, and fragment, and any element without an accessible name is skipped as uncertifiable.

Highlights

  • Built 3 output modes from one crawl: a narrated MP4 with Gemini TTS and burned-in subtitles, a click-through tour, and an offline sandbox of real captured DOM.
  • Engineered a never-click safety policy of 37 destructive-label patterns and 20 auth-URL keys matched across path, query, and fragment; pages with password fields are never captured.
  • Wrote a 273-line DOM serializer that freezes a hydrated SPA into a self-contained offline HTML file, inlining CSSOM-only rules and stripping hidden and password values so no session tokens land in shared bundles.
  • Killed audio clicks at every slide boundary by re-rendering the video as one ffmpeg concat-filter pass instead of per-slide AAC segments joined with -c copy.
  • Crawled ~40 screens in ~10 minutes at $0.05–$0.15 per deep sandbox using URL-template clustering and layered wall-clock, screen-count, and per-page click budgets.
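The single-pass render mentioned in the highlights can be sketched as a small command builder. This is an illustration under assumptions, not the project's renderer: it supposes each slide is a file with one video and one audio stream of matching parameters, and the encoder choices (`libx264`, `aac`) and the function name are placeholders.

```python
def build_concat_command(slides: list[str], output: str) -> list[str]:
    """Build an ffmpeg argv that decodes every slide and re-encodes once."""
    if not slides:
        raise ValueError("need at least one slide")
    cmd = ["ffmpeg", "-y"]
    for path in slides:
        cmd += ["-i", path]
    # Feed each input's video and audio pads into one concat filter:
    # [0:v][0:a][1:v][1:a]...concat=n=N:v=1:a=1[v][a]
    pads = "".join(f"[{i}:v][{i}:a]" for i in range(len(slides)))
    graph = f"{pads}concat=n={len(slides)}:v=1:a=1[v][a]"
    cmd += [
        "-filter_complex", graph,
        "-map", "[v]", "-map", "[a]",
        "-c:v", "libx264", "-c:a", "aac",  # one encode for the whole timeline
        output,
    ]
    return cmd
```

Because the audio is encoded once as a continuous stream, there are no per-segment AAC boundaries left to join, which is the likely source of the clicks when segments are stitched with `-c copy`. The returned list can be passed to `subprocess.run(cmd, check=True)`.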

Stack

Python 3.12 asyncio, Playwright, Stagehand, Browserbase, Gemini 2.5, Pydantic, ffmpeg, Vanilla JS