"A year ago I was wrong about AI. This talk is the honest version of what I learned." Confident, not hyped. Pause after "honest version." Skip the agenda — open with the tension. → Sets up promise vs reality. ═══ 45-MIN BUDGET ═══ Open+thesis(1-4) 4 · Ladder+poll(5) 3 · L0-L2(6-9) 5 · L3+local files(10-16) 8 3 workflows(17-19) 4 · L4+Sauron(20-22) 7 · game(21) 2 · L5(23-26) 5 skip+takehome+close(27-28) 2 · Q&A 5 = 45 min OFFLINE-SAFE: every demo runs from LOCAL files. No internet required to hit 45. - .claude/ files (13-16): your own repo, fully offline. - Game + Konami (21): static HTML, pre-loaded tab, runs offline. - Sauron (22): play a PRE-RECORDED screen-capture of a Telegram exchange. Internet only UPGRADES Sauron to live — never gates the talk. If wifi is solid AND you want it, go live; otherwise the recording carries the same beat. Record all three the night before. The 35-min read-through is the hard floor if all media fails.
"We all felt the rush." [pause] "Most of us also felt the hangover." Land the final line slowly — it's the thesis of the whole talk. → Let the next slide breathe. It's a quote. Don't rush into it.
Read it once. Slowly. [long pause] Don't explain. Don't add context. Let it sit. → Slide 4 is your personal story. Start: "I was that engineer."
"I was there — skeptic, resistant. A decade of TDD and clean code." My fear: AI ships impressive code that decays fast. Land: "I stopped racing AI and started directing it." [pause] This is the emotional core. Vulnerability = trust from the audience. → "Here's the framework I built from that journey."
"Before we go deeper — show of hands." "Who's at L0? L1? L2?" Pause after each. Count. React genuinely. "Every slide from here is one step up this ladder. This is the map." ~2 min here. Worth it — engagement anchors the whole rest of the talk. → Start at the bottom.
"Some of you are here right now. Being honest about it is step one." Land: "The risk isn't the technology. It's the time." [pause] "Every month at L0, competitors compound their advantage." → "L1 is where most of you already live."
"Same task. Same prompt. Two engineers. Two wildly different outputs." "Why? Know-how trapped in each head." Stack Overflow analogy lands with this audience — they lived that era. → "There's a floor at L1 that never changes, no matter how high you climb."
"This floor stays no matter how high you climb." Land: "You own every line. Can't blame Claude at 2am." [pause for laugh] "Never accept code you don't understand. Can't explain it = time bomb." → "When your team shares that discipline — that's L2."
"L1 to L2 is cultural, not technical. The hardest step on the ladder." "Write down how your team uses AI. Commit it. Review it monthly." "First level where AI is a team skill, not a personal habit." → "But shared practices still have no memory. That's where context comes in."
"Stop chasing models. Start engineering context." Land: "Weak model + great context beats frontier model with none." [pause] "The intelligence was always there. Context gives it hands." → "First tool that gives the agent hands: MCP."
"MCP — Model Context Protocol — gives the agent hands: filesystem, github, postgres, your tools. It turns it from a conversation partner into an active participant." "One folder. Six layers. Committed to git. Everyone who clones inherits the full setup." Walk the tree, ~10s per item: - CLAUDE.md = the onboarding doc. - settings.json = permissions, hooks, env. - skills/ = runnable procedures. - rules/ = glob-targeted conventions. - hooks/ = shell scripts that fire on events. - agents/ = specialized roles. → "But that folder is Claude-specific. What happens when you switch tools?"
"You just saw the .claude/ folder. But that's one vendor's format." "I wrote agnostic-ai for exactly this: define agents, skills, rules, hooks once — sync to 14+ tools in their native format." Land: "Bet on the context, not the vendor. The tool ships next year. Your context shouldn't have to be rewritten." [OFFLINE-SAFE: WASM playground runs in-browser, no internet. Pre-load a tab if you want to demo a sync.] → "Now — the most important file in that folder."
"A good CLAUDE.md is a good onboarding doc." "The better it is, the less you repeat yourself to the agent." "Every byte ships in every prompt — keep it short. Past one screen, move to rules/ or skills/." [DEMO ~2 min, spans 13-16 · OFFLINE-SAFE] Open your editor. Show the REAL files: your project CLAUDE.md, one skill.md, and a hook firing on save. All local, no internet. Concrete beats the diagram. Pre-open the files before the talk. → "But some rules shouldn't be suggestions. That's what hooks are for."
"Rules: what the agent should know. Hooks: what the system enforces regardless." "Permissions: lines the agent can't cross, even when asked politely." Land: "Safety before leverage. That order matters." → "Skills are where your competitive edge actually lives."
[pause after the accountant line. Let it land.] "Intelligence is not expertise. Skills close that gap. A skill is a markdown file. Keeping 20 of them costs nothing until one fits." Land: "The agent ships next year. Your skills ship forever. They encode your domain, your conventions, your architecture." "Start with the second repeated prompt. That's a skill waiting to be written." → "Now let's see this in action: three concrete workflows."
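The "second repeated prompt" rule can be made mechanical. The sketch below assumes you keep a plain list of past prompts and that "repeated" means an exact match after trimming and lowercasing; both are simplifications, and `skill_candidates` is a hypothetical name.

```python
from collections import Counter


def skill_candidates(prompts: list[str]) -> list[str]:
    """Return normalized prompts seen at least twice, in first-seen order."""
    normalized = [p.strip().lower() for p in prompts]
    counts = Counter(normalized)
    candidates: list[str] = []
    for key in normalized:
        if counts[key] >= 2 and key not in candidates:
            candidates.append(key)
    return candidates
```

Anything this returns is a prompt you have typed at least twice, which is the trigger the slide names for writing a skill.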
"AI defaults to adding. It won't improve code unless you explicitly ask." Step 1: "Make it work first. Don't refactor in the same prompt." Step 2: ask out loud — "simplify this", "remove the boilerplate", "any SOLID violations?" Step 3: run /refactor-check — a clean-code-reviewer agent reads the diff. Land: "You name the target shape. The co-pilot drives the mechanical move." → "Workflow 2: where AI usually makes tests worse — and how to fix that."
"Say 'test behavior' out loud." [pause] "That changes everything." Step 1: ask for behavior, not implementation. Name the outcome, not the method. Step 2: a TDD-coach agent runs red → green → refactor in sequence. Step 3: a hook blocks the commit unless the suite is green and coverage holds. Land: "TDD used to compete with deadlines. With a hook enforcing coverage, it doesn't have to." → "The hardest workflow: architecture."
"The agent sees local context. You hold global." Step 1: plan mode maps dependencies and proposes an approach before touching anything. Step 2: you approve / modify / reject the plan. First decision you make, not the last. Step 3: a domain-architect agent owns boundaries; a pair-buddy agent challenges before you commit. Land: "Architecture is human judgment. The co-pilot executes the plan you approve." → "What if these ran as standing roles? That's L4."
"The 3 workflows become standing roles. Org chart unchanged — each person produces a lot more." Land: "Humans stop racing AI on speed and start directing it." → "Two shapes of multi-agent work — pick the right one."
"Start with subagents — simpler, cheaper, easier to debug." "Upgrade to teams when tasks need to run in parallel and share findings." "Teams cost ~3x tokens. Make the parallelism earn it." → "My own team runs 24/7. Let me show you."
[tone shift — personal, proud] "A coding agent lives in one repo for one task. Sauron lives in my life." "Sauron isn't a tool I open. It's a place I work." "Runs on my own hardware via OpenClaw. Always on. No tab to open." "I talk to him over Telegram. He reviews PRs, opens issues, ships OSS." "He pushes back when I tunnel — that's the point." Mention sauronbot.github.io. [DEMO ~4 min · OFFLINE-SAFE] Default: play a PRE-RECORDED screen capture of a Telegram exchange — a real pushback, then Sauron opens an issue/PR. Plays from local file, no wifi. OPTIONAL live upgrade: only if wifi is solid AND you tested it before the talk. The recording lands the same beat; live is a bonus, not a dependency. → "The most memorable thing we built together."
[let the demo/gif play first, then talk] "I brought the vision. Sauron brought the craft — rendering, physics, Web Audio, mobile input." "5k lines. Two days. I wrote zero code — but every creative decision was mine." Land: "Neither could have made it alone. That's what a co-pilot looks like at its best." [DEMO ~2 min · OFFLINE-SAFE] Static HTML game — pre-load in a browser tab BEFORE the talk so it runs with zero internet. Clear a level, trigger the Konami easter egg. If the tab somehow fails: play a local screen-recording instead. Either way, 2 min. → "L5 is where this changes the structure, not just the output."
"L4 multiplies output inside the existing structure." [pause] "L5 changes the structure itself." Spell out the shift: tickets written so an agent can act on them, reviews that assume part was machine-written, architecture that accounts for what agents do well. "A senior IC starts to look more like a tech lead — leading people and agents." "Few companies fully here in 2026 — but ignoring the direction is its own decision." → "L5 creates a new bottleneck. Us."
[slow down here — this one lands differently] "We spent decades making machines faster." [pause] "Now we're the slowest part of the system." Don't rush. Let it sit. Land: "You can add more agents. You can't add more of yourself." → "So how do you stay in control without being in the way?"
"Pilot and autopilot. Plane flies itself. Pilot watches, steps in when something looks wrong." Say Ship / Show / Ask explicitly — these are the decision rule. Point to each card. "Beware automation complacency. Read diffs you don't have to. Stay sharp." → "But some things still require full eyes — on purpose."
"Human-on-the-loop doesn't mean rubber stamp." Walk the list: security (auth, keys, permissions), irreversible (migrations, deletions, money, messages to users), a new codebase (still learning the territory), regulated systems. Land: "The AI approved it isn't an answer for an auditor." Everything else: use a pair-buddy agent as a thinking partner, not a replacement for review. → "One last trap to avoid before we wrap."
"The most common mistake: team jumps L1 to L4." "Agents produce mountains of low-quality code. Nobody agreed what quality means." Land: "The agents aren't the problem. The missing foundation is." [pause] Order is the lesson: shared practices → context → teams → AI-native. → "So what do you do Monday morning?"
"Monday morning. Here's exactly what you do." Walk the list in order — each item is one action, not a concept. 1. Locate yourself on the ladder. Pick the next step, not the top. 2. Write down how your team uses AI (L2). Commit it. 3. Engineer context before chasing models: CLAUDE.md, rules, skills. 4. Extract a skill on the second repeated prompt. 5. Refactor: make it work, then tell AI to simplify. 6. Test: ask for behavior, not implementation; let a TDD loop + hook hold the line. 7. Architect: plan mode first. Think, approve, then code. 8. Review where it matters. Let agents handle the rest. "Pick one. Just one. The next step on your ladder, not the top." Re-poll the room (callback to s5): "Show of hands — who'll be one step higher by Monday?" Leave this slide up during Q&A. [Q&A ~5 min]
[slow, weight, meaning] Read the quote once. [long pause] "The question won't be 'did you use AI?'" [pause] "It'll be 'at what level, and with what direction?'" Thank the room. Let it land before you say another word.