
Skills Over Agents
People compare coding agents. Claude Code, Codex, Gemini CLI. Which one is smarter, faster, cheaper. New benchmarks every month.
Wrong question.
After a year wiring agents into real projects, what moved the needle wasn’t the agent. It was the skills I wrote for it.
Agents are commodities
Every coding agent has the same shape. A language model, a runtime, filesystem access. Read, reason, write. Generalist by design.
Two teams use the same agent. One ships clean, tested code. The other ships garbage that looks good. Same model. Different teaching.
The model is the engine. Skills are the map. Without a map, a powerful engine gets you lost sooner.
Intelligence is not expertise
Who handles your taxes? A 300 IQ genius who never read tax law, or an accountant with 20 years of filings?
An accountant knows which deductions apply, which filings your business needs, which mistakes get flagged. Not intelligence. Expertise.
AI agents have the same gap. A model reasons about code and writes solutions. It doesn’t know your hexagonal layers. It doesn’t know domain entities must never import framework code. It doesn’t know every feature starts with a failing test.
Skills close that gap.
Skills load context on demand
A skill is a markdown file, SKILL.md, in its own folder under .claude/skills/. A procedure, a pattern, a slice of domain knowledge. Plain markdown with frontmatter.
The key is how they load. The agent reads only names and descriptions at startup. Loads the full skill when the task matches. Follows links to references only when it needs to dig deeper.
That on-demand loading is what makes skills scale. Twenty skills cost almost nothing until one fits the task. Specialized agents, by contrast, carry their full instructions every time they run. More agents, more fixed cost.
Deep Dive: A real-world skill example
.claude/skills/
  code-review/
    SKILL.md                 # main instructions
    reference/
      solid-checklist.md     # detailed SOLID examples
      test-patterns.md       # test quality guidelines
SKILL.md:
---
description: "Review code changes for SOLID violations, test quality, and architecture alignment"
allowed-tools: Read, Grep, Glob
argument-hint: "[file or PR]"
---
# Code Review
Review code changes against project conventions.
## Steps
1. Read the diff or specified files
2. Check architecture: domain layer has no framework imports, infrastructure stays thin
3. Check SOLID principles (see reference/solid-checklist.md for patterns)
4. Check test quality: tests verify behavior, not implementation details
5. Flag issues with the specific principle violated and a suggested fix
## Output
For each issue found:
- File and line
- What's wrong (which principle or convention)
- What the fix looks like
The agent sees the description in the skill list. Ask for a review, it loads SKILL.md. Needs a SOLID pattern, reads the reference. Two levels, on demand.
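To make the loading order concrete, here is a minimal Python sketch of the idea: index descriptions at startup, read the full SKILL.md on a match, open a reference file only when needed. It is not how any particular agent implements this. The names `SkillEntry`, `build_index`, `load_skill` and `load_reference` are hypothetical, and the frontmatter reader is a hand-rolled stand-in for a real YAML parser.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SkillEntry:
    name: str          # folder name, e.g. "code-review"
    description: str   # the only text kept in context at startup
    path: Path         # where the full SKILL.md lives


def read_description(skill_file: Path) -> str:
    """Pull the description out of the frontmatter (simplified, not full YAML)."""
    lines = skill_file.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for line in lines[1:]:
        if line.strip() == "---":          # end of frontmatter
            break
        key, _, value = line.partition(":")
        if key.strip() == "description":
            return value.strip().strip('"')
    return ""


def build_index(skills_dir: Path) -> list[SkillEntry]:
    """Startup: collect names and descriptions only."""
    return [
        SkillEntry(f.parent.name, read_description(f), f)
        for f in sorted(skills_dir.glob("*/SKILL.md"))
    ]


def load_skill(entry: SkillEntry) -> str:
    """Level one: read the full instructions once the task matches."""
    return entry.path.read_text(encoding="utf-8")


def load_reference(entry: SkillEntry, relative: str) -> str:
    """Level two: follow a link to a reference file only when needed."""
    return (entry.path.parent / relative).read_text(encoding="utf-8")


# Usage: build a throwaway skill tree, then walk the levels.
root = Path(tempfile.mkdtemp())
skill_dir = root / "code-review"
(skill_dir / "reference").mkdir(parents=True)
(skill_dir / "SKILL.md").write_text(
    '---\ndescription: "Review code changes"\n---\n# Code Review\n',
    encoding="utf-8",
)
(skill_dir / "reference" / "solid-checklist.md").write_text(
    "SOLID notes", encoding="utf-8"
)

index = build_index(root)             # startup: descriptions only
full_text = load_skill(index[0])      # task matches: load SKILL.md
notes = load_reference(index[0], "reference/solid-checklist.md")  # dig deeper
```

Only `index` exists before a task arrives. The skill body and the reference are read later, and only for the skill that matched.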

Skills vs specialized agents
I covered specialized agents already: isolated workers with their own prompt and tool set. Great for parallel work and clean context boundaries.
Specialized agents are coarse. One agent, one role, one fixed prompt. If you want three kinds of review, you either write three agents or stuff one agent with everything.
Skills are finer. One agent, many skills. The right skill loads for the task. Context stays small. Quality stays high.
Rule of thumb:
- Use a skill when you need a procedure or pattern. From phel-lang: /gh-issue (issue to PR), /commit (conventional commit), /refactor-check (SOLID review).
- Use an agent when you need isolation. From phel-lang: tdd-coach (TDD pairing), clean-code-reviewer (PR review), domain-architect (architecture exploration).
Most needs are skills, not agents.
Agents give you speed. Skills give you quality. If you must pick one first, pick skills.
Skills are your edge
Models improve every month. This year’s best is next year’s baseline. The major families converge. Better tool shows up, you switch.
Your skills don’t switch with the tool. They encode your domain, conventions, architecture. They live in your repo. They travel with your code. Point a new model at the library, productive day one.
The agent is replaceable. Your skills are not.
Start with the first repeated prompt
You don’t need 20 skills on day one.
Zero. Then one.
The signal is repetition. The second time you type the same context, that’s a skill waiting. Extract it into a markdown file. Next session, the agent knows.
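If you keep your prompts somewhere, the repetition signal is easy to check. A rough Python sketch, assuming you have past prompts as a list of strings; `skill_candidates` and the threshold of two are my own illustration, not part of any tool.

```python
from collections import Counter


def normalize(prompt: str) -> str:
    """Lowercase and collapse whitespace so near-identical prompts match."""
    return " ".join(prompt.lower().split())


def skill_candidates(prompts: list[str], threshold: int = 2) -> list[str]:
    """Return normalized prompts typed at least `threshold` times."""
    counts = Counter(normalize(p) for p in prompts)
    return [text for text, n in counts.items() if n >= threshold]


# Usage with made-up prompts: the first two differ only in case and spacing.
history = [
    "Read the issue, branch, TDD, open the PR",
    "read the issue,  branch, TDD, open the PR",
    "Rename this variable",
]
candidates = skill_candidates(history)
```

Exact matching after normalization is crude. It misses paraphrases, but it is enough to surface the prompts you paste verbatim.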
Concrete example. On phel-lang, I kept pasting the same brief every session: read issue #N, branch from the labels, TDD, open the PR. Third repeat, I extracted it into a /gh-issue skill. Now I type /gh-issue 142 and the agent picks up the issue, creates fix/... or feat/... from the labels, writes the failing test first, implements, opens the PR. One markdown file. The session no longer starts from zero.
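The "branch from the labels" step is the kind of rule worth spelling out in the skill. A small Python sketch of that decision, under assumptions: the label names `bug` and `enhancement`, the `feat` fallback, and the function `branch_prefix` are hypothetical, not taken from the phel-lang skill. It returns only the prefix, since the rest of the branch name is up to the skill.

```python
# Hypothetical label-to-prefix table; the label names are assumptions.
PREFIX_BY_LABEL = {
    "bug": "fix",
    "enhancement": "feat",
}


def branch_prefix(labels: list[str], default: str = "feat") -> str:
    """Return the branch prefix for the first label that has a mapping."""
    for label in labels:
        prefix = PREFIX_BY_LABEL.get(label.lower())
        if prefix is not None:
            return prefix
    return default  # assumed fallback when no label matches


# Usage: an issue labelled as a bug gets the "fix" prefix.
prefix = branch_prefix(["docs", "bug"])
```

In the skill itself this is one sentence of markdown. The code only shows that the rule is deterministic enough to write down once.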
Don’t write from scratch. Ask the agent: “Read this project and draft a minimal code review skill based on what you see.” It scans, picks up conventions, drafts v1. Then you adjust. Add what it missed. Cut what doesn’t apply. Sharpen the description.
The second skill usually comes from a mistake. Agent breaks a convention. Write a skill that teaches the correct approach. It won’t happen again.
Skills add up. Each one lifts the baseline. A markdown file, maybe 50 lines. Permanent payoff.
People who don’t write skills keep re-explaining what they “really want.” Every session from zero. Not a tool problem. A knowledge management problem.
The agent ships next year. The skill ships forever.
Write the skill once. Every session after that starts where the last one ended.
