COLLEAGUE.SKILL — two-track skill packages from one expert trace
AgentThe news. On May 29, 2026, a team posted COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation. The pitch: agents are increasingly asked to carry "bounded representations of human expertise, judgment, and interaction style," but that knowledge lives in messy traces, not clean instructions. The system distills those traces into a versioned skill package with two coordinated tracks — a capability track for practices and decision heuristics, and a bounded behavior track for communication style and correction history — that you can inspect, invoke, correct in natural language, roll back, and install across agent hosts. The authors point to a public skill gallery (reportedly ~215 skills from ~165 contributors) as evidence the packaging format already has traction. Read the paper →
Picture the master chef working the line for one dinner service. They taste, they adjust the salt, they plate fast, they tell the runner exactly how to describe the dish. If you only filmed it once, you'd have everything you need to reproduce the meal — but it's all tangled together. The recipe (sear three minutes a side, rest until it springs back) is mixed in with the plating (sauce under, not over; one sprig, not three) and the table-side patter. A new cook can't just replay the tape; they need it written down. And crucially, the steps and the style are different kinds of knowledge — one is procedure, the other is presentation.
That split is the whole idea. COLLEAGUE.SKILL reads the expert trace and sorts it onto two cards-in-one: the capability track is the recipe — what to do, the heuristics and decision rules ("check the tax line before you trust the total"). The behavior track is the plating and patter — how to do it, the tone and interaction style ("lead with the bottom line, stay terse, don't hedge"). Keeping them on separate tracks is what lets you tune one without disturbing the other: you can sharpen a decision rule without rewriting the voice, or soften the tone without touching the logic. This is exactly the "skills as a primitive" framing the Tool Use module's Skills step builds toward — and it changes what you decide to load into an agent's context, because a package is a single, nameable thing to install rather than a wall of prose to paste.
The payoff is portability. Because the package is plain text rather than weights, the same card cooks in any kitchen — you install it unchanged on a CLI harness, an IDE assistant, or a server-side orchestrator. When the chef changes their mind, you don't re-shoot the film: you scribble a fix in the margin in plain English, the package rolls from v1 to v2, and a bad edit reverts like any other commit. Compared with the usual options, that's a real shift in where expertise lives:
| Approach | Where expertise lives | Inspect & edit? | Versioned / rollback? | Portable across hosts? |
|---|---|---|---|---|
| Fine-tuning | Model weights | No — opaque | Only by retraining | No — tied to that model |
| Monolithic system prompt | One prose blob | Yes, but what/how are tangled | Ad hoc (copy-paste) | Partly — must hand-port |
| Skill package (capability + behavior) | Versioned text file | Yes — two clean tracks | Yes — v1 → v2, revert | Yes — install unchanged |
Where does that portability earn its keep? Say a team has 5 expert workflows they want running across 4 agent hosts (illustrative). Hand-porting means writing and re-syncing a bespoke prompt for every pairing — 5 × 4 = 20 bespoke prompts, and every time one expert refines a rule you re-edit it in four places. As skill packages it's 5 portable packages, each installed unchanged on all four hosts, and a refinement is one margin edit, versioned once — not four silent re-syncs that drift out of agreement. The win isn't a benchmark number (the paper is a systems contribution, not an accuracy result); it's that expertise stops being copy-pasted prose and starts behaving like code.
Goes deeper in: AI Agents → Tool Use & Function Calling → Skills
Related explainers
This was a single-concept pick, so it has no sibling pages from the same news item — but these existing explainers sit close by:
- EnvFactory — synthesizing tool environments — generating the environments agents practice their skills in.
- Tool Router — a contextual bandit over tools — choosing which packaged capability to invoke at run time.