NeMo Platform Skills Spec
Status: Draft.
This document defines the conventions for skills shipped with NeMo Platform: how they’re structured, what their frontmatter must contain, how they get tested, and where they live in the repo.
TL;DR: what are we actually solving for?
NeMo Platform is positioned as “the toolkit for safer, more accurate, cheaper agents.” Two distinct populations of users hit that pitch with very different starting points:
- The user who doesn’t have an agent yet. They have a problem and a use case. They want NeMo Platform to take them from idea → deployed working agent.
- The user who already has a deployed agent. They have metrics that aren’t good enough. They want NeMo Platform to make their agent measurably better.
Today’s catalog half-serves both. There is no top-level orchestrator for the build journey (users have to guess what to do once setup completes), and no overview, theory, or outer loop for the optimization journey (users have per-technique skills but no map of when to use which).
The two skills overhauls in flight this weekend fix this:
- Lifecycle skills serve the first population. Build journey.
- Optimization skills serve the second population. Improvement journey.
They don’t overlap. We need both. This doc proposes a unified spec so future skill authors aren’t choosing between two conventions.
The two user journeys, in detail
Journey A: “I want to build an agent for my use case”
Entry point: a coding agent (Claude Code, Cursor, Codex, OpenCode) opened inside a freshly installed NeMo Platform repo. The user has just run make bootstrap and nemo setup in their shell.
What they care about:
- Going from a fuzzy idea to a working, testable agent in their preferred coding agent.
- Not having to read every doc.
- The CLI telling them what to do next at every junction, not dumping them at a chat playground.
What they don’t care about:
- The platform’s internal architecture.
- Optimization theory.
- Which microservice does what.
The skills that serve them:
Where this journey breaks today (post-install CTA): after nemo setup finishes, the CLI tells the user to chat with the built-in calculator agent. That’s the wrong handoff for Journey A: the user came here to build their agent, not chat with someone else’s. The CTA should be goal-oriented (“what do you want to build?” or “ask me to design an agent for X”). The CLI is the journey transition point, not the README. Most users never read the README all the way through.
Journey B: “I have an agent. Make it better.”
Entry point: a coding agent opened inside a NeMo Platform repo where they have a deployed agent. Metrics aren’t where they want them.
What they care about:
- Knowing what kinds of optimization exist and when each applies.
- A reproducible loop (baseline → analyze → suggest → apply → re-evaluate → promote).
- Honest measurement: did the change actually help, or did I imagine it?
What they don’t care about:
- How to deploy an agent (they already have one).
- The first-touch flow.
The skills that serve them:
Why both matter
The two journeys are sequential, not parallel. A user typically does A first, then B weeks or months later. Without A, no one has an agent to optimize. Without B, agents stay at their initial quality forever.
Without a coherent post-setup CTA, the user finishes A and never starts B. They get stuck at the calculator chat and walk away.
Required frontmatter fields
Every skill ships with the following YAML frontmatter at the top of SKILL.md. Coding agents (Claude Code, Cursor, Codex, OpenCode) read these fields directly: description and triggers drive routing, not-for prevents collisions, allowed-tools tells the agent harness which tools the skill expects to invoke.
Library-prefix naming (nemo-*) is required for user-invocable skills: skills install into shared agent catalogs alongside skills from other plugins, and the prefix prevents collisions. It’s optional for internal helpers.
Canonical location: packages/nemo_platform_ext/src/nemo_platform_ext/skills/<name>/. Skills there ship with pip install nemo-platform[all].
The merged frontmatter, by example
Playbook: how to write a new skill
This is the same checklist that scripts/skill-cli-lint.py and scripts/skill-test.py enforce at PR time.
1. Decide if it’s user-invocable
- User-invocable: the user (or their coding agent) directly says “do X” and this skill answers. Examples: nemo-build-agent, nemo-status.
- Internal helper: only invoked by other skills. Examples: shared validators, template lookups.
If user-invocable, the nemo-* prefix is required. If internal, optional.
2. Pick the name
- Kebab-case. nemo-<verb>-<noun> or nemo-<noun> for user-invocable. Verb is preferred for action skills (nemo-build-agent, not nemo-agent-build).
- Run nemo skills list to confirm no existing name collides.
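The naming rules above can be sketched as a small check. This is an illustration only, not the logic of any shipped script: the function name check_skill_name, the exact kebab-case regex, and passing the existing names in as a set (rather than reading them from nemo skills list) are all assumptions.

```python
import re

# Assumed definition of kebab-case: lowercase alphanumeric segments
# joined by single hyphens. The spec does not pin down the exact pattern.
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def check_skill_name(name: str, user_invocable: bool, existing: set[str]) -> list[str]:
    """Return a list of problems with a proposed skill name (empty means OK)."""
    problems = []
    if not KEBAB_CASE.match(name):
        problems.append("name must be kebab-case")
    if user_invocable and not name.startswith("nemo-"):
        problems.append("user-invocable skills need the nemo- prefix")
    if name in existing:
        problems.append("name collides with an existing skill")
    return problems


# `existing` is a hypothetical stand-in for the names already in the catalog.
print(check_skill_name("nemo-build-agent", True, {"nemo-status"}))  # []
print(check_skill_name("Build_Agent", True, {"nemo-status"}))
```

The check cannot tell verb-first from noun-first names (nemo-build-agent versus nemo-agent-build), so that preference still needs a human reviewer.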
3. Pick the canonical location
- User-invocable: packages/nemo_platform_ext/src/nemo_platform_ext/skills/<name>/ (ships with the platform package).
- Plugin-owned (skill is specific to a plugin): plugins/<plugin-name>/src/<plugin_module>/skills/<name>/.
- Internal-only dev skills: .agents/skills/<name>/ (gitignored from skills install by default).
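As a quick reference, the three locations above can be expressed as a path helper. The path templates come from this spec; the function name skill_dir and the kind labels ("user-invocable", "plugin", "internal") are hypothetical and exist only for this sketch.

```python
from pathlib import PurePosixPath
from typing import Optional


def skill_dir(name: str, kind: str,
              plugin_name: Optional[str] = None,
              plugin_module: Optional[str] = None) -> PurePosixPath:
    """Map a skill to its repo-relative directory, per the three cases above."""
    if kind == "user-invocable":
        return PurePosixPath("packages/nemo_platform_ext/src/nemo_platform_ext/skills") / name
    if kind == "plugin":
        # Plugin-owned skills need both the plugin directory and its module name.
        if not plugin_name or not plugin_module:
            raise ValueError("plugin skills need plugin_name and plugin_module")
        return PurePosixPath("plugins") / plugin_name / "src" / plugin_module / "skills" / name
    if kind == "internal":
        return PurePosixPath(".agents/skills") / name
    raise ValueError(f"unknown kind: {kind}")


print(skill_dir("nemo-build-agent", "user-invocable"))
```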
4. Write the frontmatter
Required fields:
- name
- description (verbose, embed natural-language trigger phrases for the LLM router)
- triggers (explicit list, ≥ 3 phrases, for the audit script)
- not-for (≥ 2 sibling-skill names with the reason to use them instead)
- maturity (alpha | beta | active | deprecated)
- license (Apache-2.0 unless overridden)
- user-invocable (true or false)
- allowed-tools (list of tools the skill expects to invoke)
Optional:
- compatibility (only when there are real environment constraints)
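The field rules above lend themselves to a mechanical check. The sketch below works on frontmatter that has already been parsed into a dict, so it stays standard-library only. It is not the implementation of scripts/skill-cli-lint.py; the function name check_frontmatter, every value in EXAMPLE (including the sibling skill nemo-optimize-agent and the tool names), and the shape of not-for entries are assumptions made for illustration.

```python
REQUIRED_FIELDS = (
    "name", "description", "triggers", "not-for",
    "maturity", "license", "user-invocable", "allowed-tools",
)
MATURITY_LEVELS = {"alpha", "beta", "active", "deprecated"}


def check_frontmatter(fm: dict) -> list[str]:
    """Return a list of spec violations for parsed frontmatter (empty means OK)."""
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in fm]
    if "triggers" in fm and len(fm["triggers"]) < 3:
        problems.append("triggers needs at least 3 phrases")
    if "not-for" in fm and len(fm["not-for"]) < 2:
        problems.append("not-for needs at least 2 sibling skills")
    if "maturity" in fm and fm["maturity"] not in MATURITY_LEVELS:
        problems.append("maturity must be one of: " + ", ".join(sorted(MATURITY_LEVELS)))
    if "user-invocable" in fm and not isinstance(fm["user-invocable"], bool):
        problems.append("user-invocable must be true or false")
    if fm.get("user-invocable") is True and not str(fm.get("name", "")).startswith("nemo-"):
        problems.append("user-invocable skills need the nemo- prefix")
    return problems


# Hypothetical frontmatter; none of these values are taken from a real skill.
EXAMPLE = {
    "name": "nemo-build-agent",
    "description": "Scaffold and deploy an agent. Use when the user says "
                   "'build an agent', 'deploy my agent', or 'scaffold an agent'.",
    "triggers": ["build an agent", "deploy my agent", "scaffold an agent"],
    "not-for": [
        "nemo-status: use it to inspect what is already deployed",
        "nemo-optimize-agent: use it to improve an existing agent",
    ],
    "maturity": "alpha",
    "license": "Apache-2.0",
    "user-invocable": True,
    "allowed-tools": ["Bash", "Read"],
}

print(check_frontmatter(EXAMPLE))  # []
```

Returning a list of problems rather than raising on the first one lets an author fix everything in a single pass.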
5. Write the SKILL.md body
- Lead with one-sentence purpose.
- Step-by-step instructions in the order the agent should run them.
- One verification step after any state-changing action. Skills must not claim success without verification.
- Lean: under 500 lines. Lift detail into references/<topic>.md if needed.
- Use real nemo CLI commands; never improvise flags. scripts/skill-cli-lint.py enforces this against nemo --help.
6. Write tests.json (four-mode routing tests)
Every skill needs four kinds of routing tests:
- Explicit: user names the skill directly. "Use nemo-build-agent to deploy my agent."
- Implicit: user describes the intent without naming the skill. "Scaffold and deploy my agent."
- Contextual: user describes the intent with surrounding situation. "I have a spec at agents/calculator-spec/AGENT-SPEC.md and a working dev cluster. Take it from here."
- Negative-control: unrelated request that should NOT route to this skill. "Set up a new Postgres database with seed data."
At least 3 examples per mode. scripts/skill-test.py runs them and fails if routing diverges.
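The coverage rule (at least 3 examples in each of the four modes) can be checked before any routing is run. The sketch below assumes a tests.json layout with one key per mode, each holding a list of prompt strings. That layout is a guess for illustration; the schema scripts/skill-test.py actually expects may differ, and the function name undercovered_modes is hypothetical.

```python
import json

MODES = ("explicit", "implicit", "contextual", "negative-control")
MIN_EXAMPLES = 3


def undercovered_modes(tests: dict) -> dict[str, int]:
    """Map each mode with fewer than MIN_EXAMPLES prompts to its current count."""
    counts = {mode: len(tests.get(mode, [])) for mode in MODES}
    return {mode: count for mode, count in counts.items() if count < MIN_EXAMPLES}


# Assumed tests.json shape, seeded with the one example per mode given above.
SAMPLE = json.loads("""
{
  "explicit": ["Use nemo-build-agent to deploy my agent."],
  "implicit": ["Scaffold and deploy my agent."],
  "contextual": ["I have a spec at agents/calculator-spec/AGENT-SPEC.md and a working dev cluster. Take it from here."],
  "negative-control": ["Set up a new Postgres database with seed data."]
}
""")

# Every mode is reported because each has only one prompt so far.
print(undercovered_modes(SAMPLE))
```

A check like this only counts prompts. Whether each prompt routes to the right skill is still what the routing run itself has to establish.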
7. Add references/ only if needed
If the skill body needs reference material (template files, troubleshooting tables, deep-dive documentation), put it under references/<topic>.md and point at it from the body. Don’t pre-emptively create references/ for short skills.
8. Local verification before PR
All four must pass before opening the PR.
9. PR conventions
- One PR per skill (or per closely-related group, e.g. all eight lifecycle skills as a single set).
- Label: skills.
- Reviewers: one platform owner + one author of an adjacent skill (collision check).
- CI: skill-cli-lint and skill-test both run automatically.
Out of scope
- The post-setup CTA in the CLI (the journey-A transition point); separate from the skills spec.
- Dependency management for NAT and related packages; covered elsewhere.
- A nemo skills new scaffolder that emits skeletons following this spec; planned as a follow-up.