NeMo Platform Skills Spec | NVIDIA NeMo Platform

Status: Draft.

This document defines the conventions for skills shipped with NeMo Platform: how they’re structured, what their frontmatter must contain, how they get tested, and where they live in the repo.

TL;DR: what are we actually solving for?

NeMo Platform is positioned as “the toolkit for safer, more accurate, cheaper agents.” Two distinct populations of users hit that pitch with very different starting points:

The user who doesn’t have an agent yet. They have a problem and a use case. They want NeMo Platform to take them from idea → deployed working agent.
The user who already has a deployed agent. They have metrics that aren’t good enough. They want NeMo Platform to make their agent measurably better.

Today’s catalog half-served both. There was no top-level orchestrator for the build journey (users had to guess after setup completed), and no overview / theory / outer-loop for the optimization journey (users had per-technique skills but no map of when to use which).

The two skills overhauls in flight this weekend fix this:

Lifecycle skills serve the first population. Build journey.
Optimization skills serve the second population. Improvement journey.

They don’t overlap. We need both. This doc proposes a unified spec so future skill authors aren’t choosing between two conventions.

The two user journeys, in detail

Journey A: “I want to build an agent for my use case”

Entry point: a coding agent (Claude Code, Cursor, Codex, OpenCode) opened inside a freshly installed NeMo Platform repo. The user has just run make bootstrap and nemo setup in their shell.

What they care about:

Going from a fuzzy idea to a working, testable agent in their preferred coding agent.
Not having to read every doc.
The CLI telling them what to do next at every junction, not dumping them at a chat playground.

What they don’t care about:

The platform’s internal architecture.
Optimization theory.
Which microservice does what.

The skills that serve them:

Step	Skill	What it does
1	`nemo-skill-selection`	Router: parses natural-language intent, picks the right downstream skill
2	`nemo-explore`	Design conversation: captures goal, audience, tools, constraints
3	`nemo-spec`	Writes the design to `agents/<name>-spec/AGENT-SPEC.md`
4	`nemo-build-agent`	Scaffolds NAT workflow YAML and deploys
5	`nemo-try-agent`	Sends queries to the deployed agent
6	`nemo-status`	Read-only platform health dashboard
7	`nemo-teardown`	Guided shutdown

Where this journey breaks today (post-install CTA): after nemo setup finishes, the CLI tells the user to chat with the built-in calculator agent. That’s the wrong handoff for Journey A: the user came here to build their agent, not chat with someone else’s. The CTA should be goal-oriented (“what do you want to build?” or “ask me to design an agent for X”). The CLI is the journey transition point, not the README. Most users never read the README all the way through.

Journey B: “I have an agent. Make it better.”

Entry point: a coding agent opened inside a NeMo Platform repo where they have a deployed agent. Metrics aren’t where they want them.

What they care about:

Knowing what kinds of optimization exist and when each applies.
A reproducible loop (baseline → analyze → suggest → apply → re-evaluate → promote).
Honest measurement: did the change actually help, or did I imagine it?

What they don’t care about:

How to deploy an agent (they already have one).
The first-touch flow.

The skills that serve them:

Step	Skill	What it does
1	`nemo-platform-overview`	Map of surfaces, plugins, and which skill owns each capability
2	`agent-optimization-concepts` (proposed rename: `nemo-optimization-concepts`)	Tool-agnostic theory: four axes (quality/cost/latency/safety), decision tree, eval-as-ground-truth
3	`optimize-loop` (proposed rename: `nemo-optimize-loop`)	Orchestrator: baseline → analyze → suggest → apply via sibling-and-deploy → re-evaluate → promote
4+	`nemo-agents-optimize`, `nemo-agent-skills-optimization`, `nemo-agents-secure`, `nemo-evaluator`, `nemo-auditor`, `nemo-guardrails`	Per-technique deep-dives, called from the loop

Why both matter

The two journeys are sequential, not parallel. A user typically does A first, then B weeks or months later. Without A, no one has an agent to optimize. Without B, agents stay at their initial quality forever.

Without a coherent post-setup CTA, the user finishes A and never starts B. They get stuck at the calculator chat and walk away.

Required frontmatter fields

Every skill ships with the following YAML frontmatter at the top of SKILL.md. Coding agents (Claude Code, Cursor, Codex, OpenCode) read these fields directly: description and triggers drive routing, not-for prevents collisions, allowed-tools tells the agent harness which tools the skill expects to invoke.

Field	Required	Purpose
`name`	Yes	Kebab-case identifier matching the directory name.
`description`	Yes	One paragraph with embedded trigger phrases. Gives the router enough semantic surface to match user intent.
`triggers`	Yes	Explicit list of natural-language phrases. Used by `scripts/skill-cli-lint.py` and the collision audit, not directly by the router.
`not-for`	Yes	What this skill is NOT (other skills it could collide with, situations to bail out of). Prevents the wrong skill from firing as the catalog grows.
`compatibility`	Optional	Real constraints only (Python version, OS, GPU, required platform components).
`maturity`	Yes	`alpha`, `beta`, `active`, or `deprecated`. Makes lifecycle explicit to users.
`license`	Yes	Default `Apache-2.0`.
`user-invocable`	Yes	`true` for skills users invoke directly; `false` for internal helpers. Distinguishes user-facing skills from internal helpers.
`allowed-tools`	Yes	List of tool names this skill is permitted to use (e.g. `[Bash, Read, Write, Edit]`). Real safety signal: the agent harness honors this.

Library-prefix naming (nemo-*) is required for user-invocable skills (skills install into shared agent catalogs alongside skills from other plugins; the prefix prevents collisions). It’s optional for internal helpers.

Canonical location: packages/nemo_platform_ext/src/nemo_platform_ext/skills/<name>/. Skills there ship with pip install nemo-platform[all].

The merged frontmatter, by example

1 ---
2 name: nemo-build-agent
3 description: >
4   Scaffold and deploy a NeMo Agent Toolkit (NAT) agent from a spec.
5   Generates the workflow YAML, creates the agent, deploys it, optionally
6   wires guardrails and an eval suite. Use when the user says "build me an
7   agent," "scaffold from this spec," "deploy my agent," or has finished
8   `nemo-explore` and is ready to ship.
9 triggers:
10   - build the agent
11   - scaffold from spec
12   - deploy my agent
13 not-for:
14   - nemo-explore (use for design conversation before a spec exists)
15   - nemo-try-agent (use to test an already-deployed agent)
16   - nemo-agents-optimize (use to improve a deployed agent's metrics)
17 compatibility: nemo-platform >= 0.1.0; macOS or Linux; requires `agents/<name>-spec/AGENT-SPEC.md`
18 maturity: active
19 license: Apache-2.0
20 user-invocable: true
21 allowed-tools: [Bash, Read, Write, Edit]
22 ---

Playbook: how to write a new skill

The same checklist that scripts/skill-cli-lint.py and scripts/skill-test.py enforce at PR time.

1. Decide if it’s user-invocable

User-invocable: the user (or their coding agent) directly says “do X” and this skill answers. Examples: nemo-build-agent, nemo-status.
Internal helper: only invoked by other skills. Examples: shared validators, template lookups.

If user-invocable, the nemo-* prefix is required. If internal, optional.

2. Pick the name

Kebab-case.
nemo-<verb>-<noun> or nemo-<noun> for user-invocable. Verb is preferred for action skills (nemo-build-agent, not nemo-agent-build).
Run nemo skills list to confirm no existing name collides.

3. Pick the canonical location

User-invocable: packages/nemo_platform_ext/src/nemo_platform_ext/skills/<name>/ (ships with the platform package).
Plugin-owned (skill is specific to a plugin): plugins/<plugin-name>/src/<plugin_module>/skills/<name>/.
Internal-only dev skills: .agents/skills/<name>/ (gitignored from skills install by default).

4. Write the frontmatter

Required fields:

name
description (verbose, embed natural-language trigger phrases for the LLM router)
triggers (explicit list, ≥ 3 phrases, for the audit script)
not-for (≥ 2 sibling-skill names with the reason to use them instead)
maturity (alpha | beta | active | deprecated)
license (Apache-2.0 unless overridden)
user-invocable (true or false)
allowed-tools (list of tools the skill expects to invoke)

Optional:

compatibility (only when there are real environment constraints)

5. Write the SKILL.md body

Lead with one-sentence purpose.
Step-by-step instructions in the order the agent should run them.
One verification step after any state-changing action. Skills must not claim success without verification.
Lean: under 500 lines. Lift detail into references/<topic>.md if needed.
Use real nemo CLI commands; never improvise flags. scripts/skill-cli-lint.py enforces this against nemo --help.

6. Write `tests.json` (four-mode routing tests)

Every skill needs four kinds of routing tests:

Explicit: user names the skill directly. "Use nemo-build-agent to deploy my agent."
Implicit: user describes the intent without naming the skill. "Scaffold and deploy my agent."
Contextual: user describes the intent with surrounding situation. "I have a spec at agents/calculator-spec/AGENT-SPEC.md and a working dev cluster. Take it from here."
Negative-control: unrelated request that should NOT route to this skill. "Set up a new Postgres database with seed data."

At least 3 examples per mode. scripts/skill-test.py runs them and fails if routing diverges.

7. Add `references/` only if needed

If the skill body needs reference material (template files, troubleshooting tables, deep-dive documentation), put it under references/<topic>.md and point at it from the body. Don’t pre-emptively create references/ for short skills.

8. Local verification before PR

$ .venv/bin/python scripts/skill-cli-lint.py --root packages/nemo_platform_ext/src/nemo_platform_ext/skills/<your-skill>
$ .venv/bin/python scripts/skill-test.py --skill <your-skill>
$ nemo skills list | grep <your-skill>
$ nemo skills show <your-skill>

All four must pass before opening the PR.

9. PR conventions

One PR per skill (or per closely-related group, e.g. all eight lifecycle skills as a single set).
Label: skills.
Reviewers: one platform owner + one author of an adjacent skill (collision check).
CI: skill-cli-lint and skill-test both run automatically.

Out of scope

The post-setup CTA in the CLI (the journey-A transition point); separate from the skills spec.
Dependency management for NAT and related packages; covered elsewhere.
A nemo skills new scaffolder that emits skeletons following this spec; planned as a follow-up.