NeMo Platform Skills Spec

View as Markdown

Status: Draft.

This document defines the conventions for skills shipped with NeMo Platform: how they’re structured, what their frontmatter must contain, how they get tested, and where they live in the repo.


TL;DR: what are we actually solving for?

NeMo Platform is positioned as “the toolkit for safer, more accurate, cheaper agents.” Two distinct populations of users hit that pitch with very different starting points:

  1. The user who doesn’t have an agent yet. They have a problem and a use case. They want NeMo Platform to take them from idea → deployed working agent.
  2. The user who already has a deployed agent. They have metrics that aren’t good enough. They want NeMo Platform to make their agent measurably better.

Today’s catalog half-served both. There was no top-level orchestrator for the build journey (users had to guess after setup completed), and no overview / theory / outer-loop for the optimization journey (users had per-technique skills but no map of when to use which).

The two skills overhauls in flight this weekend fix this:

  • Lifecycle skills serve the first population. Build journey.
  • Optimization skills serve the second population. Improvement journey.

They don’t overlap. We need both. This doc proposes a unified spec so future skill authors aren’t choosing between two conventions.


The two user journeys, in detail

Journey A: “I want to build an agent for my use case”

Entry point: a coding agent (Claude Code, Cursor, Codex, OpenCode) opened inside a freshly installed NeMo Platform repo. The user has just run make bootstrap and nemo setup in their shell.

What they care about:

  • Going from a fuzzy idea to a working, testable agent in their preferred coding agent.
  • Not having to read every doc.
  • The CLI telling them what to do next at every junction, not dumping them at a chat playground.

What they don’t care about:

  • The platform’s internal architecture.
  • Optimization theory.
  • Which microservice does what.

The skills that serve them:

StepSkillWhat it does
1nemo-skill-selectionRouter: parses natural-language intent, picks the right downstream skill
2nemo-exploreDesign conversation: captures goal, audience, tools, constraints
3nemo-specWrites the design to agents/<name>-spec/AGENT-SPEC.md
4nemo-build-agentScaffolds NAT workflow YAML and deploys
5nemo-try-agentSends queries to the deployed agent
6nemo-statusRead-only platform health dashboard
7nemo-teardownGuided shutdown

Where this journey breaks today (post-install CTA): after nemo setup finishes, the CLI tells the user to chat with the built-in calculator agent. That’s the wrong handoff for Journey A: the user came here to build their agent, not chat with someone else’s. The CTA should be goal-oriented (“what do you want to build?” or “ask me to design an agent for X”). The CLI is the journey transition point, not the README. Most users never read the README all the way through.

Journey B: “I have an agent. Make it better.”

Entry point: a coding agent opened inside a NeMo Platform repo where they have a deployed agent. Metrics aren’t where they want them.

What they care about:

  • Knowing what kinds of optimization exist and when each applies.
  • A reproducible loop (baseline → analyze → suggest → apply → re-evaluate → promote).
  • Honest measurement: did the change actually help, or did I imagine it?

What they don’t care about:

  • How to deploy an agent (they already have one).
  • The first-touch flow.

The skills that serve them:

StepSkillWhat it does
1nemo-platform-overviewMap of surfaces, plugins, and which skill owns each capability
2agent-optimization-concepts (proposed rename: nemo-optimization-concepts)Tool-agnostic theory: four axes (quality/cost/latency/safety), decision tree, eval-as-ground-truth
3optimize-loop (proposed rename: nemo-optimize-loop)Orchestrator: baseline → analyze → suggest → apply via sibling-and-deploy → re-evaluate → promote
4+nemo-agents-optimize, nemo-agent-skills-optimization, nemo-agents-secure, nemo-evaluator, nemo-auditor, nemo-guardrailsPer-technique deep-dives, called from the loop

Why both matter

The two journeys are sequential, not parallel. A user typically does A first, then B weeks or months later. Without A, no one has an agent to optimize. Without B, agents stay at their initial quality forever.

Without a coherent post-setup CTA, the user finishes A and never starts B. They get stuck at the calculator chat and walk away.


Required frontmatter fields

Every skill ships with the following YAML frontmatter at the top of SKILL.md. Coding agents (Claude Code, Cursor, Codex, OpenCode) read these fields directly: description and triggers drive routing, not-for prevents collisions, allowed-tools tells the agent harness which tools the skill expects to invoke.

FieldRequiredPurpose
nameYesKebab-case identifier matching the directory name.
descriptionYesOne paragraph with embedded trigger phrases. Gives the router enough semantic surface to match user intent.
triggersYesExplicit list of natural-language phrases. Used by scripts/skill-cli-lint.py and the collision audit, not directly by the router.
not-forYesWhat this skill is NOT (other skills it could collide with, situations to bail out of). Prevents the wrong skill from firing as the catalog grows.
compatibilityOptionalReal constraints only (Python version, OS, GPU, required platform components).
maturityYesalpha, beta, active, or deprecated. Makes lifecycle explicit to users.
licenseYesDefault Apache-2.0.
user-invocableYestrue for skills users invoke directly; false for internal helpers. Distinguishes user-facing skills from internal helpers.
allowed-toolsYesList of tool names this skill is permitted to use (e.g. [Bash, Read, Write, Edit]). Real safety signal: the agent harness honors this.

Library-prefix naming (nemo-*) is required for user-invocable skills (skills install into shared agent catalogs alongside skills from other plugins; the prefix prevents collisions). It’s optional for internal helpers.

Canonical location: packages/nemo_platform_ext/src/nemo_platform_ext/skills/<name>/. Skills there ship with pip install nemo-platform[all].


The merged frontmatter, by example

1---
2name: nemo-build-agent
3description: >
4 Scaffold and deploy a NeMo Agent Toolkit (NAT) agent from a spec.
5 Generates the workflow YAML, creates the agent, deploys it, optionally
6 wires guardrails and an eval suite. Use when the user says "build me an
7 agent," "scaffold from this spec," "deploy my agent," or has finished
8 `nemo-explore` and is ready to ship.
9triggers:
10 - build the agent
11 - scaffold from spec
12 - deploy my agent
13not-for:
14 - nemo-explore (use for design conversation before a spec exists)
15 - nemo-try-agent (use to test an already-deployed agent)
16 - nemo-agents-optimize (use to improve a deployed agent's metrics)
17compatibility: nemo-platform >= 0.1.0; macOS or Linux; requires `agents/<name>-spec/AGENT-SPEC.md`
18maturity: active
19license: Apache-2.0
20user-invocable: true
21allowed-tools: [Bash, Read, Write, Edit]
22---

Playbook: how to write a new skill

The same checklist that scripts/skill-cli-lint.py and scripts/skill-test.py enforce at PR time.

1. Decide if it’s user-invocable

  • User-invocable: the user (or their coding agent) directly says “do X” and this skill answers. Examples: nemo-build-agent, nemo-status.
  • Internal helper: only invoked by other skills. Examples: shared validators, template lookups.

If user-invocable, the nemo-* prefix is required. If internal, optional.

2. Pick the name

  • Kebab-case.
  • nemo-<verb>-<noun> or nemo-<noun> for user-invocable. Verb is preferred for action skills (nemo-build-agent, not nemo-agent-build).
  • Run nemo skills list to confirm no existing name collides.

3. Pick the canonical location

  • User-invocable: packages/nemo_platform_ext/src/nemo_platform_ext/skills/<name>/ (ships with the platform package).
  • Plugin-owned (skill is specific to a plugin): plugins/<plugin-name>/src/<plugin_module>/skills/<name>/.
  • Internal-only dev skills: .agents/skills/<name>/ (gitignored from skills install by default).

4. Write the frontmatter

Required fields:

  • name
  • description (verbose, embed natural-language trigger phrases for the LLM router)
  • triggers (explicit list, ≥ 3 phrases, for the audit script)
  • not-for (≥ 2 sibling-skill names with the reason to use them instead)
  • maturity (alpha | beta | active | deprecated)
  • license (Apache-2.0 unless overridden)
  • user-invocable (true or false)
  • allowed-tools (list of tools the skill expects to invoke)

Optional:

  • compatibility (only when there are real environment constraints)

5. Write the SKILL.md body

  • Lead with one-sentence purpose.
  • Step-by-step instructions in the order the agent should run them.
  • One verification step after any state-changing action. Skills must not claim success without verification.
  • Lean: under 500 lines. Lift detail into references/<topic>.md if needed.
  • Use real nemo CLI commands; never improvise flags. scripts/skill-cli-lint.py enforces this against nemo --help.

6. Write tests.json (four-mode routing tests)

Every skill needs four kinds of routing tests:

  • Explicit: user names the skill directly. "Use nemo-build-agent to deploy my agent."
  • Implicit: user describes the intent without naming the skill. "Scaffold and deploy my agent."
  • Contextual: user describes the intent with surrounding situation. "I have a spec at agents/calculator-spec/AGENT-SPEC.md and a working dev cluster. Take it from here."
  • Negative-control: unrelated request that should NOT route to this skill. "Set up a new Postgres database with seed data."

At least 3 examples per mode. scripts/skill-test.py runs them and fails if routing diverges.

7. Add references/ only if needed

If the skill body needs reference material (template files, troubleshooting tables, deep-dive documentation), put it under references/<topic>.md and point at it from the body. Don’t pre-emptively create references/ for short skills.

8. Local verification before PR

$.venv/bin/python scripts/skill-cli-lint.py --root packages/nemo_platform_ext/src/nemo_platform_ext/skills/<your-skill>
$.venv/bin/python scripts/skill-test.py --skill <your-skill>
$nemo skills list | grep <your-skill>
$nemo skills show <your-skill>

All four must pass before opening the PR.

9. PR conventions

  • One PR per skill (or per closely-related group, e.g. all eight lifecycle skills as a single set).
  • Label: skills.
  • Reviewers: one platform owner + one author of an adjacent skill (collision check).
  • CI: skill-cli-lint and skill-test both run automatically.

Out of scope

  • The post-setup CTA in the CLI (the journey-A transition point); separate from the skills spec.
  • Dependency management for NAT and related packages; covered elsewhere.
  • A nemo skills new scaffolder that emits skeletons following this spec; planned as a follow-up.