Optimize Agents
Use the Agent Optimizer to analyze a deployed agent and act on improvement suggestions. The optimizer inspects the agent’s config, the workspace model catalog, any prior optimizer snapshots, and optional evaluation baselines, then writes suggestions you can review from the CLI or hand off to a coding agent.
This page covers the main path: establish a baseline, generate optimization suggestions, apply a candidate change to a sibling agent, and review the evaluation result before promotion.
What the Optimizer Checks
Optimizer state is stored in the nemo-agent-optimizer fileset:
- optimizer_suggestions.jsonl: one suggestion per line, including applied state.
- optimizer_snapshot.json: model and agent names from the latest run.
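If you download optimizer_suggestions.jsonl for offline review, a few lines of Python are enough to separate pending suggestions from applied ones. The sketch below is illustrative only: the `applied` and `id` field names are assumptions, so check a real line from your file before relying on them.

```python
import json


def parse_suggestions(lines):
    """Parse JSONL text: one JSON object per non-empty line."""
    suggestions = []
    for line in lines:
        line = line.strip()
        if line:
            suggestions.append(json.loads(line))
    return suggestions


def pending(suggestions):
    """Keep suggestions not marked applied ("applied" is an assumed field name)."""
    return [s for s in suggestions if not s.get("applied", False)]


# Hypothetical sample lines; real records will carry more fields.
sample = [
    '{"id": "s1", "applied": true}',
    "",
    '{"id": "s2", "applied": false}',
    '{"id": "s3"}',
]
todo = pending(parse_suggestions(sample))
print([s["id"] for s in todo])
```

To read a downloaded file, pass an open file handle instead of the sample list, since a file object yields its lines one at a time.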
Security-oriented suggestions such as missing guardrails, PII exposure, or leaked secrets are covered in Secure Agents.
Prerequisites
Before running the optimizer, make sure you have:
- Local services running (nemo services run).
- The agents plugin installed. For local development from this repository:
- A workspace with at least one model provider and discovered model entities.
- At least one deployed platform-managed agent.
- An evaluation baseline before promoting a candidate agent.
If you need a demo agent, start the platform and create the ReAct example:
In another terminal:
The example agent uses nvidia-nemotron-3-nano-30b-a3b, so it can produce a
model optimization suggestion when the workspace model catalog contains a
smaller compatible model.
Optimize with Switchyard Routing
Switchyard is the inference middleware that lets a virtual model split traffic across multiple backend models. The common optimization pattern is to create a virtual model with a strong model and a weaker, cheaper model, then evaluate whether the route split preserves application quality.
Run nemo models list first and replace the placeholders below with model
entity names from your workspace that use the OPENAI_CHAT backend format.
The command below creates a virtual model that sends 80% of traffic to the strong model and 20% to the weak one.
Before wiring the virtual model to an agent, smoke-test the route by
making several minimal chat-completions calls and checking the returned
model name. The observed split should roughly match strong_probability.
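Once you have collected the model names returned by the smoke-test calls, the comparison against strong_probability can be done in a few lines. This is a minimal sketch that works on a plain list of names; it does not make the chat-completions calls itself, and the model names and the 0.15 tolerance are placeholder assumptions rather than platform defaults.

```python
from collections import Counter


def observed_share(returned_models, strong_model):
    """Fraction of responses that were served by strong_model."""
    if not returned_models:
        raise ValueError("no responses to count")
    counts = Counter(returned_models)
    return counts[strong_model] / len(returned_models)


def split_looks_right(returned_models, strong_model, strong_probability,
                      tolerance=0.15):
    """True when the observed share is within tolerance of the configured one."""
    share = observed_share(returned_models, strong_model)
    return abs(share - strong_probability) <= tolerance


# Placeholder names standing in for the model field of each response.
names = ["strong-model"] * 8 + ["weak-model"] * 2
print(observed_share(names, "strong-model"))
print(split_looks_right(names, "strong-model", strong_probability=0.8))
```

With only a handful of calls the observed share is noisy, so a loose tolerance or a larger sample is usually needed before concluding that the route is misconfigured.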
Optimize Skills
Skill optimization applies when the agent depends on local skill files and has an evaluation suite. The loop runs evaluations, analyzes failures, lets the coding agent edit only the configured skills directory, reruns verification, and keeps the change only when the evaluation result improves.
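The accept-only-if-better rule in that loop can be sketched as follows. This is not the plugin's implementation: `evaluate`, `propose_edit`, and `revert_edit` are hypothetical hooks standing in for the evaluation run, the coding agent's edit to the skills directory, and the rollback.

```python
def improve_skills(evaluate, propose_edit, revert_edit, max_rounds=3):
    """Keep an edit only when the evaluation score strictly improves."""
    best = evaluate()          # baseline score before any edit
    history = [best]
    for _ in range(max_rounds):
        propose_edit()         # hypothetical: coding agent edits the skills
        score = evaluate()     # rerun verification after the edit
        if score > best:
            best = score       # improvement: keep the change
        else:
            revert_edit()      # no improvement: discard the change
        history.append(best)
    return best, history


# Simulated scores: baseline 0.5, then three candidate edits.
scores = iter([0.5, 0.4, 0.7, 0.6])
reverted = []
best, history = improve_skills(
    evaluate=lambda: next(scores),
    propose_edit=lambda: None,
    revert_edit=lambda: reverted.append(True),
)
print(best, history, len(reverted))
```

Comparing with a strict `>` means a change that merely ties the current best is discarded, which matches the "keeps the change only when the evaluation result improves" wording.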
Set open_pr: true in the YAML when you want the loop to prepare a
reviewable branch.
A sample .agent-improver.yml is in
plugins/nemo-agents/examples/agent-improver.example.yml.
Inspect Saved Results
Use the Files service to inspect what the optimizer saved:
Telemetry is optional. If agents use the nemo_files telemetry exporter, trace
files are written to nemo-agent-telemetry, and the optimizer samples the
largest JSONL file:
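To predict which trace file would be sampled from a local copy of the fileset, the "largest JSONL file" rule can be approximated as below. This is a sketch under assumptions: it treats "largest" as largest by byte size, looks only at the top level of the directory, and uses a hypothetical local path.

```python
from pathlib import Path


def largest_jsonl(directory):
    """Return the largest *.jsonl file in directory, or None if there is none."""
    candidates = [p for p in Path(directory).glob("*.jsonl") if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_size)


# "./nemo-agent-telemetry" is a hypothetical local download location.
chosen = largest_jsonl("./nemo-agent-telemetry")
print(chosen if chosen else "no JSONL trace files found")
```

A `None` result corresponds to the case where no trace files exist, in which case telemetry-based suggestions have nothing to scan.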
Run Prompt and Parameter Tuning
The nemo agents optimize run command runs the NAT optimizer path for
parameter or prompt tuning. Use it when you already have a NAT optimization
YAML and want to run nat optimize through the Agents plugin.
For the ReAct example:
When --agent is a platform-managed agent name, the job fetches the stored
agent config, merges it with the optimization config, injects the Inference
Gateway URL, and runs trials locally. When --agent is a raw HTTP endpoint,
the endpoint is treated as an opaque remote service, so local parameter sweeps
do not change the remote agent behavior.
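The two --agent modes can be pictured with a short sketch. Everything here is an assumption for illustration: the plugin may distinguish names from endpoints differently, the merge precedence (optimization config wins over the stored config) is a guess, and the config keys are invented.

```python
from urllib.parse import urlparse


def is_raw_endpoint(agent):
    """Treat values with an http(s) scheme and a host as raw endpoints."""
    parsed = urlparse(agent)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def merge_configs(stored, overrides):
    """Recursively merge two dicts; values in overrides win (assumed order)."""
    merged = dict(stored)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical config fragments; real agent configs use their own keys.
stored = {"llm": {"model": "model-a", "temperature": 0.0}}
tuning = {"llm": {"temperature": 0.5}}

for agent in ("react-agent", "http://localhost:8000/generate"):
    if is_raw_endpoint(agent):
        print(agent, "-> opaque remote service, no local config merge")
    else:
        print(agent, "->", merge_configs(stored, tuning))
```

The practical point is the asymmetry: only the platform-managed path has a stored config to merge, so only that path can vary parameters between trials.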
Troubleshooting
No suggestions appear. Confirm the workspace has agents, model entities, and a model catalog entry smaller than the agent’s current model. New-model suggestions require a previous optimizer snapshot, so they do not appear on the first run.
The model evaluation fails. Confirm the judge model in the eval config is available through the workspace Inference Gateway. You can replace the eval files in <agent-name>-eval with your own evaluation config and dataset.
Data safety suggestions do not appear. Telemetry is optional. The optimizer only scans nemo-agent-telemetry when that fileset exists and contains JSONL trace files.
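The first-run behavior in the first entry follows from how a new-model check has to work: without an earlier snapshot there is nothing to compare the catalog against. A minimal sketch, assuming the snapshot reduces to a list of model names (the actual snapshot schema may differ):

```python
def new_models(catalog, snapshot):
    """Models in the current catalog that the previous snapshot did not list.

    snapshot is None on the first run, so nothing can be reported as new.
    """
    if snapshot is None:
        return []
    return sorted(set(catalog) - set(snapshot))


# Placeholder model names for illustration.
print(new_models(["model-a", "model-b"], None))
print(new_models(["model-a", "model-b"], ["model-a"]))
```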
Next Steps
- Agent overview: review how platform-managed agents are registered, deployed, invoked, evaluated, and optimized.
- Agent evaluation: configure agents as online evaluation targets and choose the right agent response mapping.
- CLI reference: look up complete command options and global CLI flags for scripted workflows.