Local and Subprocess Execution

Run NeMo Safe Synthesizer on your machine’s GPU with nemo safe-synthesizer run-local. The public command is a local subprocess wrapper: the main NeMo CLI starts a separate Safe Synthesizer runtime Python interpreter, and that interpreter executes the synthesis task module.

This page covers local execution only. Platform job submission uses the Jobs API or SDK; the nemo safe-synthesizer CLI exposes only the run-local and runtime commands.

Prerequisites

  • CUDA-capable NVIDIA GPU on the host (80GB+ VRAM recommended; check with nvidia-smi). See Getting Started.
  • NeMo Platform repository checkout with the Safe Synthesizer plugin installed.
  • A running platform is not required for a typical local run when you pass --data-source; the NSS runtime can download base models from Hugging Face directly.

Install the plugin and set up the runtime:

# From the NeMo Platform repository root
$ BOOTSTRAP_LOCAL_PLUGIN_DIRS=plugins/nemo-safe-synthesizer make bootstrap-python
$ uv run nemo safe-synthesizer runtime setup
$ uv run nemo safe-synthesizer runtime info
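
The VRAM recommendation in the prerequisites can be checked from a script as well as by eye. The sketch below parses the output of `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`, which reports one total per GPU in MiB. The helper names and the 80,000 MiB threshold are assumptions made for illustration (cards sold as "80GB" report slightly different totals), not part of the plugin.

```python
import subprocess
from typing import List

# Real nvidia-smi query flags; prints one integer (MiB) per GPU.
QUERY = ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"]


def parse_total_mib(output: str) -> List[int]:
    """Return the total memory of each GPU in MiB, skipping unparsable lines."""
    totals = []
    for line in output.splitlines():
        value = line.strip()
        if value.isdigit():
            totals.append(int(value))
    return totals


def has_recommended_vram(output: str, min_mib: int = 80_000) -> bool:
    """True if at least one GPU meets the (assumed, approximate) threshold."""
    return any(total >= min_mib for total in parse_total_mib(output))


def query_gpus() -> str:
    """Run nvidia-smi on the host; raises if the tool is missing or fails."""
    return subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
```

On a GPU host, `has_recommended_vram(query_gpus())` gives a quick yes/no before starting a long run.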

Confirm the CLI surface:

$ uv run nemo safe-synthesizer --help
# Commands: run-local, runtime

Execution modes

There are two local paths:

| Mode | Command | Use it when |
| --- | --- | --- |
| Managed local subprocess | uv run nemo safe-synthesizer run-local ... | You want the supported plugin CLI. This creates the parent CLI process, then launches the runtime Python subprocess. |
| Direct local task | <runtime-python> -m nemo_safe_synthesizer_plugin.tasks.safe_synthesizer run-local ... | You are debugging the task process itself and want to bypass the parent CLI wrapper. |

Both modes run on the host GPU and write artifacts to the local filesystem. Both accept the same task arguments: --spec-file, --workspace, --output-dir, and optional --data-source.

Run with the managed local subprocess

Use a job spec JSON (example in plugins/nemo-safe-synthesizer/src/nemo_safe_synthesizer_plugin/nss-job.json) and a local input file:

$ uv run nemo safe-synthesizer run-local \
    --workspace default \
    --spec-file ./nss-job.json \
    --data-source ./input.csv \
    --output-dir ./nss-output

| Flag | Role |
| --- | --- |
| --spec-file | Job spec JSON (data_source, config, …) |
| --data-source | Local CSV (or other supported file) used instead of downloading from data_source in the spec |
| --output-dir | Where artifacts are written (default ./nss-output) |
| --workspace | Workspace label for spec fields that reference workspaces (default default) |

The parent command launches a subprocess equivalent to:

$ <runtime-python> -m nemo_safe_synthesizer_plugin.tasks.safe_synthesizer run-local \
    --workspace default \
    --spec-file ./nss-job.json \
    --data-source ./input.csv \
    --output-dir ./nss-output
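
For scripting or debugging, the same command line can be assembled in Python. This is a minimal sketch, not the plugin's actual launcher: `build_task_argv` and `run_task` are hypothetical helpers, and only the module name, subcommand, flags, and defaults shown on this page are used.

```python
import subprocess
from typing import List, Optional

# Task module named in the command above.
TASK_MODULE = "nemo_safe_synthesizer_plugin.tasks.safe_synthesizer"


def build_task_argv(
    runtime_python: str,
    spec_file: str,
    workspace: str = "default",
    output_dir: str = "./nss-output",
    data_source: Optional[str] = None,
) -> List[str]:
    """Assemble the direct task command line (hypothetical helper)."""
    argv = [
        runtime_python, "-m", TASK_MODULE, "run-local",
        "--workspace", workspace,
        "--spec-file", spec_file,
    ]
    # --data-source is optional; omit it to use data_source from the spec.
    if data_source is not None:
        argv += ["--data-source", data_source]
    argv += ["--output-dir", output_dir]
    return argv


def run_task(argv: List[str]) -> int:
    """Run the task as a child process and return its exit code."""
    return subprocess.run(argv, check=False).returncode
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.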

Find the configured runtime Python with:

$ uv run nemo safe-synthesizer runtime info

Run the local task directly

Direct task execution is useful when you need to reproduce a subprocess failure without the parent CLI wrapper.

$ $(uv run nemo safe-synthesizer runtime info | awk -F': ' '/^python:/ {print $2}') \
    -m nemo_safe_synthesizer_plugin.tasks.safe_synthesizer run-local \
    --workspace default \
    --spec-file ./nss-job.json \
    --data-source ./input.csv \
    --output-dir ./nss-output

If the runtime Python does not exist, run uv run nemo safe-synthesizer runtime setup first.
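
The awk one-liner above implies that runtime info prints a line of the form `python: <path>`. Under that assumption, a Python equivalent might look like the sketch below. The function names are hypothetical, and the exact output format of runtime info is inferred from the awk pattern rather than documented here.

```python
from pathlib import Path
from typing import Optional


def parse_runtime_python(info_text: str) -> Optional[str]:
    """Return the value of the first 'python: ...' line, or None if absent."""
    for line in info_text.splitlines():
        if line.startswith("python:"):
            parts = line.split(": ", 1)  # split once, so the path stays whole
            if len(parts) == 2 and parts[1].strip():
                return parts[1].strip()
    return None


def runtime_python_ready(info_text: str) -> bool:
    """True if runtime info names an interpreter that exists on disk."""
    path = parse_runtime_python(info_text)
    return path is not None and Path(path).is_file()
```

If `runtime_python_ready` returns False, running runtime setup as described above is the next step.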

If you omit --data-source, the task downloads data_source from the platform Files service. Use --data-source for offline local files.

Output layout

| Path | Description |
| --- | --- |
| nss-output/synthetic-data.csv | Generated records |
| nss-output/summary.json | Timing and run summary |
| nss-output/evaluation-report.html | Present when evaluation is enabled |
| nss-output/adapter/ | LoRA adapter directory when synthesis training ran |
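
After a run, a short script can report which of these artifacts were produced. This is an illustrative sketch: `check_outputs` is a hypothetical helper, and it only tests for the paths listed above without inspecting their contents.

```python
from pathlib import Path
from typing import Dict

# Artifact names from the output layout; "dir" marks a directory.
EXPECTED = {
    "synthetic-data.csv": "file",
    "summary.json": "file",
    "evaluation-report.html": "file",  # only when evaluation is enabled
    "adapter": "dir",                  # only when synthesis training ran
}


def check_outputs(output_dir: str) -> Dict[str, bool]:
    """Map each expected artifact name to whether it exists."""
    root = Path(output_dir)
    found = {}
    for name, kind in EXPECTED.items():
        path = root / name
        found[name] = path.is_dir() if kind == "dir" else path.is_file()
    return found
```

A missing evaluation-report.html or adapter/ entry is not necessarily an error, since both are conditional on the job configuration.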

Reuse a prior adapter (generation only)

Adapter reuse always skips training and runs only generation and evaluation. This is the same path as the OSS library’s load_from_save_path().generate().

Run-local

Point config.training.pretrained_model at a prior run’s adapter directory or work tree:

Run 1 — train and write an adapter:

$ uv run nemo safe-synthesizer run-local \
    --spec-file ./job1-spec.json \
    --data-source ./input.csv \
    --output-dir ./nss-output-1

Run 2 — generate more records from that adapter:

{
  "data_source": "default/placeholder#input.csv",
  "config": {
    "enable_synthesis": true,
    "enable_replace_pii": false,
    "training": {
      "pretrained_model": "./nss-output-1/adapter"
    },
    "generation": {
      "num_records": 100
    }
  }
}

The plugin resolves ./nss-output-1/adapter to the prior run under ./nss-output-1/work. The work/ tree must still exist from run 1.

You can also point at ./nss-output-1/work or a specific run directory under it.
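
The run 2 spec can also be generated from a script, which makes it easy to store an absolute adapter path so the spec still works from another working directory. The sketch below is a hypothetical helper, not part of the plugin; it writes the same fields as the example spec above and changes only the path form.

```python
import json
from pathlib import Path
from typing import Any, Dict


def write_reuse_spec(
    adapter_dir: str,
    spec_path: str,
    num_records: int = 100,
    data_source: str = "default/placeholder#input.csv",
) -> Dict[str, Any]:
    """Write a generation-only spec pointing at a prior run's adapter."""
    spec = {
        "data_source": data_source,
        "config": {
            "enable_synthesis": True,
            "enable_replace_pii": False,
            # Resolve to an absolute path so the spec is cwd-independent.
            "training": {"pretrained_model": str(Path(adapter_dir).resolve())},
            "generation": {"num_records": num_records},
        },
    }
    Path(spec_path).write_text(json.dumps(spec, indent=2) + "\n", encoding="utf-8")
    return spec
```

For example, `write_reuse_spec("./nss-output-1/adapter", "./job2-spec.json")` produces a file that can be passed to --spec-file; the name job2-spec.json is only an illustration.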

Platform jobs (pretrained_model_job)

For platform jobs, set pretrained_model_job to a completed job that has an adapter result stored in Files:

{
  "pretrained_model_job": "my-first-synth-job",
  "config": {
    "generation": {
      "num_records": 100
    }
  }
}

Do not set config.training.pretrained_model when using pretrained_model_job.
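
A spec can be checked for this conflict before it is submitted. The following is a minimal pre-flight sketch: `check_adapter_source` is a hypothetical helper, and the plugin's own validation may differ. The error text matches the message listed under Troubleshooting.

```python
from typing import Any, Dict


def check_adapter_source(spec: Dict[str, Any]) -> None:
    """Raise ValueError if both adapter sources are set in one spec."""
    has_job = bool(spec.get("pretrained_model_job"))
    config = spec.get("config") or {}
    training = config.get("training") or {}
    has_path = bool(training.get("pretrained_model"))
    if has_job and has_path:
        raise ValueError(
            "Use either 'pretrained_model_job' or 'config.training.pretrained_model'"
        )
```

Load the spec with `json.load` and call the function before launching the job; it returns silently when at most one of the two fields is set.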

Training runs embed safe-synthesizer-config.json in the adapter artifact uploaded to Files so subsequent generation-only jobs can reload the prior run configuration.

Use an absolute path for local pretrained_model if you run from a different working directory.

Runtime commands

# One-time: create the NSS engine/CUDA venv
$ uv run nemo safe-synthesizer runtime setup

# Inspect configured runtime paths and Python
$ uv run nemo safe-synthesizer runtime info

# Recreate the venv after driver or package changes
$ uv run nemo safe-synthesizer runtime setup --force

Automated tests

Unit tests (no GPU)

From the NeMo Platform repository root (the test path below is relative to it):

$ uv run pytest plugins/nemo-safe-synthesizer/tests/unit/test_local_run.py -v

Opt-in host-local E2E (GPU)

$ cd /path/to/nemo-platform
$ RUN_NSS_LOCAL_E2E=1 uv run pytest \
    plugins/nemo-safe-synthesizer/tests/e2e/test_local_synthesis.py \
    -v -m e2e

Optional: NSS_LOCAL_E2E_TIMEOUT_SECONDS (default 3600).

Requires RUN_NSS_LOCAL_E2E=1, CUDA, and nemo safe-synthesizer runtime setup.
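
The two environment variables could be read along the following lines. This is only a sketch of the opt-in pattern, assuming the gate is an exact match on "1" and the timeout is an integer number of seconds. The function names are hypothetical and the real test module may read the variables differently.

```python
import os
from typing import Mapping

DEFAULT_TIMEOUT_SECONDS = 3600  # default stated above


def e2e_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """True only when the opt-in flag is set to exactly '1' (assumed)."""
    return env.get("RUN_NSS_LOCAL_E2E") == "1"


def e2e_timeout_seconds(env: Mapping[str, str] = os.environ) -> int:
    """Return the configured timeout, or the default when unset or empty."""
    raw = env.get("NSS_LOCAL_E2E_TIMEOUT_SECONDS", "").strip()
    return int(raw) if raw else DEFAULT_TIMEOUT_SECONDS
```

Taking the environment as a parameter keeps both functions easy to exercise without modifying the real process environment.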

Troubleshooting

| Symptom | Check |
| --- | --- |
| run-local not in nemo safe-synthesizer --help | BOOTSTRAP_LOCAL_PLUGIN_DIRS=plugins/nemo-safe-synthesizer make bootstrap-python; no duplicate top-level generated safe-synthesizer CLI |
| runtime setup / CUDA errors | uv run nemo safe-synthesizer runtime info and nvidia-smi |
| Model download failures | Hugging Face access from the NSS runtime venv; network and disk space |
| Reuse run fails to load adapter | Prior run’s work/ tree still exists (run-local), or adapter artifact in Files includes metadata_v2.json and embedded safe-synthesizer-config.json (platform) |
| Use either 'pretrained_model_job' or 'config.training.pretrained_model' | For run-local-only workflows, use only config.training.pretrained_model |

  • Getting Started: GPU and local runtime prerequisites
  • Plugin README: plugins/nemo-safe-synthesizer/README.md