Getting Started with NeMo Safe Synthesizer

Use NeMo Safe Synthesizer to generate private synthetic versions of sensitive tabular datasets on a host GPU.

Prerequisites

Before using NeMo Safe Synthesizer, complete Setup to install the CLI/SDK.

NeMo Safe Synthesizer has the following additional requirements:

An NVIDIA GPU on the host machine with at least 80 GB of VRAM (check with nvidia-smi). This is separate from any GPU inside a NIM container; Safe Synthesizer training runs directly on the host.
Sufficient disk space for generated datasets (50 GB or more recommended).
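
The two requirements above can be checked with a short script. The following is a minimal sketch, not an official tool: the function names and the 80000 MiB threshold are assumptions for illustration (nvidia-smi reports memory in MiB, and cards marketed as 80 GB report slightly different totals), and the disk check looks only at the filesystem that holds the current directory.

```shell
#!/usr/bin/env bash
# Sketch: report whether the host appears to meet the GPU memory and disk guidance.
# Both thresholds are assumptions chosen for illustration.
MIN_VRAM_MIB=80000   # nvidia-smi reports MiB; "80 GB" cards vary slightly
MIN_DISK_GIB=50

# Succeeds when the first integer is at least the second.
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Print the largest per-GPU total memory in MiB, or 0 if it cannot be read.
max_vram_mib() {
  local value=""
  if command -v nvidia-smi >/dev/null 2>&1; then
    value=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null \
      | sort -n | tail -n 1 | tr -d '[:space:]')
  fi
  # Fall back to 0 when the output is empty or not a plain integer.
  case "$value" in
    ''|*[!0-9]*) value=0 ;;
  esac
  echo "$value"
}

# Print free space in GiB (rounded down) for the filesystem holding the given path.
free_disk_gib() {
  df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# Print an OK or WARN line comparing a measured value with its minimum.
report() {
  local label="$1" actual="$2" minimum="$3" unit="$4"
  if meets_minimum "$actual" "$minimum"; then
    echo "OK:   $label is $actual $unit (guidance: $minimum+)"
  else
    echo "WARN: $label is $actual $unit (guidance: $minimum+)"
  fi
}

report "GPU memory" "$(max_vram_mib)" "$MIN_VRAM_MIB" "MiB"
report "Free disk" "$(free_disk_gib .)" "$MIN_DISK_GIB" "GiB"
```

The script only prints warnings and never exits with an error, so it can be run safely on a machine without a GPU. Pass a different path to free_disk_gib if generated datasets will be written to another volume.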

For general platform troubleshooting (port conflicts, health checks, and so on), refer to Setup.

During local startup, nemo setup pre-configures a model provider named default/nvidia-build. This provider routes inference requests to models hosted on build.nvidia.com, using the API base URL https://integrate.api.nvidia.com and the NGC API key (with Public API Endpoints permissions) that you provided during deployment.

You can verify this provider exists by running nemo inference providers list --workspace default.
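
The check can also be scripted. This sketch assumes that the list output contains the provider name as plain text, which is not confirmed here; the has_provider helper is hypothetical and simply searches whatever the command prints.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the pre-configured provider shows up in the provider list.
# Assumption: the list output contains the provider name as plain text.
PROVIDER_NAME="nvidia-build"

# Succeeds when the given name appears anywhere in the text read from stdin.
has_provider() {
  grep -qF -- "$1"
}

if command -v nemo >/dev/null 2>&1; then
  if nemo inference providers list --workspace default | has_provider "$PROVIDER_NAME"; then
    echo "Provider $PROVIDER_NAME found in workspace default"
  else
    echo "Provider $PROVIDER_NAME not found; refer to Setup" >&2
  fi
else
  echo "nemo CLI not found on PATH; complete Setup first" >&2
fi
```

If the command supports a structured output format, parsing that would be more reliable than a text search.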

The tutorials in these docs use this provider for inference, but you can create your own provider and use it instead.

Host-local CLI

For GPU development on your machine, install the Safe Synthesizer plugin from this repository and use nemo safe-synthesizer run-local (see Local and Subprocess Execution):

$ BOOTSTRAP_LOCAL_PLUGIN_DIRS=plugins/nemo-safe-synthesizer make bootstrap-python
$ uv run nemo safe-synthesizer runtime setup
$ uv run nemo safe-synthesizer run-local \
>   --spec-file ./nss-job.json \
>   --data-source ./input.csv \
>   --output-dir ./nss-output

The run-local command launches the Safe Synthesizer task in a separate runtime Python subprocess. The nemo safe-synthesizer CLI currently exposes only the run-local and runtime commands; to submit jobs to the platform, use the Jobs API or the SDK.
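
Because run-local starts a separate subprocess, it can help to catch missing inputs before launching it. The wrapper below is a minimal sketch: the preflight function is hypothetical, the file names are the ones used in the example above, and creating the output directory ahead of time is an assumption rather than a documented requirement.

```shell
#!/usr/bin/env bash
# Sketch: validate the local inputs before calling run-local.
# The checks are illustrative and not part of the nemo CLI.

# Succeeds when both input files exist; creates the output directory if needed.
preflight() {
  local spec_file="$1" data_source="$2" output_dir="$3"
  if [ ! -f "$spec_file" ]; then
    echo "Spec file not found: $spec_file" >&2
    return 1
  fi
  if [ ! -f "$data_source" ]; then
    echo "Data source not found: $data_source" >&2
    return 1
  fi
  mkdir -p "$output_dir"
}

# Run the job only when the inputs are in place.
if preflight ./nss-job.json ./input.csv ./nss-output; then
  uv run nemo safe-synthesizer run-local \
    --spec-file ./nss-job.json \
    --data-source ./input.csv \
    --output-dir ./nss-output
fi
```

The wrapper does not inspect the contents of the spec file; it only confirms that the paths exist.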

Next Steps

Create your first synthetic dataset:

Safe Synthesizer 101 Tutorial - a beginner-friendly introduction
Local and Subprocess Execution - local CLI and runtime task details
SDK Resources - Python SDK methods for jobs, builders, logs, and results