Getting Started with NeMo Safe Synthesizer
Use NeMo Safe Synthesizer to generate private synthetic versions of sensitive tabular datasets on a host GPU.
Prerequisites
Before using NeMo Safe Synthesizer, complete Setup to install the CLI/SDK.
NeMo Safe Synthesizer has the following additional requirements:
- An NVIDIA GPU on the host machine with 80GB+ VRAM (check with nvidia-smi). This is separate from any GPU inside a NIM container; Safe Synthesizer training runs directly on the host.
- Sufficient disk space for generated datasets (50GB+ recommended)
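The two hardware requirements above can be checked with a short script before you start. The following is a minimal sketch, not part of Safe Synthesizer: the function names and the thresholds MIN_GPU_MIB and MIN_FREE_DISK_GB are assumptions chosen for illustration. It relies on the nvidia-smi query options --query-gpu=memory.total and --format=csv,noheader,nounits, which report total memory in MiB, one GPU per line.

```python
import shutil
import subprocess

# Assumed thresholds, loosely based on the requirements listed above.
# An "80GB" card usually reports a value near, but not exactly,
# 80 * 1024 MiB, so the GPU cutoff is deliberately approximate.
MIN_GPU_MIB = 80_000
MIN_FREE_DISK_GB = 50


def parse_gpu_memory_mib(csv_text: str) -> list[int]:
    """Parse one memory.total value (MiB) per line, skipping non-numeric lines."""
    values = []
    for line in csv_text.splitlines():
        token = line.strip()
        if token.isdigit():
            values.append(int(token))
    return values


def query_gpu_memory_mib() -> list[int]:
    """Return total memory in MiB per GPU, or [] if nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory_mib(result.stdout)


def free_disk_gb(path: str = ".") -> float:
    """Free space, in decimal gigabytes, on the filesystem containing path."""
    return shutil.disk_usage(path).free / 1e9


def host_meets_requirements(gpu_mib: list[int], free_gb: float) -> bool:
    """True if at least one GPU and the free disk space clear the assumed thresholds."""
    has_large_gpu = any(mib >= MIN_GPU_MIB for mib in gpu_mib)
    return has_large_gpu and free_gb >= MIN_FREE_DISK_GB
```

To use it, call host_meets_requirements(query_gpu_memory_mib(), free_disk_gb("/path/to/output")), where the path is wherever you plan to write generated datasets. A False result only means the host falls short of these illustrative thresholds; adjust them if your GPU reports its memory differently.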
For general platform troubleshooting (port conflicts, health checks, and so on), refer to Setup.
nemo setup pre-configures a default/nvidia-build model provider during local startup.
This provider routes inference requests to models hosted on build.nvidia.com using the API base URL https://integrate.api.nvidia.com
and the NGC API key with Public API Endpoints permissions provided during deployment.
You can verify this provider exists by running nemo inference providers list --workspace default.
The tutorials in these docs use this provider for inference, but you can create and use your own provider instead.
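If you want to script the provider check, the listing command can be wrapped in a few lines of Python. This is a sketch under stated assumptions: the helper names are hypothetical, and because the output format of nemo inference providers list is not documented here, the check only looks for the provider name nvidia-build as a substring of the listing.

```python
import shutil
import subprocess

# Provider name as given in the text above (default/nvidia-build).
PROVIDER_NAME = "nvidia-build"


def provider_listed(cli_output: str, name: str = PROVIDER_NAME) -> bool:
    """Assumption: the provider name appears verbatim on some line of the listing."""
    return any(name in line for line in cli_output.splitlines())


def list_providers(workspace: str = "default") -> str:
    """Run the listing command from the text and return its standard output."""
    if shutil.which("nemo") is None:
        raise RuntimeError("nemo CLI not found on PATH; complete Setup first")
    result = subprocess.run(
        ["nemo", "inference", "providers", "list", "--workspace", workspace],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling provider_listed(list_providers()) returns True when the name shows up in the listing. A substring match is intentionally loose; if the CLI offers structured output, parsing that would be more reliable.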
Host-local CLI
For GPU development on your machine, install the Safe Synthesizer plugin from this repository and use nemo safe-synthesizer run-local (see Local and Subprocess Execution).
The run-local command launches the Safe Synthesizer task in a separate Python runtime subprocess. The nemo safe-synthesizer CLI currently exposes only the run-local and runtime commands; to submit platform jobs, use the Jobs API or the SDK.
Next Steps
Create your first synthetic dataset:
- Safe Synthesizer 101 Tutorial - a beginner-friendly introduction
- Local and Subprocess Execution - local CLI and runtime task details
- SDK Resources - Python SDK methods for jobs, builders, logs, and results