Parameters Reference
This page summarizes the main configuration groups available when creating NeMo Safe Synthesizer jobs. For generated REST API schema details, see the Safe Synthesizer API Reference.
Job Spec (Plugin / REST)
Top-level fields on the Safe Synthesizer job spec (alongside config):
For host-local runs, see Local and Subprocess Execution. Reuse a local adapter with config.training.pretrained_model, not pretrained_model_job.
Top-Level Configuration
The SafeSynthesizerParameters schema defines the main configuration structure for Safe Synthesizer jobs.
SafeSynthesizerParameters
All fields are optional at the top level. For nested field constraints, see the Safe Synthesizer API Reference and search for the schema name in the Type column.
Data Parameters
Configuration for shaping and using the input data, including grouping, ordering, and holdout settings.
Training Parameters
Hyperparameters for model fine-tuning, including learning rate, batch size, and LoRA configuration.
Generation Parameters
Configuration for synthetic data generation after training, including number of records, temperature, and structured generation options.
Differential Privacy Parameters
Hyperparameters for differential privacy during training using DP-SGD. Enable these for formal privacy guarantees.
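To make clear what DP-SGD hyperparameters typically control, here is a minimal, generic sketch of one DP-SGD update in plain Python: each per-example gradient is clipped to a maximum L2 norm, the clipped gradients are summed, Gaussian noise scaled to that norm is added, and the result is averaged. This is an illustration of the general technique, not Safe Synthesizer's implementation. The names `dp_sgd_step`, `clip_gradient`, `max_norm`, and `noise_multiplier` are assumptions chosen for this example and may not match the actual parameter names.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """Apply one DP-SGD update (hypothetical helper, for illustration only)."""
    # 1. Bound each example's influence by clipping its gradient.
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    batch_size = len(clipped)
    new_params = []
    for i, p in enumerate(params):
        # 2. Sum the clipped gradients for this coordinate.
        summed = sum(g[i] for g in clipped)
        # 3. Add Gaussian noise with standard deviation noise_multiplier * max_norm.
        noise = rng.gauss(0.0, noise_multiplier * max_norm)
        # 4. Average over the batch and take a gradient step.
        new_params.append(p - lr * (summed + noise) / batch_size)
    return new_params


rng = random.Random(0)  # fixed seed so the noise is reproducible
updated = dp_sgd_step(
    params=[0.0, 0.0],
    per_example_grads=[[3.0, 4.0], [0.3, 0.4]],
    lr=0.1,
    max_norm=1.0,
    noise_multiplier=1.0,
    rng=rng,
)
print(updated)
```

In this sketch, a smaller `max_norm` or a larger `noise_multiplier` strengthens the privacy guarantee at the cost of a noisier training signal.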
Evaluation Parameters
Configuration for synthetic data quality and privacy assessment, including membership inference attack (MIA), attribute inference attack (AIA), and PII replay detection.
PII Replacement Configuration
Configuration for PII detection and replacement. See pii-replacement for conceptual documentation.
Column Classification Config (replace_pii.globals.classify)
Column classification is configured via the SDK builder’s .with_classify_model_provider(provider_name) method. The provider name can be unqualified, in which case the builder prepends the current workspace, or fully qualified as workspace/provider_name.
If omitted, column classification is skipped and PII detection falls back to heuristic defaults, which may reduce accuracy.
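The provider-name rule described above can be sketched as a small standalone function. This is only an illustration of the described behavior, not the SDK's code: `qualify_provider_name` is a hypothetical helper, and the workspace and provider names are made up for the example.

```python
def qualify_provider_name(provider_name: str, current_workspace: str) -> str:
    """Return the provider name in workspace/provider_name form.

    Hypothetical helper that mirrors the documented rule; the real
    builder may validate or normalize names differently.
    """
    if "/" in provider_name:
        # Already fully qualified: keep it unchanged.
        return provider_name
    # Unqualified: prepend the current workspace.
    return f"{current_workspace}/{provider_name}"


print(qualify_provider_name("my-provider", "default"))         # default/my-provider
print(qualify_provider_name("shared/my-provider", "default"))  # shared/my-provider
```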
Example Configuration
Here’s an example showing a complete job configuration using the Python SDK:
Related Topics
- data-synthesis - Learn about synthesis concepts
- evaluation - Learn about evaluation metrics
- index - Hands-on tutorials