Evaluation Tutorials
Use these tutorials to learn how to run evaluations with NeMo Platform.
Before You Start
Set up a local instance of the platform for the following tutorials.
Run an LLM Judge Eval
Learn how to evaluate a fine-tuned model using the LLM Judge metric with a custom dataset.
intermediate custom evaluation llm judge nemo-evaluator
Define and Run Custom Python Metrics
Learn how to write a domain-specific Python metric, test it locally, and run it through the Evaluator service.
intermediate custom metric remote execution nemo-evaluator
How It Works
For the conceptual overview of how Evaluator separates definition (library) from execution (platform), see About Evaluating → How It Works. For runnable SDK examples, see SDK Resources.