Adding Safety Checks to Multimodal Data
Use the NeMo Guardrails API with vision-capable models to perform safety checks on image content. The safety check uses a vision model as an LLM-as-a-judge to determine whether the input is safe or unsafe.
The tutorial uses the Meta Llama 3.2 90B Vision Instruct model for the main LLM and as the judge model. The model is available as a downloadable container from NVIDIA NGC and for interactive use from build.nvidia.com.
About Multimodal Data
You can configure guardrails with multimodal data and vision reasoning models to perform safety checks on image data. You can apply the safety check to either input or output rails. The image reasoning model acts as an LLM-as-a-judge to classify content as safe or unsafe.
The OpenAI, Llama Vision, and Llama Guard models can accept multimodal input and act as a judge model. Depending on the image reasoning model, you can specify the image to check as a base64 encoded data or as a URL.
Prerequisites
Before you begin:
- You have access to a running NeMo Platform.
NMP_BASE_URLis set to the NeMo Platform base URL.- A
ModelProvideris configured with an LLM provider. Follow Setup if you haven’t done this yet.
This tutorial uses the following NIM, available on build.nvidia.com:
mainmodel:meta/llama-3.2-90b-vision-instruct
Step 1: Configure the Client
Instantiate the platform client.
Step 2: Create a Guardrail Configuration
Create a guardrail configuration that uses the vision model for content safety checks. This example applies the safety check as part of the input rails.
Step 3: Create a VirtualModel
Create a VirtualModel that routes inference through the guardrails middleware. Since multimodal safety uses input rails only, only request_middleware is needed.
CLI
Python SDK
Step 4: Verify Allowed Content
Send a safe request that includes a base64-encoded image and confirm you receive a non-blocked response.
Download an image of a street scene with traffic signs. You can use street-scene.jpg from the tutorial assets, or source a similar image from https://commons.wikimedia.org/wiki/Main_Page.

Example Response
Step 5: Verify Blocked Content
Send an unsafe request and confirm you receive a blocked response.
Download an image depicting car audio theft. You can use car-audio-theft.jpg from the tutorial assets.
