Fiddler Guardrails
Enforce Policies Within Your Own Secure Boundaries
Fiddler Guardrails is the industry’s fastest policy enforcement layer for agentic AI.
Define the safety, security, and compliance policies that govern your agents, and Fiddler enforces them at runtime in under 80ms, powered by purpose-built, fine-tuned, and task-specific Fiddler Centor Models (formerly Fiddler Trust Models). Every prompt and response is evaluated against your thresholds for hallucinations, jailbreaks, PII exposure, and unsafe content, and intercepted before it reaches the user or downstream system. You stay in complete control of how your agents behave.
Fiddler Guardrails Delivers Industry-Leading Policy Enforcement
Enforce Policies Across Safety, Security, and Privacy
Fiddler Guardrails is powered by the Fiddler Centor Models, purpose-built models that evaluate agent and LLM inputs and outputs in real time and run entirely in your cloud and VPC environments. These "batteries-included" models remove the unpredictable, hidden costs of external API calls for agent and LLM evaluation. Because everything runs in your infrastructure, you get enterprise-grade protection, policy enforcement, and complete data privacy with full control over your environment.
These Centor Models, including Safety, PII, and Faithfulness models, help power Fiddler's AI Observability and Security solutions: Guardrails, LLM Observability, and Agentic Observability.
How Fiddler Guardrails Works
1. Define Your Policies
Choose the safety, security, privacy, and accuracy policies your agents need to follow, including thresholds and the specific behaviors you want to allow or block.
Simply write three to five lines of code to initialize your HTTP client.
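As an illustration of what a policy definition can look like in code, here is a minimal Python sketch. The Policy class, the metric names, and the threshold values are all hypothetical and are not part of the Fiddler API; the sketch also assumes every score runs from 0 to 1 with higher meaning riskier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """One guardrail policy: block when the score for `metric` reaches `threshold`."""
    metric: str       # hypothetical metric name, e.g. "jailbreak"
    threshold: float  # scores at or above this value violate the policy


# Illustrative policy set; these thresholds are assumptions, not recommended values.
POLICIES = [
    Policy("jailbreak", 0.5),
    Policy("pii", 0.8),
    Policy("toxicity", 0.7),
]


def violations(scores: dict, policies: list) -> list:
    """Return the metrics whose score meets or exceeds its policy threshold."""
    return [p.metric for p in policies if scores.get(p.metric, 0.0) >= p.threshold]


def decide(scores: dict, policies: list = POLICIES) -> str:
    """Map a set of scores to a single allow/block decision."""
    return "block" if violations(scores, policies) else "allow"
```

Keeping thresholds in data rather than in branching logic makes it easier to tighten or relax a policy without touching the enforcement code.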
2. Connect to Fiddler Centor Models
Call the Fiddler Guardrails API to enforce your policies at runtime, with under 80ms latency and full evaluation inside your environment.
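The call itself can be sketched with nothing but the Python standard library. The endpoint URL, token, and JSON payload shape below are placeholders chosen for illustration; take the real values and request schema from your own deployment and the Fiddler API documentation.

```python
import json
import urllib.request

# Placeholder values (assumptions): substitute your deployment's endpoint and token.
GUARDRAILS_URL = "https://fiddler.example.internal/guardrails/safety"
API_TOKEN = "YOUR_TOKEN"


def build_request(text: str) -> urllib.request.Request:
    """Build a POST request carrying the text to evaluate."""
    # The payload shape is an assumed schema, not the documented one.
    payload = json.dumps({"data": {"input": text}}).encode("utf-8")
    return urllib.request.Request(
        GUARDRAILS_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def check(text: str, timeout_s: float = 1.0) -> dict:
    """Send the request and return the parsed JSON body (shape is deployment-specific)."""
    with urllib.request.urlopen(build_request(text), timeout=timeout_s) as resp:
        return json.load(resp)
```

Separating build_request from check keeps the network call in one place, so the request can be inspected or logged without contacting the service.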
3. Choose a Framework
Access our API using pre-built code examples in NodeJS, Python, cURL, or any HTTP library of your choice.
Take advantage of our out-of-the-box integrations with NVIDIA NeMo Guardrails.
Frequently Asked Questions About Guardrails
What is a guardrail in AI?
A guardrail in AI is a real-time policy enforcement mechanism that governs how AI systems behave. Guardrails evaluate every input and output against the policies an organization has defined for safety, security, privacy, and accuracy, and block or intercept anything that falls outside those policies. For enterprise LLM and agentic applications, guardrails are the runtime control layer that keeps AI within defined ethical, legal, and compliance boundaries.
What are the 3 general types of guardrails?
The three general types of AI guardrails are best understood as policy categories that enterprises enforce on their AI systems:
- Content Safety Policies: Prevent toxic, harmful, or inappropriate content generation.
- Security Policies: Protect AI systems from adversarial attacks such as prompt injections or jailbreaks.
- Accuracy Policies: Ensure outputs are accurate, reliable, and contextually relevant, preventing hallucinations or misinformation.
Together, these policies form a comprehensive framework for runtime AI governance, responsible AI, and LLM monitoring.
What is a guardrail in programming?
In programming, a guardrail is a control or safeguard built into software to prevent errors, security vulnerabilities, or misuse. In the context of AI and LLMs, programming guardrails often take the form of runtime checks, validation layers, and filters that regulate model inputs and outputs to uphold system integrity and compliance.
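A runtime check of this kind can be sketched as a wrapper around the model call. This is a generic Python illustration, not Fiddler's implementation: the guarded helper and Blocked exception are hypothetical names, and the regular expression is a crude stand-in for a real PII detector.

```python
import re

# Crude stand-in for a PII detector: matches simple email addresses only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class Blocked(Exception):
    """Raised when an input or output fails a check."""


def no_email(text: str) -> bool:
    """Return True when the text contains no email address."""
    return EMAIL.search(text) is None


def guarded(generate, checks):
    """Wrap `generate` so every prompt and response must pass all checks."""
    def wrapper(prompt: str) -> str:
        if not all(check(prompt) for check in checks):
            raise Blocked("input rejected")
        response = generate(prompt)
        if not all(check(response) for check in checks):
            raise Blocked("output rejected")
        return response
    return wrapper
```

Wrapping the call, rather than editing the model code, means the same checks apply to any function that takes a prompt and returns text.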
What are guardrail metrics for LLM?
Guardrail metrics for LLMs are measurable indicators of the safety, reliability, and security of large language model outputs. Common metrics include detection rates for:
- Jailbreak attempts that try to bypass restrictions.
- Toxicity and harmful content levels.
- Faithfulness and groundedness to source data.
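A detection rate can be computed from a labeled evaluation set. The Python sketch below assumes you have, for each example, a ground-truth label (True if it really is a jailbreak, toxic output, or similar) and the guardrail's decision (True if it was flagged); the function names are illustrative.

```python
def detection_rate(labels, flagged):
    """Fraction of truly harmful examples that the guardrail flagged (recall)."""
    hits = [f for label, f in zip(labels, flagged) if label]
    return sum(hits) / len(hits) if hits else 0.0


def false_positive_rate(labels, flagged):
    """Fraction of benign examples that the guardrail flagged by mistake."""
    alarms = [f for label, f in zip(labels, flagged) if not label]
    return sum(alarms) / len(alarms) if alarms else 0.0
```

Reporting the false positive rate alongside the detection rate shows whether a high detection rate comes at the cost of blocking legitimate traffic.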

