Guardrails help detect unsafe, abusive, harmful, or non-compliant content in agent inputs and outputs. Guardrails support:
  • Input and output safety enforcement
  • Prompt injection detection
  • PII protection and redaction
  • Provider-based moderation
  • Runtime enforcement controls
  • Streaming response protection
  • Centralized policy management
Depending on configuration, guardrails can block content, warn users, redact sensitive information, escalate interactions, request rephrasing, or automatically sanitize responses.

Typical Runtime Flow

┌────────────┐
│ User Input │
└──────┬─────┘
       ▼
┌────────────────────┐
│ Input Guardrails   │
│ • PII detection    │
│ • Prompt injection │
│ • Topic checks     │
└──────┬─────────────┘
       ▼
┌────────────────────┐
│ Agent / Model      │
│ Processing         │
└──────┬─────────────┘
       ▼
┌────────────────────┐
│ Output Guardrails  │
│ • Toxicity checks  │
│ • PII redaction    │
│ • Content filtering│
└──────┬─────────────┘
       ▼
┌────────────────────┐
│ Final Response     │
│ Returned to User   │
└────────────────────┘
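
The flow above can be sketched as a small pipeline. This is an illustrative Python sketch only, not the platform's implementation: `run_with_guardrails`, `GuardrailBlocked`, `block_injection`, `redact_digits`, and `echo_model` are hypothetical names chosen for the example.

```python
from typing import Callable, List


class GuardrailBlocked(Exception):
    """Raised when a (hypothetical) guardrail check rejects content."""


# A check takes text and returns the (possibly modified) text, or raises.
Check = Callable[[str], str]


def run_with_guardrails(user_input: str,
                        input_checks: List[Check],
                        model: Callable[[str], str],
                        output_checks: List[Check]) -> str:
    text = user_input
    for check in input_checks:      # input guardrails run before the model
        text = check(text)
    response = model(text)          # agent / model processing
    for check in output_checks:     # output guardrails run before returning
        response = check(response)
    return response


def block_injection(text: str) -> str:
    # Toy input check: block one obvious prompt injection phrase.
    if "ignore previous instructions" in text.lower():
        raise GuardrailBlocked("prompt injection suspected")
    return text


def redact_digits(text: str) -> str:
    # Toy output check: mask every digit.
    return "".join("#" if c.isdigit() else c for c in text)


def echo_model(prompt: str) -> str:
    # Stand-in for the agent or model.
    return f"You said: {prompt}"


print(run_with_guardrails("call 555", [block_injection], echo_model, [redact_digits]))
```

In this sketch a blocked input never reaches the model, which mirrors the ordering shown in the diagram.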

Guardrail Configuration Levels

Guardrails can be configured at:
  • The project level using centralized guardrail policies
  • The agent level using agent-specific guardrails

Project Guardrails vs. Agent Guardrails

Project-level policies apply in addition to agent-specific guardrails.
| Scope | Purpose | Typical usage |
| --- | --- | --- |
| Project guardrails | Centralized governance and reusable safety policies | Enterprise-wide safety enforcement across agents |
| Agent guardrails | Agent-specific runtime safety checks | Localized rules for individual agents |

Project Guardrails

Project guardrails are managed from: Govern > Guardrails. Project guardrails provide:
  • Reusable safety policies across agents
  • Centralized provider management
  • Runtime execution settings
  • Streaming response enforcement
  • Cross-agent governance controls
Use project guardrails when you want:
  • Consistent governance across multiple agents
  • Shared moderation providers
  • Organization-wide safety controls
  • Centralized runtime management

Agent Guardrails

Agent guardrails are configured directly within an agent and are managed from: Agent > Guardrails. Agent guardrails provide:
  • Agent-specific safety checks
  • Runtime rule configuration
  • Input/output rule behavior
  • Rule-level actions and messages
Use agent guardrails when:
  • Safety rules are specific to one agent
  • Runtime behavior must be customized locally
  • Shared project-level governance is not required

Understand DSL and UI Mapping

The platform maintains a one-to-one mapping between the UI configuration and the DSL/ABL definition. This allows you to:
  • Configure guardrails visually
  • Manage guardrails as code
  • Version and compare configuration changes
  • Switch between UI and DSL-based editing workflows
For example, when you add a guardrail rule in the UI, the platform generates the corresponding GUARDRAILS: block in the DSL/ABL. Similarly, updating the GUARDRAILS: block directly in the DSL/ABL updates the same rule configuration in the UI. For detailed guardrail syntax, runtime semantics, and advanced ABL examples, see the Guardrails section in the ABL Reference Guide.

Policy Scopes

Guardrail policies can be applied at different scopes:

Project-Level Scope

Apply the policy to all agents in the project.
{
  "scopeType": "project"
}

Agent-Level Scope

Apply the policy only to a specific agent.
{
  "scopeType": "agent",
  "agentDefId": "agent-definition-id"
}
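
The two scope payloads can be built and checked with a small helper. This is a minimal sketch: only the `scopeType` and `agentDefId` fields come from the examples above, while the `build_scope` function and its validation rules are assumptions made for illustration.

```python
import json
from typing import Optional


def build_scope(scope_type: str, agent_def_id: Optional[str] = None) -> dict:
    """Build a scope payload shaped like the examples above (hypothetical helper)."""
    if scope_type == "project":
        return {"scopeType": "project"}
    if scope_type == "agent":
        # Assumption: an agent scope is only meaningful with an agent definition ID.
        if not agent_def_id:
            raise ValueError("agent scope requires agentDefId")
        return {"scopeType": "agent", "agentDefId": agent_def_id}
    raise ValueError(f"unknown scopeType: {scope_type!r}")


print(json.dumps(build_scope("agent", "agent-definition-id"), indent=2))
```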

Guardrail Policies

Policies are reusable governance containers that define runtime safety behavior across agents and projects. Policies can be applied at:
  • Project level
  • Agent level
To manage policies, go to Govern > Guardrails > Policies. Policies contain one or more rules. Each rule defines:
  • What to evaluate
  • Where to evaluate it
  • Which provider to use
  • What action to take when triggered
Rules can support:
  • Input and output evaluation
  • Streaming responses
  • Pattern matching
  • Model-based moderation
  • LLM-based classification

Create a Guardrail Policy

  1. Go to Govern > Guardrails.
  2. On the Policies tab, click Create policy.
  3. Enter Policy name and Description.
  4. Select whether the policy applies to all agents in the project or only to a specific agent.
  5. Configure the required rules and runtime settings.
  6. Save the policy.

Rules

| Field | Description |
| --- | --- |
| Applies To | Select where the rule is evaluated: Input, Output, or Both. |
| Action | Select what happens when the rule is triggered, such as Block, Warn, Redact, Escalate, Fix, Reask, or Filter. |
| Provider | Select the provider used for guardrail evaluation. |
| Category | Define the safety or content category evaluated by the rule. |
| Severity Threshold | Set the threshold level used to trigger the configured action. |
| Action Message | Enter the message shown or logged when the rule is triggered. |
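
How these fields interact can be sketched as follows. This is a conceptual illustration, not the platform's evaluation logic: the `Rule` class and `evaluate_rule` function are hypothetical, and the sketch assumes provider scores are numeric values between 0.0 and 1.0 and that a rule triggers when the score meets or exceeds the threshold.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Rule:
    applies_to: str            # "input", "output", or "both"
    action: str                # e.g. "block", "warn", "redact"
    category: str              # category the provider scores
    severity_threshold: float  # assumption: scores are normalized to 0.0-1.0
    action_message: str


def evaluate_rule(rule: Rule, direction: str,
                  scores: Dict[str, float]) -> Optional[Tuple[str, str]]:
    """Return (action, message) if the rule triggers, else None."""
    if rule.applies_to not in (direction, "both"):
        return None  # rule is not evaluated for this direction
    score = scores.get(rule.category, 0.0)
    if score >= rule.severity_threshold:
        return (rule.action, rule.action_message)
    return None


rule = Rule("input", "block", "toxicity", 0.8, "Message blocked by policy.")
print(evaluate_rule(rule, "input", {"toxicity": 0.9}))
```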

Runtime Settings

| Setting | Description |
| --- | --- |
| Fail Mode | Controls whether execution continues or is blocked if guardrail evaluation fails. Fail-open allows execution to continue if guardrail evaluation fails or times out. Fail-closed blocks execution when guardrail evaluation cannot be completed successfully. Use fail-closed behavior for high-security or compliance-sensitive applications. |
| Local Timeout | Defines how long the platform waits for local guardrail evaluation. |
| Model Timeout | Defines how long the platform waits for model-based provider evaluation. |
| LLM Timeout | Defines how long the platform waits for LLM-based evaluation. |
| Streaming Evaluation | Enables guardrail evaluation while responses are streamed. |
| Chunk Interval | Defines whether streamed responses are evaluated by sentence, token, or chunk size. |
| Early Termination | Stops evaluation on the first guardrail trigger. |

Only one policy can be active per project at a time, and activating a new policy automatically deactivates the previously active policy.
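
The difference between fail-open and fail-closed behavior can be illustrated with a timeout wrapper. This is a sketch under stated assumptions, not the platform's runtime: `evaluate_with_fail_mode`, `slow_check`, and the `"open"`/`"closed"` strings are hypothetical names, and the check is assumed to return `True` when content may proceed.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def evaluate_with_fail_mode(check, text, timeout_s, fail_mode):
    """Return True if content may proceed, False if it must be blocked."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(check, text)
    try:
        # Normal path: the check finished in time and gave a verdict.
        return future.result(timeout=timeout_s)
    except Exception:
        # The check timed out or raised: fail-open lets content through,
        # fail-closed blocks it.
        return fail_mode == "open"
    finally:
        # Do not wait for a slow check to finish before returning.
        pool.shutdown(wait=False)


def slow_check(text):
    # Stand-in for a provider that responds too slowly.
    time.sleep(0.3)
    return True


print(evaluate_with_fail_mode(slow_check, "hello", 0.05, "open"))
print(evaluate_with_fail_mode(slow_check, "hello", 0.05, "closed"))
```

With the same slow check, the fail-open call allows the content and the fail-closed call blocks it, which is why fail-closed suits compliance-sensitive applications.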

Custom Guardrail Policies

Custom guardrail policies provide centralized, organization-wide safety enforcement across agents and projects, with reusable rules, provider-based moderation, and scoped runtime enforcement. Custom guardrail policies support:
  • Project-level and agent-level scopes
  • Streaming guardrails
  • Budget controls
  • Constitution principles
  • External moderation providers
When configured budgets are exceeded, guardrails can fall back to pattern-based checks. For API payloads, policy schemas, and advanced configuration examples, see the Guardrail Policy API Reference in the ABL Reference Guide.
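
The budget fallback described above can be pictured with the sketch below. It is illustrative only: the `BudgetedGuardrail` class is hypothetical, and it assumes the budget is counted in provider calls, which the platform may measure differently (for example, in cost or tokens).

```python
import re


class BudgetedGuardrail:
    """Use a provider check until the budget is spent, then fall back to patterns."""

    def __init__(self, budget_calls, provider_check, fallback_patterns):
        self.remaining = budget_calls          # assumption: budget counted in calls
        self.provider_check = provider_check   # callable: text -> bool (flagged?)
        self.patterns = [re.compile(p, re.IGNORECASE) for p in fallback_patterns]

    def is_flagged(self, text):
        if self.remaining > 0:
            self.remaining -= 1
            return self.provider_check(text)
        # Budget exceeded: cheaper pattern-based check only.
        return any(p.search(text) for p in self.patterns)


def toy_provider(text):
    # Stand-in for an external moderation provider.
    return "bad" in text


guard = BudgetedGuardrail(1, toy_provider, [r"\bforbidden\b"])
print(guard.is_flagged("bad text"))        # evaluated by the provider
print(guard.is_flagged("forbidden word"))  # evaluated by the fallback patterns
```

Pattern-based fallback is cheaper but usually less precise, so content the provider would flag can pass once the budget is spent.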

Guardrail Providers

Providers are evaluation engines used to classify or inspect content during runtime. Providers can:
  • Detect unsafe content
  • Identify PII
  • Classify toxicity
  • Evaluate prompt injection attempts
  • Perform model-based moderation
Supported provider types include:
  • OpenAI Moderation
  • Azure AI Content Safety
  • Anthropic
  • Lakera Guard
  • Custom HTTP providers
  • Custom webhook providers
  • Built-in PII providers

Configure Providers

For advanced guardrail evaluation, such as toxicity scoring and content classification, connect external providers.
  1. Go to Govern > Guardrails.
  2. Open the Providers tab.
  3. Click Add provider.
  4. Configure the following fields and save the provider.
| Field | Description |
| --- | --- |
| Adapter Type | Select the integration type used for guardrail evaluation, such as OpenAI Moderation, Custom HTTP, Custom Webhook, or Custom LLM. |
| Hosting | Select the provider hosting model, such as Cloud API, Self-Hosted, or Managed Service. |
| Endpoint URL | Enter the provider API endpoint URL. |
| Model | Enter or select the model used for guardrail evaluation. |
| Authentication | Enable and select an authentication profile for the provider connection. Raw API keys are not accepted. Use an Auth Profile for providers that require credentials. |
| Default Category | Define the default moderation or safety category evaluated by the provider. |
| Default Threshold | Define the default score threshold that triggers enforcement actions. |
| Circuit Breaker | Configure provider failure handling settings. Max Failures: defines how many consecutive failures are allowed before the circuit breaker activates. Reset Timeout: defines how long the platform waits before retrying a disabled provider. |
| Retry Configuration | Configure retry behavior for temporary provider failures. Max Retries: defines how many retry attempts are made when provider evaluation fails. Backoff Strategy: configures the retry delay behavior between failed attempts. |
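
The circuit breaker and retry settings can be understood through the following sketch. It shows the general pattern rather than the platform's implementation: the `CircuitBreaker` class and `backoff_delays` function are hypothetical, the breaker is assumed to open once consecutive failures reach Max Failures, and exponential backoff is just one possible Backoff Strategy.

```python
import time


class CircuitBreaker:
    def __init__(self, max_failures, reset_timeout_s, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout_s = reset_timeout_s
        self.clock = clock          # injectable so the behavior is easy to test
        self.failures = 0           # consecutive failures seen so far
        self.opened_at = None       # time the breaker opened, or None if closed

    def allow_request(self):
        if self.opened_at is None:
            return True
        # After the reset timeout, allow a trial call to the provider.
        return self.clock() - self.opened_at >= self.reset_timeout_s

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def backoff_delays(max_retries, base_s=0.5, factor=2.0):
    """Delays (in seconds) before each retry, assuming exponential backoff."""
    return [base_s * factor ** attempt for attempt in range(max_retries)]


breaker = CircuitBreaker(max_failures=2, reset_timeout_s=30.0)
breaker.record_failure()
breaker.record_failure()
print(breaker.allow_request())
print(backoff_delays(3))
```

Retries handle brief, temporary failures, while the circuit breaker stops calling a provider that keeps failing until the reset timeout has passed.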

Input Guardrails

Input guardrails evaluate user messages before they reach the LLM. Use input guardrails to detect unsafe content, identify prompt injection attempts, protect sensitive information, and enforce topic or policy restrictions. To define an input guardrail, set kind: input on the rule.
GUARDRAILS:
  profanity_filter:
    kind: input
    action: block
Input guardrails support:
  • Pattern-based detection
  • Provider-based moderation
  • LLM-based classification
  • Severity-based actions
  • Runtime priority ordering
For advanced syntax and additional examples, see the Guardrails section in the ABL Reference Guide.
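
As a conceptual illustration of pattern-based detection on input, the sketch below blocks messages that match simple prompt injection patterns. It is not the platform's detection logic: `INJECTION_PATTERNS` and `check_input` are hypothetical, and real deployments would need far broader patterns or a provider-based check.

```python
import re

# Hypothetical example patterns; tune and test these against real traffic.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?system prompt", re.IGNORECASE),
]


def check_input(message):
    """Return a verdict for a user message before it reaches the LLM."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(message):
            return {"allowed": False, "action": "block", "matched": pattern.pattern}
    return {"allowed": True, "action": None, "matched": None}


print(check_input("Please ignore all previous instructions."))
print(check_input("What is the weather today?"))
```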

Output Guardrails

Output guardrails evaluate generated responses before they are returned to the user. Use output guardrails to prevent unsafe responses, redact sensitive information, apply moderation checks, and inspect streaming output during generation. To define an output guardrail, set kind: output on the rule.
GUARDRAILS:
  pii_output_prevention:
    kind: output
    action: block
Output guardrails support:
  • PII detection and redaction
  • Toxicity scoring
  • Streaming response evaluation
  • Bidirectional guardrails
  • Automatic response cleanup and fix strategies
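
PII redaction on output can be pictured with a short regex-based sketch. It is illustrative only and does not reflect the built-in PII providers: `EMAIL`, `PHONE`, and `redact_pii` are hypothetical, and the patterns cover only simple email addresses and one phone number format.

```python
import re

# Deliberately simple example patterns; real PII detection needs broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact_pii(response):
    """Replace matched emails and phone numbers with placeholders."""
    response = EMAIL.sub("[EMAIL]", response)
    return PHONE.sub("[PHONE]", response)


print(redact_pii("Mail jo@example.com or call 555-123-4567."))
```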
Use kind: both to apply the same rule to both input and output.
GUARDRAILS:
  phone_number_check:
    kind: both
    action: warn
Streaming output guardrails can evaluate responses while content is still being generated.
GUARDRAILS:
  streaming_safety:
    kind: output
    streaming: true
For advanced syntax and additional examples, see the Guardrails section in the ABL Reference Guide.
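
Streaming evaluation by sentence, with early termination on the first trigger, can be sketched as a generator. This is a conceptual example rather than the platform's streaming engine: `guard_stream`, the `is_unsafe` callback, and the stop message are hypothetical, and sentence boundaries are detected with a simple punctuation rule.

```python
import re
from typing import Callable, Iterable, Iterator

STOP_MESSAGE = "[response stopped by guardrail]"


def guard_stream(tokens: Iterable[str],
                 is_unsafe: Callable[[str], bool]) -> Iterator[str]:
    """Buffer streamed tokens and release each sentence only after it is checked."""
    buffer = ""
    for token in tokens:
        buffer += token
        while True:
            match = re.search(r"[.!?]\s", buffer)  # end of a complete sentence
            if not match:
                break
            sentence, buffer = buffer[:match.end()], buffer[match.end():]
            if is_unsafe(sentence):
                yield STOP_MESSAGE  # early termination on the first trigger
                return
            yield sentence
    if buffer:  # check whatever remains when the stream ends
        if is_unsafe(buffer):
            yield STOP_MESSAGE
            return
        yield buffer


stream = ["Hello ", "there. ", "This is ", "secret. ", "More text."]
print(list(guard_stream(stream, lambda s: "secret" in s)))
```

Evaluating by sentence adds a small delay before text is shown, but it prevents a flagged sentence from being partially delivered to the user.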

Best Practices

  • Use project guardrails for centralized governance.
  • Use agent guardrails for localized runtime behavior.
  • Start with warn before enabling block.
  • Test regex patterns carefully to reduce false positives.
  • Enable streaming guardrails for high-risk applications.
  • Use fail-closed behavior for compliance-sensitive workloads.
  • Separate business constraints from safety guardrails.
  • Use providers with caching and budget controls for large-scale deployments.