Safeguard your agentic AI applications with the Amazon Bedrock Guardrails InvokeGuardrailChecks API

Today, we’re announcing a new API with Amazon Bedrock Guardrails. With this API, you can apply individual safeguards, also referred to as safety checks, at any point in your agentic AI applications without creating guardrail resources. The new InvokeGuardrailChecks API gives you the flexibility to invoke supported safeguards at any turn in the agentic loop and take the required action in your application logic. The API operates in detect-only mode and returns numeric scores for each safeguard. You can define custom thresholds and actions in your applications to block, bypass, retry, or log results for auditing purposes based on your specific requirements.
Amazon Bedrock Guardrails provides configurable safeguards to help you build safe generative AI applications. With comprehensive safety controls across foundation models, Amazon Bedrock Guardrails helps you detect and filter undesirable content and protect sensitive information in both user inputs and model responses.
The new InvokeGuardrailChecks API extends these capabilities for agentic AI applications with multi-turn workflows. AI agents plan tasks, invoke tools, process outputs, and iterate through loops, often without direct user interaction. Each step in this loop carries a different risk profile and requires different safeguards. With the InvokeGuardrailChecks API, you can apply the checks you need, where you need them, without the operational overhead of provisioning separate guardrail resources for each stage. The API returns a numeric score that helps you define your own threshold and action for your application. In this post, we walk through how the InvokeGuardrailChecks API works and how to use it to build safe, multi-turn agentic AI applications.
Why agentic AI needs targeted safety controls
Generative AI applications typically follow a familiar pattern: a user sends a prompt, the model responds, and a guardrail evaluates both. You create one guardrail resource, configure your policies, and apply it uniformly.
AI agents work differently. They operate in loops, receiving input, generating a response, and repeating multiple turns in a conversation. A single user session might involve 10, 20, or more turns. Each turn has two stages where safety checks matter: before the content goes to the model (input), and before the model response goes back to the user (output).
Consider a multi-turn customer support agent that handles varied requests across a conversation:
- User sends initial question (risk: prompt injection issues).
- Model generates a plan or response asking for details (risk: model output might contain harmful content influencing the model’s reasoning).
- User sends follow-up with account details (risk: input might contain sensitive information, that is, personally identifiable information (PII)).
- Model generates final response (risk: harmful or inappropriate content in the reply).
Each step has a distinct risk profile. Creating and applying separate guardrail resources for each step creates operational overhead that scales poorly as you deploy hundreds of agents.
The InvokeGuardrailChecks API gives you granular, per-request control over which safeguards to run at each step of the agent loop. It returns numeric scores so you can define the appropriate thresholds and actions in your application logic, such as retry, block, or bypass, based on what suits your use case.
How it works
The InvokeGuardrailChecks API uses a structured messages schema, where each content block has a required role such as system, user, or assistant. This is how agent interactions operate in loops. These roles provide the context the safeguard needs to evaluate the content precisely. This aspect is critical for multi-turn agentic workflows.
The InvokeGuardrailChecks API offers the following capabilities:
Resourceless: You don’t need to create guardrail resources upfront. There’s no CreateGuardrail step, no guardrail IDs to track, and no versions to manage. You specify which safeguards to run directly in each API request. This makes it straightforward to add, remove, or adjust checks as your workflows evolve.
Consider the following scenario. Without a resourceless API, applying a safeguard at an ephemeral step in an agentic loop requires multiple lifecycle calls. For example, suppose you want to validate a tool’s output before passing it to the next iteration. You first create a guardrail resource, invoke it, and then delete it after the invocation to avoid resource sprawl. When a single agentic user query triggers dozens of loop iterations, each with different safety requirements, this create-invoke-delete lifecycle becomes untenable. The InvokeGuardrailChecks API avoids this. You call the API with the safeguard you need.
Detect-only: The API doesn’t block, mask, or rewrite content. It returns findings with numeric scores for each safeguard, and you decide what action your application should take. With your custom threshold, you have full control to implement context-aware logic. For example, you can block high-confidence threats, route ambiguous findings to human review, or log low-confidence results for audits.
Symmetric request-response: The safeguards you configure in your request are the same keys returned in the response. If you request contentFilter and sensitiveInformation, only those two appear in results. This makes it straightforward to map findings back to the safeguards that produced them.
Independent prompt attack detection: Unlike the ApplyGuardrail API, where prompt attack detection is bundled inside content filters, the InvokeGuardrailChecks API separates prompt attack detection as its own standalone check. You can invoke prompt attack detection independently without running content filters. Additionally, you can specify individual categories such as jailbreak, prompt injection, or prompt leakage to get fine-grained control.
The InvokeGuardrailChecks API supports the following safeguards:
| Safeguard | What it detects | Score type |
| Content filters | Harmful content across categories: HATE, VIOLENCE, SEXUAL, INSULTS, MISCONDUCT | Severity score (0–1) with discrete scores |
| Prompt attack detection | Jailbreaks, prompt injection, and prompt leakage attempts | Severity score (0–1) with discrete scores |
| Sensitive information filters | PII entities including email, phone, SSN, credit card numbers (31 entity types) | Confidence score (0–1) with discrete scores |
The API returns two types of scores depending on the check:
- Severity score (content filters and prompt attack): A discrete value in the set {0, 0.2, 0.4, 0.6, 0.8, 1.0} that represents how strongly the content matches the safeguard criteria. A score of 1.0 indicates the strongest match. A score of 0 indicates benign content. This score measures the severity of the content itself, not the certainty of the underlying model.
- Confidence score (sensitive information): A discrete value in the set {0, 0.2, 0.4, 0.6, 0.8, 1.0} that represents how certain the model is about the presence of a specific PII entity. Each finding also includes
messageIndex,contentIndex, and character offsets (beginOffset,endOffset) for precise location within the content.
Getting started with the InvokeGuardrailChecks API
In this section, we walk through how to use the InvokeGuardrailChecks API in your application.
Prerequisites
- An AWS account with Amazon Bedrock access.
- An AWS Identity and Access Management (IAM) role with
bedrock:InvokeGuardrailCheckspermission. - AWS Command Line Interface (AWS CLI) or AWS SDK (Boto3 for Python) installed.
- Basic familiarity with agentic AI concepts.
Step 1: Set up IAM permission
Because the InvokeGuardrailChecks API is resourceless, there’s no guardrail ARN to scope. Attach the following identity-based policy to your IAM role or user:
Why use Resource: "*"? The InvokeGuardrailChecks API is resourceless by design. There’s no guardrail ARN associated with any call. The wildcard is the only valid value for this field. This doesn’t grant access to other Amazon Bedrock resources. It applies solely to the bedrock:InvokeGuardrailChecks action.
To further restrict access, combine with condition keys such as the following:
aws:SourceIporaws:SourceVpcto limit calls to specific networks.aws:PrincipalTagto restrict to specific teams or roles (for example,"aws:PrincipalTag/team": "agent-safety").aws:RequestedRegionto constrain to specific AWS Regions (as shown in the preceding policy).
Step 2: Apply content filters to user’s input
When your agent receives a user’s message, check for harmful content before sending it to a model. The following example evaluates content for violence and misconduct:
The following is the example output:
The high severity scores indicate that the content strongly matches harmful categories. Your application decides the action, such as block, log, or escalate.
Step 3: Detect prompt attacks on system and user pairs
AI agents often have system instructions that bad actors might try to override. You can evaluate a system-user message pair for jailbreaks and prompt leakage attempts:
The following is the example output:
Step 4: Run multiple checks on tool output
When a tool returns results from a web search or database query, you can apply multiple checks in a single call. The API executes checks in parallel:
The following is the example output:
The sensitive information results include character offsets, giving you precise location data for client-side masking or redaction.
Step 5: Build adaptive response logic with scores
The InvokeGuardrailChecks API uses scores to drive context-aware decisions. The following pattern shows adaptive response logic:
With this pattern, you can implement thresholds that match your business context. A financial services application might block at 0.4, although a creative writing tool might only block at 0.8.
Step 6: Integrate with an agent framework
The InvokeGuardrailChecks API integrates naturally with agent frameworks that expose lifecycle hooks. The following example uses Strands Agents, which provides hooks at key stages of the agent loop:
InvokeGuardrailChecks compared to ApplyGuardrail: When to use each
You can use either the InvokeGuardrailChecks or ApplyGuardrail API offered by Amazon Bedrock Guardrails, depending on your use case and application. The following table provides details and pointers on when to use which API.
| InvokeGuardrailChecks | ApplyGuardrail | |
| Use case | Targeted checks at specific points or turns in workflows | Uniform enforcement across your application |
| Resource model | Resourceless. Checks specified inline per request using your own control plane | Create, version, and manage guardrails resources upfront |
| Decision logic | Detect only. Returns numeric scores so you decide the action for your application logic | Automatic block, mask, or bypass based on pre-configured thresholds |
| Targeted toward | Agentic AI workflows requiring per-step safety requirements | Traditional request-response AI applications |
Clean up
The InvokeGuardrailChecks API is resourceless, so no persistent resources are created. To clean up after testing, complete the following steps:
- Remove any IAM policies or roles.
- Delete any Amazon CloudWatch log groups if you configured logging during development.
Conclusion
The InvokeGuardrailChecks API complements current Amazon Bedrock Guardrails capabilities with composable safety building blocks for agentic AI. Here are some additional takeaways:
- Granular control – Apply only the safeguards that you need at each stage of your agent loop without creating individual guardrail resources for each stage. This reduces operational overhead as you scale to hundreds of agents.
- Application-driven decisions – Numeric severity and confidence scores replace opaque pass-or-fail outcomes. They support adaptive logic that matches your business context and give you control based on your use case.
- Minimal overhead – No guardrail resources to create, version, or manage. Specify checks inline and evolve your safety posture as workflows change.
To get started, see the InvokeGuardrailChecks API reference and apply individual safety checks across your agentic AI applications.
About the authors



