Multi-Tenant Bedrock Agents Security with Cedar

Jun 6, 2026 · 12 min read · AWS Bedrock Agents AgentCore Cedar Amazon Verified Permissions Multi-Tenant Zero Trust SaaS Security

Overview

TL;DR (30-Second Read)

Amazon Bedrock AgentCore is now generally available, including AgentCore Identity for agent authentication and AgentCore Policy, which enforces Cedar rules by intercepting every tool call before execution. That changes the security design for multi-tenant SaaS on Bedrock Agents. This blueprint addresses the hardest problem in agentic SaaS: a Large Language Model (LLM) that, through prompt injection or hallucination, crosses tenant boundaries or escalates privileges. The answer is a zero-trust, low-latency, two-layer authorization architecture with real-time quota enforcement, built on Amazon Verified Permissions (AVP) and the Cedar policy engine.

Who this is for: Cloud architects and senior security/backend engineers building or running multi-tenant SaaS on Amazon Bedrock Agents.
If you read only one section: jump to Two-Layer Authorization Architecture, the most directly actionable part of the design.

1. Threat Model

When you build enterprise multi-tenant AI agent SaaS, the agent's dynamic decision-making and autonomous reasoning introduce security threats that traditional web applications never faced. This blueprint defends against three core failure modes:

Impersonation and tenant injection. An attacker uses prompt injection to tamper with the agent's runtime context (for example, session attributes), tricking a tool Lambda into performing a cross-tenant data operation.
Permission creep. Across multi-step reasoning, an agent — driven by hallucination or a malicious prompt — accumulates excessive privilege between iterations, or chains individually harmless tools to reach a high-risk operation.
Data isolation failure. Inconsistent resource identification or bolt-on authorization logic causes the Policy Enforcement Point (PEP) to skip the tenant context during a check, leaking data across tenants.

2. Trust Boundaries and Trust Anchors

To eliminate attacks that exploit mutability, the system must establish a single trusted source — a trust anchor — for both the principal and resource identity, and treat them as intrinsic identity rather than runtime-mutable context.

Identity Dimension	Trusted Source	Implementation and Tamper-Resistance
Principal Tenant ID	Session Metadata Store (immutable)	1. Abandon `sessionAttributes`, which are vulnerable to mutation. 2. Use an external Session Metadata Store (such as DynamoDB) keyed by `session_id`, binding `tenant_id`, `cognito_sub`, and `created_at`. 3. The orchestrator writes this record after JWT verification; the tool and PEP have read-only access, and the agent has no write permission.
Resource Tenant ID	Resource Prefix Convention (intrinsic)	1. Enforce a tenant-prefix convention on resource identifiers (such as `tenant-{id}/resources/{resource-id}`). 2. The tenant ID is part of the resource identity, not a bolt-on field. 3. Even if the agent omits tenant context in its parameters, the PEP can extract it by parsing the resource ID, preventing escalation.

flowchart TD
    Client[Client JWT] -->|1. Request| Orch[Orchestrator]
    Orch -->|2. Verify and Write| Meta[(Session Metadata Store)]
    Orch -->|3. Invoke| Bedrock[Bedrock Agents]
    Bedrock -->|4. Tool Call| PreCheck[Pre-Call Check Lambda]
    Meta -.->|5. Read immutable tenant_id| PreCheck
    PreCheck -->|6. IsAuthorized| AVP[AVP Cedar PDP]
    PreCheck -->|7. Forward if Allowed| Tool[Tool Lambda PEP]

Important

Never trust a tenant_id produced by the LLM or passed through sessionAttributes. Every security check must compare against the Session Metadata Store (the principal trust anchor) and the prefix convention (the resource trust anchor).
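
The write-once binding described above can be sketched in Python. This is a minimal illustration, not the blueprint's actual implementation: the function name `build_session_put` is hypothetical, and it assumes the store is a DynamoDB table with `session_id` as its partition key. The function only builds the arguments for a `put_item` call, so it can be inspected without AWS access.

```python
from datetime import datetime, timezone
from typing import Optional


def build_session_put(session_id: str, tenant_id: str, cognito_sub: str,
                      now: Optional[datetime] = None) -> dict:
    """Build keyword arguments for a DynamoDB Table.put_item call.

    Hypothetical helper: the orchestrator would call it once, after the
    JWT has been verified, and never again for the same session.
    """
    created_at = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "Item": {
            "session_id": session_id,
            "tenant_id": tenant_id,
            "cognito_sub": cognito_sub,
            "created_at": created_at,
        },
        # Reject the write if a record for this session already exists,
        # so an existing tenant binding cannot be overwritten.
        "ConditionExpression": "attribute_not_exists(session_id)",
    }
```

With boto3, the orchestrator could pass the result as `table.put_item(**build_session_put(...))`. The condition expression only covers overwrites through `put_item`. Read-only access for the tool and PEP still has to come from IAM policy on the table.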

3. Cedar Policy Patterns

Using Amazon Verified Permissions (AVP) as the Policy Decision Point (PDP), declarative Cedar policies express three tenant-control models.

Pattern A: Owner-Isolated (Strong Tenant Isolation)

The baseline hard boundary: a principal may act only on resources that belong to its own tenant.

// Pattern A: Owner-Isolated (strong tenant isolation)
permit (
    principal,
    action in [
        Action::"GetDocument",
        Action::"UpdateDocument",
        Action::"DeleteDocument"
    ],
    resource
)
when {
    principal.tenant_id == resource.tenant_id
};

Pattern B: Role-Tiered (Fine-Grained Roles Within a Tenant)

Within the isolation boundary, grant different permissions to Admin, Member, and Guest roles.

// Pattern B: Role-Tiered (fine-grained roles within a tenant)
// Note: Cedar's `in` operator tests entity hierarchy membership, so
// string roles are checked with the set method `.contains()` instead.

// 1. Admin, Member, and Guest in the tenant may read documents
permit (
    principal,
    action in [Action::"ReadDocument"],
    resource
)
when {
    principal.tenant_id == resource.tenant_id &&
    ["Admin", "Member", "Guest"].contains(principal.role)
};

// 2. Only Admin and Member may write or create documents
permit (
    principal,
    action in [Action::"WriteDocument", Action::"CreateDocument"],
    resource
)
when {
    principal.tenant_id == resource.tenant_id &&
    ["Admin", "Member"].contains(principal.role)
};

// 3. Only Admin may perform high-risk configuration operations
permit (
    principal,
    action in [Action::"DeleteTenantSpace", Action::"ConfigureIntegrations"],
    resource
)
when {
    principal.tenant_id == resource.tenant_id &&
    principal.role == "Admin"
};

Pattern C: Quota-Bounded (Plan-Aware Real-Time Quota Enforcement)

Dynamically block over-quota requests. The tenant's quota state is not stored in Cedar; the PEP fetches it in real time and passes it to the PDP through context.

// Pattern C: Quota-Bounded (plan-aware real-time quota enforcement)
// Enterprise tenants may call expensive tools directly
permit (
    principal,
    action in [Action::"InvokePremiumTool"],
    resource
)
when {
    principal.tenant_id == resource.tenant_id &&
    principal.tier == "Enterprise"
};

// Standard tenants must pass a context-supplied quota check
permit (
    principal,
    action in [Action::"InvokePremiumTool"],
    resource
)
when {
    principal.tenant_id == resource.tenant_id &&
    principal.tier == "Standard" &&
    context.monthly_api_calls < context.api_call_limit
};

Tip

For concurrent quota updates, use a DynamoDB conditional update for atomicity. Before evaluation, read the counter with a strongly consistent, uncached read so that a stale value cannot let a request slip past the quota.
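
A minimal Python sketch of that tip, under stated assumptions: the table name `TenantQuota`, its `tenant_id` key, and the three helper names are hypothetical. The helpers only build request arguments for the low-level DynamoDB client (`update_item`, `get_item`) and the AVP context for Pattern C, so they run without AWS access.

```python
QUOTA_TABLE = "TenantQuota"  # hypothetical table name


def build_quota_increment(tenant_id: str, limit: int) -> dict:
    """Arguments for client.update_item: increment only while under the limit."""
    return {
        "TableName": QUOTA_TABLE,
        "Key": {"tenant_id": {"S": tenant_id}},
        # ADD creates the counter at 0 if it does not exist yet.
        "UpdateExpression": "ADD monthly_api_calls :one",
        # The check and the increment happen atomically on the server.
        "ConditionExpression": (
            "attribute_not_exists(monthly_api_calls) "
            "OR monthly_api_calls < :limit"
        ),
        "ExpressionAttributeValues": {
            ":one": {"N": "1"},
            ":limit": {"N": str(limit)},
        },
        "ReturnValues": "UPDATED_NEW",
    }


def build_quota_read(tenant_id: str) -> dict:
    """Arguments for client.get_item using a strongly consistent read."""
    return {
        "TableName": QUOTA_TABLE,
        "Key": {"tenant_id": {"S": tenant_id}},
        "ConsistentRead": True,
    }


def quota_context(monthly_api_calls: int, api_call_limit: int) -> dict:
    """AVP context carrying the quota state that Pattern C compares."""
    return {
        "contextMap": {
            "monthly_api_calls": {"long": monthly_api_calls},
            "api_call_limit": {"long": api_call_limit},
        }
    }
```

When the condition fails, boto3 raises a `ClientError` with the code `ConditionalCheckFailedException`, which the PEP can map to `quota_exceeded`.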

4. Two-Layer Authorization Architecture

To balance defense and runtime cost, the system uses a two-layer design: a pre-call check (first-line entrance filter) and a post-call PEP (second-line in-Lambda safety net). This mirrors how AgentCore Policy itself intercepts tool calls before execution — the pre-call layer is the place to do it cheaply, while the post-call PEP remains the defense-in-depth backstop.

+--------------------------------------------------------------------------+
|                       Orchestrator (Cognito JWT)                         |
+--------------------------------------------------------------------------+
                                     |
                                     v
+--------------------------------------------------------------------------+
| 1. Pre-Call Check (Action Group Entrance)                                |
|    - Intercept structured params from Bedrock Agents (OpenAPI schema)    |
|    - Map to an (Action, Resource) tuple                                  |
|    - Read principal.tenant_id from the Session Metadata Store            |
|    - Extract resource.tenant_id from the resource prefix                 |
|    - AVP Evaluate -> DENY: return a unified 403 JSON to the runtime      |
+--------------------------------------------------------------------------+
                                     |
                                  ALLOW
                                     v
+--------------------------------------------------------------------------+
| 2. Post-Call PEP (Tool Lambda PEP) - Safety Net                          |
|    - Guards against pre-call misses or bypass                            |
|    - Re-evaluates by calling the AVP PDP again                           |
|    - DENY: block execution, return structured 403, trip circuit + audit  |
+--------------------------------------------------------------------------+

A. Layer 1: Pre-Call Check (Entrance Interception)

Responsibility: intercept after a Bedrock Agents action group fires but before the real business Lambda (the tool) executes.
Mechanism:
1. Configure a single OpenAPI schema on the action group so Bedrock Agents calls carry structured JSON parameters.
2. The pre-check Lambda reads those structured JSON parameters directly, extracting the resource identifier (such as an S3 key or DynamoDB key). It does not rely on extracting identifiers from free-form LLM output.
3. Extract the tenant ID via the prefix convention and compare it against the Session Metadata.
4. Unified DENY response. On DENY, the pre-call check returns exactly the same structured 403 payload as the post-call PEP (see 4.B), which the Bedrock Agents runtime interprets as tool output — keeping the API response contract uniform.
5. Prefix parsing and Cedar resource construction (implementation detail). After parsing the structured parameters, the PEP must split the resource ID (for example tenant-corp-99/doc-a1b2c3) into a (tenant_id, resource_id) tuple, construct the matching Cedar resource entity Document::"tenant-corp-99:doc-a1b2c3", and explicitly set the attribute resource.tenant_id = "tenant-corp-99". This mapping from a naming convention to Cedar policy evaluation is the crux — the Cedar engine does not parse prefixes on its own.
Benefits:
- Avoids the cold-start and execution cost of tool Lambdas triggered by malicious or redundant calls.
- Cuts off illegal calls early, saving substantial LLM tokens.
- Produces clearer, more direct audit signals.
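
The prefix parsing and entity construction from step 5 can be sketched as follows. This is an illustration under assumptions, not the blueprint's own code: the helper names are hypothetical, the entity types `User` and `Document` and the action type `Action` are taken from the examples in this post, and the returned dictionary follows the request shape of the AVP `IsAuthorized` API as the author of this sketch understands it, so verify it against the current API reference before use.

```python
from typing import NamedTuple


class ParsedResource(NamedTuple):
    tenant_id: str
    resource_id: str


def parse_resource_id(raw: str) -> ParsedResource:
    """Split 'tenant-corp-99/doc-a1b2c3' at the first '/' into its parts."""
    tenant_id, sep, resource_id = raw.partition("/")
    has_prefix = tenant_id.startswith("tenant-") and len(tenant_id) > len("tenant-")
    if not sep or not has_prefix or not resource_id:
        # Fail closed: an identifier without a tenant prefix is never evaluated.
        raise ValueError(f"resource id lacks a tenant prefix: {raw!r}")
    return ParsedResource(tenant_id, resource_id)


def build_is_authorized_request(policy_store_id: str, principal_id: str,
                                principal_tenant_id: str, action_id: str,
                                raw_resource_id: str) -> dict:
    """Build an IsAuthorized request with tenant_id set on both entities."""
    parsed = parse_resource_id(raw_resource_id)
    principal = {"entityType": "User", "entityId": principal_id}
    resource = {
        "entityType": "Document",
        "entityId": f"{parsed.tenant_id}:{parsed.resource_id}",
    }
    return {
        "policyStoreId": policy_store_id,
        "principal": principal,
        "action": {"actionType": "Action", "actionId": action_id},
        "resource": resource,
        "entities": {
            "entityList": [
                # principal_tenant_id must come from the Session Metadata Store.
                {"identifier": principal,
                 "attributes": {"tenant_id": {"string": principal_tenant_id}}},
                # The resource tenant comes from the prefix, not from the LLM.
                {"identifier": resource,
                 "attributes": {"tenant_id": {"string": parsed.tenant_id}}},
            ]
        },
    }
```

For a longer identifier such as `tenant-{id}/resources/{resource-id}`, this sketch keeps everything after the first `/` as the resource part.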

B. Layer 2: Post-Call PEP (Safety Net)

Responsibility: the PEP logic inside the tool Lambda, the defense-in-depth backstop.
Unified 403 DENY response shape. Whether triggered at the pre-call check or the post-call PEP, a DENY does not crash. It returns a single structured 403 payload to Bedrock Agents:
```
{
  "status": "error",
  "code": "AccessDenied",
  "message": "Security policy violation: operation not permitted for this tenant context."
}
```
This lets the Bedrock Agents runtime recognize an "unauthorized" outcome and present the permission limit gracefully to the user instead of crashing the system.
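
One way both layers could share the payload is a single builder function. The sketch below is an assumption-laden illustration: `build_deny_response` is a hypothetical name, and the outer envelope (`messageVersion`, `response`, `responseBody`) reflects the Lambda response format for OpenAPI-schema action groups as this sketch's author understands it, so check it against the current Bedrock Agents documentation.

```python
import json

# The unified DENY body, identical for the pre-call check and the post-call PEP.
DENY_BODY = {
    "status": "error",
    "code": "AccessDenied",
    "message": ("Security policy violation: operation not permitted "
                "for this tenant context."),
}


def build_deny_response(event: dict) -> dict:
    """Wrap the unified DENY body in an action-group Lambda response."""
    return {
        "messageVersion": "1.0",
        "response": {
            # Echo the routing fields from the incoming invocation event.
            "actionGroup": event.get("actionGroup"),
            "apiPath": event.get("apiPath"),
            "httpMethod": event.get("httpMethod"),
            "httpStatusCode": 403,
            "responseBody": {
                "application/json": {"body": json.dumps(DENY_BODY)}
            },
        },
    }
```

Keeping the body in one constant means neither layer can drift from the shared contract.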

C. Iteration Limits and Circuit Breaker

Max iterations. To stop Bedrock Agents from entering a "hallucination retry loop" when blocked (repeatedly swapping resource IDs to bypass policy), the orchestrator enforces class-aware defaults:
- Write or sensitive workflows: a small default (such as 5). This conservative default comes from injection-and-retry patterns common in production; raise it to match the maximum depth of legitimate business workflows in your deployment.
- Read-heavy or research workflows: allow 10–15 iterations to preserve complex reasoning chains.
PEP-level circuit breaker.
- Trigger: if a session triggers 3 consecutive DENY results during tool calls, trip the breaker.
- State marking: on trip, mark the session compromised=true in the Session Metadata Store.
- Cheapest enforcement path: read and check the compromised flag as the first step of the pre-call check Lambda. If true, block immediately and return a hard-fail (SESSION_REVOKED) — no need to make an expensive AVP (Cedar PDP) call.
- Audit evidence chain: on trip, write a circuit_breaker_tripped event to the audit trail and record the N denials that caused it as evidence.
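
The breaker logic can be sketched in Python with an in-memory stand-in for the session record. The names `SessionState`, `record_decision`, and `pre_check_gate` are hypothetical, and resetting the counter on an ALLOW is an assumption drawn from the word "consecutive". A real deployment would persist these fields in the Session Metadata Store.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DENY_THRESHOLD = 3  # consecutive DENY results that trip the breaker


@dataclass
class SessionState:
    """In-memory stand-in for the per-session record."""
    consecutive_denies: int = 0
    compromised: bool = False
    deny_history: List[dict] = field(default_factory=list)


def record_decision(state: SessionState, decision: str,
                    action: str, resource: str) -> bool:
    """Update the state; return True if this decision trips the breaker."""
    if decision == "ALLOW":
        # Assumption: any ALLOW breaks the run of consecutive denials.
        state.consecutive_denies = 0
        state.deny_history.clear()
        return False
    state.consecutive_denies += 1
    state.deny_history.append({"action": action, "resource": resource})
    if state.consecutive_denies >= DENY_THRESHOLD and not state.compromised:
        state.compromised = True
        return True  # caller writes the circuit_breaker_tripped audit event
    return False


def pre_check_gate(state: SessionState) -> Optional[str]:
    """First step of the pre-call check: short-circuit before any AVP call."""
    return "SESSION_REVOKED" if state.compromised else None
```

When `record_decision` returns `True`, `state.deny_history` holds the denials that form the evidence chain for the audit event.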

5. Structured Audit Trail Schema

To satisfy independent audit requirements such as SOC 2 and ISO 27001, every authorization decision (pre-call and post-call), plus circuit-breaker and over-quota events, must be emitted as standard JSON audit logs.

Audit Log JSON Schema — Extension Fields

event_type: event type (AgentAuthorizationEvaluation / circuit_breaker_tripped).
deny_reason: reason for denial (policy_denied / quota_exceeded / quota_store_unreachable / circuit_breaker_active).
determining_policies: the specific Cedar policy IDs that determined the decision. This maps to the AVP IsAuthorized response field determiningPolicies. Note an important AVP semantic: on an implicit DENY (no matching policy), determiningPolicies is an empty array — so absence of a policy ID is itself a meaningful audit signal, not a gap.
execution_status: an execution status code that records the final decision outcome.

Allowed Values for `execution_status`

To trace the final outcome of a request's lifecycle in the audit trail, execution_status must be one of the following enumerated values:

Allowed Value	Meaning	When It Fires
`PROCESSED`	Normal evaluation	Request was permitted (`ALLOW`) by AVP, or denied (`DENY`) by a normal policy decision.
`DENY`	PEP interception	Blocked by the PEP inside the tool.
`SESSION_REVOKED`	Session revoked / circuit tripped	PEP-level circuit breaker tripped, or the session `compromised` flag is active; subsequent calls are rejected directly in this state.
`SYSTEM_FALLBACK_DENY`	System-failure fail-closed block	AVP / Cedar PDP is down, or the Session Metadata Store is unavailable, triggering a fail-closed block.
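
To keep emitters from writing values outside this list, the enumeration can be enforced in code. The sketch below is one possible approach, with hypothetical names (`ExecutionStatus`, `build_audit_event`). The field names follow the audit schema in this post.

```python
import json
from enum import Enum
from typing import Optional, Sequence


class ExecutionStatus(str, Enum):
    PROCESSED = "PROCESSED"
    DENY = "DENY"
    SESSION_REVOKED = "SESSION_REVOKED"
    SYSTEM_FALLBACK_DENY = "SYSTEM_FALLBACK_DENY"


def build_audit_event(*, timestamp: str, tenant_id: str, session_id: str,
                      principal: str, action: str, resource: str,
                      decision: str, execution_status: str,
                      determining_policies: Sequence[str] = (),
                      deny_reason: Optional[str] = None) -> str:
    """Serialize one authorization decision as a JSON audit record."""
    # Raises ValueError for any value outside the allowed enumeration.
    status = ExecutionStatus(execution_status)
    event = {
        "timestamp": timestamp,
        "event_type": "AgentAuthorizationEvaluation",
        "tenant_id": tenant_id,
        "session_id": session_id,
        "principal": principal,
        "action": action,
        "resource": resource,
        "decision": decision,
        # An empty list is kept on purpose: it records an implicit DENY.
        "determining_policies": list(determining_policies),
        "execution_status": status.value,
    }
    if deny_reason is not None:
        event["deny_reason"] = deny_reason
    return json.dumps(event)
```

Rejecting unknown status values at write time keeps the audit trail queryable by a fixed set of outcomes.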

Example Audit Log Events

Example 1: Quota Exceeded Deny

{
  "timestamp": "2026-06-06T02:30:15Z",
  "event_type": "AgentAuthorizationEvaluation",
  "tenant_id": "tenant-corp-99",
  "session_id": "bedrock-session-47609bf2",
  "principal": "User::tenant-corp-99:user-alex",
  "action": "Action::InvokePremiumTool",
  "resource": "Document::tenant-corp-99:doc-a1b2c3",
  "decision": "DENY",
  "deny_reason": "quota_exceeded",
  "quota_metric": "monthly_api_calls",
  "determining_policies": ["policy-quota-limit-standard"],
  "execution_status": "PROCESSED"
}

Example 2: Circuit Breaker Tripped

{
  "timestamp": "2026-06-06T02:31:05Z",
  "event_type": "circuit_breaker_tripped",
  "tenant_id": "tenant-corp-99",
  "session_id": "bedrock-session-47609bf2",
  "principal": "User::tenant-corp-99:user-alex",
  "circuit_breaker_deny_history": [
    {
      "timestamp": "2026-06-06T02:30:45Z",
      "action": "Action::UpdateDocument",
      "resource": "Document::tenant-corp-99:doc-888"
    },
    {
      "timestamp": "2026-06-06T02:30:52Z",
      "action": "Action::UpdateDocument",
      "resource": "Document::tenant-corp-99:doc-889"
    },
    {
      "timestamp": "2026-06-06T02:31:01Z",
      "action": "Action::UpdateDocument",
      "resource": "Document::tenant-corp-99:doc-890"
    }
  ],
  "execution_status": "SESSION_REVOKED"
}

6. Failure Modes and Observability Matrix

Failure Scenario	Potential Impact	Protection and Degradation Strategy	Audit Log and Alerting Metric
AVP / Cedar PDP down or evaluation fails	Authorization decision blocked	Fail-closed (hard block): the PEP catches the exception and defaults to `DENY`, refusing all access.	Raise a `CedarPDPUnreachable` critical alarm; write an audit log with decision source `SYSTEM_FALLBACK_DENY`.
Quota Store (DynamoDB) hot partition / throttling	Cannot confirm real-time quota state	Tier-aware fail-safe: 1. Free / Standard tier: allow read-only / standard tools that do not depend on quota; block only quota-dependent premium tools (return `QuotaStatusUnknown`). 2. Enterprise tier: under a strong SLA commitment, force fail-closed, rejecting all premium tool calls with `QuotaStatusUnknown`.	1. Raise a `QuotaStoreThrottle` alarm, plus a new `QuotaStoreThrottleAffectingFreeTier` metric (track free-tier degradation error rate to monitor churn risk). 2. Mark `deny_reason` as `quota_store_unreachable` (distinct from `quota_exceeded`).
Cognito JWT expired or malformed	Unverified identity	Fail-closed (unauthorized block): the orchestrator rejects at the entrance, making no downstream calls.	Raise an `AuthTokenValidationFailed` metric; block and emit an HTTP 401 audit log at the outermost gateway.
Session Metadata Store (DynamoDB) down	Lost tenant binding	Fail-closed (hard block): unable to recover the principal's true tenant identity, the pre-check blocks all downstream calls.	Raise a `SessionMetadataUnreachable` critical alarm; record decision `SYSTEM_FALLBACK_DENY` in the audit log.
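
The fail-closed rows of this matrix share one pattern: any exception on the decision path becomes a DENY. A minimal Python sketch, assuming a hypothetical `call_pdp` callable that returns `"ALLOW"` or `"DENY"` and raises when AVP or the Session Metadata Store is unreachable:

```python
from typing import Callable


def evaluate_fail_closed(call_pdp: Callable[[], str]) -> dict:
    """Run the authorization call and turn any failure into a DENY."""
    try:
        decision = call_pdp()
    except Exception:
        # PDP or store outage: block and mark the record as a system fallback.
        return {"decision": "DENY", "execution_status": "SYSTEM_FALLBACK_DENY"}
    if decision != "ALLOW":
        # Anything other than an explicit ALLOW is treated as a DENY.
        return {"decision": "DENY", "execution_status": "PROCESSED"}
    return {"decision": "ALLOW", "execution_status": "PROCESSED"}
```

Treating every non-`ALLOW` result as a denial means an unexpected or malformed PDP response also fails closed.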

7. Conclusion

Multi-tenant isolation for AI agents is not the same problem as multi-tenant isolation for a CRUD API. The agent reasons, retries, and composes tools at runtime, so the security boundary has to be explicit, external, and fail-closed — never inferred from anything the model produced. The three anchors of this blueprint hold that line:

Identity comes from a trust anchor, not the model. Principal tenant ID lives in an immutable Session Metadata Store; resource tenant ID is intrinsic to the resource name. The LLM never gets a vote.
Authorization is two layers, decided by Cedar. A cheap pre-call check stops most bad calls before any tool runs; a post-call PEP is the defense-in-depth backstop. Both call the same AVP PDP and return the same structured 403.
Every decision is auditable and every failure is fail-closed. determiningPolicies, execution_status, and the circuit-breaker evidence chain give auditors a complete record, while PDP or store outages degrade to DENY, never to open access.

Now that AgentCore Identity and AgentCore Policy are generally available, much of this can lean on managed building blocks — AgentCore Policy already intercepts tool calls with Cedar, and AgentCore Identity handles inbound JWT authorization (see how an API Gateway façade closes the OAuth gaps for AgentCore Gateway + Cognito). The patterns here remain the design contract you enforce on top of them. If you are weighing where the isolation boundary should sit in the first place, the serverless multi-tenant OpenHands on AWS post works the same problem from the infrastructure side.