Agent Toolkit for AWS: What It Changes for Claude Code
Overview
If you've been using Claude Code for AWS development, you've probably seen the pattern: you paste a CloudFormation snippet into your session, Claude suggests something plausible, you deploy it, and the stack events stream lights up with CREATE_FAILED on a property the model couldn't have known about — because its training data stopped months ago.
The usual workaround has been hand-rolling context into CLAUDE.md: copying service endpoint quirks, IAM condition key syntax, and PrivateLink DNS formats that the model gets wrong. It works, but it's manual, fragile, and grows without bound.
AWS shipped the Agent Toolkit for AWS (GA) on May 6, 2026 — the official, AWS-supported path forward from the community-grade AWS Labs MCP servers, skills, and plugins. Three plugins (aws-core, aws-agents, aws-data-analytics) bundle 30+ curated skills and the AWS MCP Server, now also GA. Plugins ship for Claude Code and Codex out of the box; Kiro and other agents connect through direct MCP server configuration.
This is the fix. But installing it without thinking will burn tokens or open your IAM scope wider than you want. This post is the integration guide for developers already running Claude Code on Bedrock.
Why Claude Code Fails on AWS
The failure modes are consistent:
Outdated knowledge. CloudFormation resource schemas evolve. New condition keys appear. Deprecations happen. A model trained before late 2025 won't know about recent additions like the AWS::BedrockAgentCore resource types or new IAM session-tag conditions. It will confidently write something that parses but doesn't deploy.
Multi-service wiring drift. A real workload involves IAM, VPC, security groups, CloudWatch, and the actual service — five or six resources that must reference each other in exactly the right way. Claude gets the first two right and starts fabricating ARN formats by resource three.
No environment awareness. The model doesn't know your account ID, your VPC endpoint DNS suffixes, or which AZs your subnets are in. Every context injection you forget is a hallucinated placeholder.
The usual answer — more CLAUDE.md context — is a patch, not a solution. What you actually need is live AWS knowledge wired into the tool-call loop.
What's in the Toolkit (and How It Differs from AWS Labs)
Before the toolkit, the ecosystem looked like this:
flowchart TD
    subgraph "Before: AWS Labs MCP era"
        CC1[Claude Code] -->|tool calls| LABS[awslabs MCP servers]
        LABS --> CFN[CloudFormation MCP]
        LABS --> CDK[CDK MCP]
        LABS --> DOCS[Docs MCP]
        CC1 -->|manual context| MD[CLAUDE.md]
    end
AWS Labs MCP servers were community-grade: useful, but inconsistently maintained, with no end-to-end skill evaluation and no official IAM scoping guidance.
The Agent Toolkit formalizes three layers:
flowchart LR
CC[Claude Code] -->|MCP protocol| MCP[AWS MCP Server managed]
CC -->|skill discovery| SKILLS[Curated Skills]
CC -->|installer| PLUGINS[Plugins aws-core aws-agents aws-data-analytics]
MCP --> APIS[300+ AWS APIs]
MCP --> KMCP[Knowledge MCP docs search]
MCP --> SCRIPT[Sandboxed script exec]
SKILLS --> CFN1[CloudFormation patterns]
SKILLS --> SVL[Serverless EDA]
SKILLS --> AGT[AgentCore]
Skills are curated packages of instructions and reference materials — validated CloudFormation patterns, Well-Architected serverless heuristics, CDK idioms. They don't make API calls. They constrain the model toward patterns that actually work, and they're loaded on demand so unused skills don't burn context.
AWS MCP Server is a managed remote endpoint (https://aws-mcp.<region>.api.aws/mcp) reached through the mcp-proxy-for-aws stdio shim. It exposes three things to the agent:
- Full AWS API coverage across 300+ services through one authenticated endpoint
- Sandboxed Python execution for multi-step operations
- Real-time documentation search via the Knowledge MCP (https://knowledge-mcp.global.api.aws), which needs no AWS credentials
Plugins are the delivery mechanism. Three of them at GA:
| Plugin | Coverage |
|---|---|
| aws-core | Service selection, CDK/CloudFormation, serverless, containers, storage, observability, billing, SDK usage, deployment. Start here. |
| aws-agents | Building AI agents on AWS with Amazon Bedrock and AgentCore. |
| aws-data-analytics | Data lake, analytics, ETL with S3 Tables, Glue, Athena. |
The concrete improvement over AWS Labs:
- IAM context keys that distinguish agent actions from human actions — you can write a policy that allows write-actions for the human role but read-only when reached through the MCP server, even if the underlying role permits writes.
- CloudWatch metrics and CloudTrail audit logging on every MCP request — agent activity is observable, not invisible.
- Skills with end-to-end evaluation — not just "someone wrote a markdown file."
AWS Labs continues to accept contributions; over time, the best of it transitions into the toolkit.
Installing in Claude Code
Run these inside a Claude Code session. Only the first two commands are required; the last two install the optional plugins:
# Add the marketplace
/plugin marketplace add aws/agent-toolkit-for-aws

# Install the core plugin (start here)
/plugin install aws-core@agent-toolkit-for-aws

# Optional: agents and analytics plugins
/plugin install aws-agents@agent-toolkit-for-aws
/plugin install aws-data-analytics@agent-toolkit-for-aws
The plugin ships an .mcp.json that registers the AWS MCP Server through uvx mcp-proxy-for-aws@latest. So you need uv installed locally:
# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
For documentation search and skill discovery, no AWS credentials are needed. The Knowledge MCP is unauthenticated. For API calls and sandboxed script execution, you do need credentials configured locally — see my earlier post on credential_process for a pattern that doesn't leave plaintext keys in ~/.aws/credentials.
Codex and Other Agents
The same marketplace works in Codex:
codex plugin marketplace add aws/agent-toolkit-for-aws
For agents without plugin support (Kiro and others), configure the AWS MCP Server directly in the agent's MCP config and install skills from the toolkit repo with npx skills add aws/agent-toolkit-for-aws/skills.
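For agents configured by hand, the entry usually comes down to a few lines of JSON. Here is a minimal Python sketch that builds one. It assumes the agent reads the widely used mcpServers layout and that mcp-proxy-for-aws accepts the endpoint URL as its first argument; the server name aws-mcp and the region are placeholders. Check your agent's MCP documentation and the proxy's README before relying on it.

```python
import json


def build_mcp_config(region: str = "us-east-1") -> dict:
    """Return an MCP client config entry for the AWS MCP Server.

    Assumptions (not confirmed by the material quoted in this post):
    - the agent reads the common {"mcpServers": {...}} layout
    - mcp-proxy-for-aws takes the endpoint URL as a positional argument
    """
    endpoint = f"https://aws-mcp.{region}.api.aws/mcp"
    return {
        "mcpServers": {
            "aws-mcp": {  # hypothetical server name
                "command": "uvx",
                "args": ["mcp-proxy-for-aws@latest", endpoint],
            }
        }
    }


# Print the entry so it can be pasted into the agent's MCP config file.
print(json.dumps(build_mcp_config("us-east-1"), indent=2))
```

Generating the entry rather than typing it keeps the region in one place, which helps if you later point different projects at different regional endpoints.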
Scoping IAM Without Footguns
The biggest mistake I've watched developers make: pointing the AWS MCP Server at credentials with AdministratorAccess. The toolkit's most important IAM feature exists precisely so you don't have to.
The agent-vs-human IAM distinction. Every request the AWS MCP Server forwards on your behalf carries the aws:ViaAWSMCPService context key, which is true when an MCP server makes the call and false when the principal calls AWS directly. You can write IAM policies that gate behavior on that key — read-only on the agent path, full access on the human path — without splitting roles or vending separate credentials.
A minimal policy that denies destructive actions when reached through the MCP server, while leaving the underlying role's permissions untouched for direct human use:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDestructiveActionsViaMCP",
      "Effect": "Deny",
      "Action": [
        "cloudformation:DeleteStack",
        "iam:Delete*",
        "iam:Put*",
        "s3:DeleteBucket",
        "dynamodb:DeleteTable"
      ],
      "Resource": "*",
      "Condition": {
        "Bool": {
          "aws:ViaAWSMCPService": "true"
        }
      }
    }
  ]
}
This is the canonical pattern from the AWS IAM docs — the same condition key drives every "agent vs. human" boundary you might want to enforce.
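To make the agent-versus-human split concrete, here is a deliberately simplified Python model of that Deny statement. It is not how IAM is implemented: it mirrors only the action match and the Bool condition, and the function name is_denied is made up for illustration.

```python
from fnmatch import fnmatchcase

# Action patterns copied from the Deny statement above.
DENIED_ACTIONS = [
    "cloudformation:DeleteStack",
    "iam:Delete*",
    "iam:Put*",
    "s3:DeleteBucket",
    "dynamodb:DeleteTable",
]


def is_denied(action: str, context: dict) -> bool:
    """Simplified model of the statement: action match AND Bool condition.

    Real IAM evaluation has many more rules (other statements, SCPs,
    permission boundaries); none of that is modelled here.
    """
    # IAM action names are case-insensitive, so compare in lower case.
    action_matches = any(
        fnmatchcase(action.lower(), pattern.lower()) for pattern in DENIED_ACTIONS
    )
    # An absent key does not satisfy a Bool condition that expects "true".
    via_mcp = str(context.get("aws:ViaAWSMCPService", "false")).lower() == "true"
    return action_matches and via_mcp


print(is_denied("iam:DeleteRole", {"aws:ViaAWSMCPService": "true"}))   # True
print(is_denied("iam:DeleteRole", {"aws:ViaAWSMCPService": "false"}))  # False
print(is_denied("iam:GetRole", {"aws:ViaAWSMCPService": "true"}))      # False
```

The same role and the same action give a different outcome depending on which path the request took.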
If you want the agent to deploy (not just validate), bound the blast radius by narrowing the resource ARN to a stack-name prefix and constraining the resource types the change set can touch:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AgentDeployBoundedStackNames",
      "Effect": "Allow",
      "Action": [
        "cloudformation:CreateChangeSet",
        "cloudformation:ExecuteChangeSet",
        "cloudformation:DescribeChangeSet"
      ],
      "Resource": "arn:aws:cloudformation:*:*:stack/claude-code-*/*",
      "Condition": {
        "ForAllValues:StringLike": {
          "cloudformation:ResourceTypes": [
            "AWS::Lambda::*",
            "AWS::IAM::Role",
            "AWS::Logs::*",
            "AWS::Events::*",
            "AWS::DynamoDB::Table"
          ]
        }
      }
    }
  ]
}
Two caveats — both important.
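A ForAllValues operator like the one above evaluates to true when the multi-valued key is absent from the request. So if the caller doesn't pass ResourceTypes, this policy imposes no resource-type restriction at all. Wildcards such as AWS::Lambda::* also only match under StringLike; StringEquals compares them literally. To actually enforce the constraint, either require the caller to always pass ResourceTypes (workflow convention or service control policy), or add a Null check to the same Condition block so the Allow only matches when the key is present. Put that check on a statement covering only actions that accept ResourceTypes, such as CreateChangeSet; ExecuteChangeSet and DescribeChangeSet requests don't carry the key and would stop matching: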
"Condition": {
  "Null": {
    "cloudformation:ResourceTypes": "false"
  },
  "ForAllValues:StringLike": {
    "cloudformation:ResourceTypes": [ "AWS::Lambda::*", "..." ]
  }
}
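The absent-key behaviour is easier to see in code. The sketch below is a simplified model rather than IAM itself: it uses wildcard matching (StringLike behaviour), and the names for_all_values_like and allow_matches are made up for illustration.

```python
from fnmatch import fnmatchcase

ALLOWED_TYPES = ["AWS::Lambda::*", "AWS::IAM::Role", "AWS::Logs::*"]


def for_all_values_like(request_values, patterns):
    """ForAllValues semantics: every request value must match some pattern.

    all() over an empty sequence is True, which is exactly why an absent
    or empty key passes the check.
    """
    return all(
        any(fnmatchcase(value, pattern) for pattern in patterns)
        for value in request_values
    )


def allow_matches(context: dict, require_key: bool) -> bool:
    """Simplified model of the Allow statement's Condition block."""
    values = context.get("cloudformation:ResourceTypes")  # None when absent
    if require_key and values is None:
        # Models "Null": {"cloudformation:ResourceTypes": "false"}
        return False
    return for_all_values_like(values or [], ALLOWED_TYPES)


no_key = {}
good = {"cloudformation:ResourceTypes": ["AWS::Lambda::Function"]}
bad = {"cloudformation:ResourceTypes": ["AWS::Lambda::Function", "AWS::EC2::Instance"]}

print(allow_matches(no_key, require_key=False))  # True: no restriction applied
print(allow_matches(no_key, require_key=True))   # False: key must be present
print(allow_matches(good, require_key=True))     # True
print(allow_matches(bad, require_key=True))      # False
```

The first line of output is the footgun: without the Null check, a request that omits ResourceTypes sails through.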
And if you want this Allow to apply only to agent-initiated changes, add an aws:ViaAWSMCPService: true condition — the same context key as the Deny policy above, used here to scope the allow to MCP traffic instead of denying it.
Use short-lived credentials. Don't bake long-lived keys into ~/.aws/credentials. Assume a role and vend session tokens through credential_process — the same pattern works for the MCP server because it just reads from the local credential chain.
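For reference, a credential_process helper only has to print one JSON document to stdout. The Python sketch below shows the formatting step; the function name and the sample values are placeholders, and a real helper would first obtain the credentials from an STS AssumeRole call.

```python
import json


def to_credential_process_json(sts_credentials: dict) -> str:
    """Convert the Credentials object of an STS AssumeRole response into
    the JSON document that credential_process must print to stdout."""
    expiration = sts_credentials["Expiration"]
    # SDKs may return a datetime here; the AWS CLI returns an ISO 8601 string.
    if hasattr(expiration, "isoformat"):
        expiration = expiration.isoformat()
    return json.dumps(
        {
            "Version": 1,
            "AccessKeyId": sts_credentials["AccessKeyId"],
            "SecretAccessKey": sts_credentials["SecretAccessKey"],
            "SessionToken": sts_credentials["SessionToken"],
            "Expiration": expiration,
        }
    )


# Placeholder values for illustration only; never hard-code real credentials.
example = {
    "AccessKeyId": "ASIAEXAMPLE",
    "SecretAccessKey": "example-secret",
    "SessionToken": "example-token",
    "Expiration": "2026-05-06T12:00:00Z",
}
print(to_credential_process_json(example))
```

A profile in ~/.aws/config then points its credential_process setting at the script, and anything that reads the default credential chain picks up the short-lived session.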
Before/After: Building an EventBridge Pipe
Illustrative composite — wiring a DynamoDB stream into SQS via EventBridge Pipes, with the IAM role to make it work. The failure modes below are real ones I've watched models produce; the token figures are rough orders of magnitude, not measurements.
Without the toolkit. Claude tends to produce templates with one or more of:
- DependsOn that creates a circular reference
- DynamoDB table missing StreamSpecification (StreamViewType: NEW_AND_OLD_IMAGES not set), so Pipes has no stream to source from
- A Pipes role trust policy that targets the wrong service principal: events.amazonaws.com instead of pipes.amazonaws.com
- Trust policy missing the aws:SourceArn / aws:SourceAccount confused-deputy conditions
Fixing these takes two or three rounds of correction, on the order of 8K input tokens, before the template deploys cleanly.
With aws-core installed. The relevant skills activate — aws-serverless (which has an EventBridge Pipes reference under orchestration.md and a DynamoDB Streams → Pipes → Lambda pattern in architecture.md), aws-messaging-and-streaming (Pipes service-principal and confused-deputy guidance), and aws-iam (trust policy correctness). The Knowledge MCP can confirm the current AWS::Pipes::Pipe schema if needed. Claude produces a deployable template on the first pass with the correct pipes.amazonaws.com principal and proper aws:SourceArn scoping.
That comes to roughly 3K input tokens in one round.
The gain compounds across a session. A complex multi-service stack (API Gateway → Lambda → DynamoDB → EventBridge → SQS, with the right IAM for each hop) used to be a multi-hour back-and-forth. With skills active, it's one generation plus review.
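Most of the failure modes listed earlier can also be caught mechanically before a deploy. The Python sketch below lints a CloudFormation template that has already been parsed into a dict. It is an illustration, not part of the toolkit: the function name and the pipe_role_id parameter are hypothetical, it assumes every DynamoDB table in the template feeds a pipe, and it does not attempt to detect circular DependsOn chains.

```python
def lint_pipe_template(template: dict, pipe_role_id: str) -> list[str]:
    """Flag a missing stream, a wrong trust principal, and missing
    confused-deputy conditions in a parsed CloudFormation template."""
    problems = []
    resources = template.get("Resources", {})

    # 1. A table used as a Pipes source needs a stream.
    for name, res in resources.items():
        if res.get("Type") == "AWS::DynamoDB::Table":
            if "StreamSpecification" not in res.get("Properties", {}):
                problems.append(f"{name}: StreamSpecification is missing")

    # 2. The Pipes role must trust pipes.amazonaws.com, scoped by source.
    role = resources.get(pipe_role_id, {})
    trust = role.get("Properties", {}).get("AssumeRolePolicyDocument", {})
    for stmt in trust.get("Statement", []):
        service = stmt.get("Principal", {}).get("Service", [])
        services = [service] if isinstance(service, str) else list(service)
        if "pipes.amazonaws.com" not in services:
            problems.append(
                f"{pipe_role_id}: trust principal is {services}, "
                "expected pipes.amazonaws.com"
            )
        # Collect condition keys across all operators (StringEquals, ArnLike, ...).
        condition_keys = {
            key for operator in stmt.get("Condition", {}).values() for key in operator
        }
        for required in ("aws:SourceArn", "aws:SourceAccount"):
            if required not in condition_keys:
                problems.append(f"{pipe_role_id}: trust policy lacks {required}")
    return problems


broken = {
    "Resources": {
        "Table": {"Type": "AWS::DynamoDB::Table", "Properties": {}},
        "PipeRole": {
            "Type": "AWS::IAM::Role",
            "Properties": {
                "AssumeRolePolicyDocument": {
                    "Statement": [
                        {
                            "Effect": "Allow",
                            "Principal": {"Service": "events.amazonaws.com"},
                            "Action": "sts:AssumeRole",
                        }
                    ]
                }
            },
        },
    }
}
for problem in lint_pipe_template(broken, "PipeRole"):
    print(problem)
```

A check like this is a safety net for review, not a replacement for the skills; it only knows the handful of mistakes it was written for.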
A caveat worth measuring on your own workload: the Knowledge MCP isn't free in tokens. Documentation search returns prose, and prose is expensive. For trivial single-resource changes, the search overhead can exceed what it saves. Benchmark before turning it on for every session — the AWS MCP Server emits CloudWatch metrics that make this measurable.
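The break-even arithmetic is simple enough to write down. The Python sketch below reuses the rough 8K and 3K figures from this post and assumes they exclude any documentation search; the 2,000-token search overhead and the single-resource numbers are hypothetical placeholders to swap for your own measurements.

```python
def net_token_savings(baseline_tokens: int, toolkit_tokens: int, search_overhead: int) -> int:
    """Positive: the toolkit saved tokens. Negative: search cost more than it saved."""
    return baseline_tokens - (toolkit_tokens + search_overhead)


# Multi-resource task: rough figures from this post plus a hypothetical search cost.
print(net_token_savings(8_000, 3_000, 2_000))  # 3000

# Trivial single-resource change: all three numbers are hypothetical.
print(net_token_savings(1_000, 800, 2_000))    # -1800
```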
Layering on Top of Your Existing CLAUDE.md
If you've been hand-rolling AWS context into CLAUDE.md, don't throw it out. The toolkit and your custom context are complementary — but split them across two files so you don't leak account topology into the public repo.
CLAUDE.md (committed) holds conventions that anyone working on the project needs: naming patterns, design rules, what the toolkit handles, project-specific overrides.
CLAUDE.local.md (gitignored) holds environment-specific values that shouldn't be in source control: account IDs, VPC IDs, subnet IDs, security group IDs, on-call contacts. Add CLAUDE.local.md and .local.* to .gitignore if they aren't already.
A clean layering pattern:
# CLAUDE.md (committed)

## What the Agent Toolkit Handles
The aws-core plugin is installed. It provides:
- CloudFormation resource schemas (do not override)
- Serverless and EDA patterns
- IAM best practices
- Live documentation search via the Knowledge MCP

## Project Conventions
- Naming convention: {team}-{service}-{env}-{resource}
- Lambda timeout: 30s max (SLA constraint)
- DynamoDB: always use on-demand billing (cost policy)
- Primary region: ap-northeast-1
- For account ID, VPC, and subnets, see CLAUDE.local.md

# CLAUDE.local.md (gitignored, never commit)

## Environment Values
- AWS Account: 123456789012 (production)
- VPC ID: vpc-0a1b2c3d4e5f (do not create new VPCs)
- Private subnets: subnet-aaa, subnet-bbb, subnet-ccc
- Security group for outbound: sg-0f1a2b3c
Skills load automatically. The two CLAUDE.md files fill in what the toolkit can't know. The split keeps your repo shareable while still giving the agent everything it needs in your local checkout.
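The split only helps if environment values never drift back into the committed file. Here is a minimal Python sketch of a check you could run before committing. The patterns are my own approximation of what account and EC2 resource IDs look like, the function name is made up, and a bare 12-digit match will produce false positives on other long numbers.

```python
import re

# Values that belong in CLAUDE.local.md rather than in committed files.
# EC2 IDs carry 8 or 17 hex characters; the range below is deliberately loose.
LEAK_PATTERNS = {
    "account id": re.compile(r"\b\d{12}\b"),
    "vpc id": re.compile(r"\bvpc-[0-9a-f]{8,17}\b"),
    "subnet id": re.compile(r"\bsubnet-[0-9a-f]{8,17}\b"),
    "security group id": re.compile(r"\bsg-[0-9a-f]{8,17}\b"),
}


def find_leaks(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, kind, matched value) for each suspicious token."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in LEAK_PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((lineno, kind, match.group()))
    return hits


committed = (
    "- Primary region: ap-northeast-1\n"
    "- AWS Account: 123456789012\n"
    "- VPC ID: vpc-0a1b2c3d4e5f\n"
)
for lineno, kind, value in find_leaks(committed):
    print(f"line {lineno}: {kind} {value}")
```

Wired into a pre-commit hook against CLAUDE.md, a non-empty result is a signal to move the value into CLAUDE.local.md.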
The toolkit also ships recommended rules files you can drop into a project to tell agents how to use AWS most effectively — for example, querying the MCP server before fabricating a service capability, or discovering an applicable skill before writing code from scratch. Worth borrowing into your own CLAUDE.md even if you don't adopt them wholesale.
A Builder's Perspective: What the Toolkit Got Right
I've been running a similar system for several months — an open-source aws-skills plugin set with aws-cdk, aws-cost-ops, serverless-eda, aws-agentic-ai, and a shared aws-common dependency. The architecture maps closely to what AWS shipped:
- Skills are separate from execution. Injecting expert context into the model is different from making API calls. Keep them distinct so you can use one without the other.
- One plugin per domain, not a monolith. A CDK-heavy project doesn't need the serverless-EDA patterns in context. Loading only what's relevant matters at scale.
- The MCP server is the live signal, not the only signal. Static skills stay valuable even when documentation search is available. Patterns and idioms don't change as fast as service features.
What the official toolkit has that the community version didn't: the IAM context keys are the real differentiator (these need AWS-side support — you can't bolt them on), the MCP server is production-grade and officially supported, and the Knowledge MCP is backed by the actual documentation index, not a scrape.
If you're starting fresh, install aws-core and don't look back. If you've already invested in a custom plugin set like mine, the migration path is to keep your project-specific skills (they capture conventions the toolkit can't know) and let aws-core replace the generic ones.
What's Still Missing
Honest gaps as of May 2026:
No CDK construct-level validation. Skills cover CloudFormation resource schemas. CDK L2/L3 constructs that wrap those resources get less coverage. If you're generating CDK code, still run cdk synth and review the output.
Account topology isn't auto-discovered. The MCP server can run describe calls when asked, but it doesn't automatically prime context with your VPC, subnet, and security group IDs at session start. You still need to put those in CLAUDE.local.md or feed them in early.
Knowledge MCP token cost. Documentation search isn't always cheaper than the model guessing. For simple operations, it adds tokens. Measure before enabling unconditionally.
Plugin support beyond Claude Code and Codex. Plugins are agent-specific. For Kiro and other agents, you configure the MCP server directly and install skills, but you don't get the bundled experience.
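The account-topology gap is the easiest of these to paper over yourself. The Python sketch below turns EC2 describe responses into lines for a local, gitignored context file. The function name is made up and the sample data is invented; with boto3 you would pass in the results of the EC2 client's describe_vpcs() and describe_subnets() calls instead.

```python
def format_topology(vpcs: dict, subnets: dict) -> str:
    """Render EC2 describe_vpcs / describe_subnets responses as Markdown lines."""
    lines = ["## Environment Values"]
    for vpc in vpcs.get("Vpcs", []):
        lines.append(f"- VPC ID: {vpc['VpcId']} ({vpc['CidrBlock']})")
    for subnet in subnets.get("Subnets", []):
        lines.append(
            f"- Subnet: {subnet['SubnetId']} in {subnet['AvailabilityZone']} "
            f"(VPC {subnet['VpcId']})"
        )
    return "\n".join(lines)


# Invented sample data shaped like the EC2 API responses.
sample_vpcs = {"Vpcs": [{"VpcId": "vpc-0a1b2c3d", "CidrBlock": "10.0.0.0/16"}]}
sample_subnets = {
    "Subnets": [
        {
            "SubnetId": "subnet-0123abcd",
            "VpcId": "vpc-0a1b2c3d",
            "AvailabilityZone": "ap-northeast-1a",
        }
    ]
}
print(format_topology(sample_vpcs, sample_subnets))
```

Regenerating the file at the start of a work session keeps the agent's view of the account current without pasting IDs by hand.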
Watch For
The toolkit will move quickly. Things worth tracking:
- More plugins beyond the initial three (security, networking, migration are obvious gaps)
- CDK construct-level skills, not just CloudFormation
- Cross-toolkit composition with AgentCore so agents can call other agents inside the MCP boundary
Summary
The Agent Toolkit for AWS is a meaningful upgrade for Claude Code + AWS workflows. Validated skills give better first-pass accuracy on CloudFormation. The Knowledge MCP keeps the model current with breaking schema changes. The IAM context keys make it safe to give agents real credentials.
Install aws-core. Scope the IAM with the agent-vs-human distinction. Layer it on top of your existing CLAUDE.md rather than replacing it. Benchmark the Knowledge MCP token cost for your specific workload before turning it on for every session.
If you want to see how this fits with a custom plugin set, my aws-skills repo is open source — it predates the official toolkit, the architecture transfers, and the companion post walks through the same workflow with the community-grade version.
Related posts: Build on AWS Faster with Claude Code and AWS Skills · Secure AWS Credentials with credential_process · Claude Code Cost Per Project on AWS