Xiaozhi ESP32 MCP Gateway with Amazon Bedrock AgentCore
Overview
The Xiaozhi hardware is an impressive ESP32-based AI voice assistant capable of offline wake-up, multi-language support, and cloud connectivity. But what if you want your Xiaozhi device to access multiple AI tools, APIs, and services without managing complex integrations on the hardware side? This is where Amazon Bedrock AgentCore Gateway shines as a unified aggregation layer for Model Context Protocol (MCP) servers.
In this guide, I'll walk you through building a distributed MCP architecture that connects Xiaozhi hardware to multiple cloud services through a single WebSocket connection, leveraging AgentCore Gateway to aggregate tools ranging from simple calculators to complex RESTful APIs like real-time football data.
The Challenge: Connecting Edge Devices to Multiple AI Tools
Xiaozhi hardware excels at voice interaction and local control, but extending its capabilities to access dozens of cloud services presents several challenges:
- Connection Management: Each MCP server requires its own connection, protocol handling, and authentication
- Resource Constraints: ESP32 devices have limited memory and processing power for managing multiple connections
- API Key Security: Storing numerous API keys on edge devices poses security risks
- Scalability: Adding new tools requires firmware updates and device reconfiguration
The solution? Use a gateway pattern to aggregate all MCP servers into a single endpoint that your Xiaozhi device can access through one WebSocket connection.
Architecture Overview
Our architecture consists of five key components working together:
graph TD
A[Xiaozhi ESP32 Hardware] -->|WebSocket| B[MCP Pipe Bridge]
B -->|stdio| C[MCP Proxy for AWS]
C -->|HTTPS + SigV4| D[AgentCore Gateway]
D -->|Proxies| E1[Calculator MCP Server]
D -->|Proxies| E2[Football API OpenAPI]
D -->|Proxies| E3[Other MCP Tools]
style A fill:#E37222,stroke:#d66820,stroke-width:3px,color:#fff
style B fill:#4a5568,stroke:#3d4555,stroke-width:2px,color:#fff
style C fill:#4a5568,stroke:#3d4555,stroke-width:2px,color:#fff
style D fill:#232F3E,stroke:#00A4A6,stroke-width:3px,color:#fff
style E1 fill:#3d4555,stroke:#545B64,stroke-width:2px,color:#fff
style E2 fill:#3d4555,stroke:#545B64,stroke-width:2px,color:#fff
style E3 fill:#3d4555,stroke:#545B64,stroke-width:2px,color:#fff
Component Breakdown:
- Xiaozhi Hardware: ESP32-based voice assistant that connects via WebSocket to the MCP endpoint
- MCP Pipe Bridge: A Python-based bidirectional bridge that translates between WebSocket and stdio protocols, managing multiple MCP server processes
- MCP Proxy for AWS: A specialized proxy that translates MCP stdio protocol to AWS HTTP/SSE with SigV4 authentication
- Amazon Bedrock AgentCore Gateway: Managed service that aggregates multiple MCP targets into a unified interface with IAM-based authentication
- Gateway Targets: Individual MCP servers (local tools) and OpenAPI endpoints (RESTful APIs)
Implementation Guide
Part 1: Setting Up the MCP Pipe Bridge
The MCP Pipe Bridge is the crucial component that connects your Xiaozhi hardware to the cloud-based gateway. It manages WebSocket connections and spawns MCP server processes.
Configuration File
Create mcp_config.json to define your MCP servers:
{
  "mcpServers": {
    "aws-proxy-gateway": {
      "type": "stdio",
      "command": "uvx",
      "args": [
        "mcp-proxy-for-aws@latest",
        "https://YOUR-GATEWAY-ID.gateway.bedrock-agentcore.us-west-2.amazonaws.com/mcp",
        "--region",
        "us-west-2",
        "--log-level",
        "DEBUG",
        "--service",
        "bedrock-agentcore"
      ]
    }
  }
}
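As a rough sketch of how a bridge might turn one `mcpServers` entry into a process command line, consider the helper below. The function name `build_command` and the trimmed inline config are illustrative assumptions, not code taken from the actual mcp_pipe.py.

```python
import json


def build_command(config: dict, server_name: str) -> list[str]:
    """Return the argv list for one stdio server entry under mcpServers."""
    servers = config.get("mcpServers", {})
    if server_name not in servers:
        raise KeyError(f"No MCP server named {server_name!r} in config")
    entry = servers[server_name]
    # This sketch only handles stdio servers, the type used in this guide.
    if entry.get("type", "stdio") != "stdio":
        raise ValueError(f"Only stdio servers are handled here, got {entry['type']!r}")
    return [entry["command"], *entry.get("args", [])]


# Trimmed inline copy of the config so the example runs without a file on disk.
EXAMPLE_CONFIG = json.loads("""
{
  "mcpServers": {
    "aws-proxy-gateway": {
      "type": "stdio",
      "command": "uvx",
      "args": [
        "mcp-proxy-for-aws@latest",
        "https://YOUR-GATEWAY-ID.gateway.bedrock-agentcore.us-west-2.amazonaws.com/mcp",
        "--region",
        "us-west-2"
      ]
    }
  }
}
""")

if __name__ == "__main__":
    print(build_command(EXAMPLE_CONFIG, "aws-proxy-gateway"))
```

Keeping the command and its arguments as a list (rather than one shell string) avoids shell quoting issues when the bridge spawns the process.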
The Bridge Implementation
The MCP Pipe (mcp_pipe.py) handles bidirectional communication between WebSocket and stdio:
import asyncio
from typing import Dict

import websockets


class MCPPipe:
    def __init__(self, websocket_url: str, config: dict):
        self.websocket_url = websocket_url
        self.config = config
        self.processes: Dict[str, asyncio.subprocess.Process] = {}

    async def start_server(self, name: str, server_config: dict):
        """Launch an MCP server as an asyncio subprocess so pipe I/O never blocks the event loop"""
        process = await asyncio.create_subprocess_exec(
            server_config['command'], *server_config.get('args', []),
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
            # stderr is inherited: an unread stderr pipe can fill up and stall the child
        )
        self.processes[name] = process
        return process

    async def bridge_messages(self, websocket, process):
        """Bidirectional message forwarding"""
        async def ws_to_stdio():
            async for message in websocket:
                # Forward WebSocket messages to process stdin, one message per line
                if isinstance(message, bytes):
                    message = message.decode('utf-8')
                process.stdin.write((message + '\n').encode('utf-8'))
                await process.stdin.drain()

        async def stdio_to_ws():
            while True:
                # Read a line from process stdout without blocking the event loop
                line = await process.stdout.readline()
                if not line:
                    break  # EOF: the server process has exited
                await websocket.send(line.decode('utf-8').strip())

        await asyncio.gather(ws_to_stdio(), stdio_to_ws())
Key Features:
- Auto-reconnection: Exponential backoff (1s → 600s max) for resilience
- Multi-server management: Spawns and monitors multiple child processes
- Bidirectional streaming: Real-time message forwarding in both directions
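The reconnection behaviour is not shown in the excerpt above. A minimal sketch of a capped exponential backoff schedule could look like the following; the 1s start and 600s cap come from the feature list, while the doubling factor, the jitter, and both function names are assumptions made for illustration.

```python
import random


def backoff_delays(initial: float = 1.0, maximum: float = 600.0, factor: float = 2.0):
    """Yield reconnect delays: initial, initial*factor, ... capped at maximum."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def with_jitter(delay: float, spread: float = 0.1) -> float:
    """Randomize a delay by +/- spread so many bridges do not reconnect in lockstep."""
    return delay * random.uniform(1.0 - spread, 1.0 + spread)


if __name__ == "__main__":
    schedule = backoff_delays()
    # First twelve delays: doubling from 1s until the 600s cap is reached.
    print([next(schedule) for _ in range(12)])
```

A bridge would typically sleep for `with_jitter(next(schedule))` after each failed connection attempt and create a fresh generator once a connection succeeds.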
Docker Deployment
For production deployment, use Docker with systemd auto-start:
FROM python:3.13-slim

# Install uv (provides the uvx command). The slim image ships without curl,
# so install from PyPI instead of piping the install script.
RUN pip install --no-cache-dir uv

# Copy application files
WORKDIR /app
COPY requirements.txt mcp_pipe.py mcp_config.json ./
RUN pip install --no-cache-dir -r requirements.txt

# Run as non-root user
RUN useradd -m -u 1000 mcpuser
USER mcpuser

CMD ["python", "mcp_pipe.py"]
Docker Compose (docker-compose.yml):
services:
  mcp-pipe:
    build: .
    restart: always
    network_mode: "host"  # Access EC2 instance metadata
    env_file: .env
    environment:
      - AWS_REGION=us-west-2
      - MCP_ENDPOINT=wss://api.xiaozhi.me/mcp/?token=${XIAOZHI_TOKEN}
    volumes:
      - ./mcp_config.json:/app/mcp_config.json:ro
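Because `MCP_ENDPOINT` carries the Xiaozhi token as a query parameter, a bridge that logs the endpoint URL would also write the token to the container logs. The sketch below shows one way to mask it before logging; the helper name `redact_token` and the `REDACTED` placeholder are assumptions, and nothing here is taken from the actual mcp_pipe.py.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def redact_token(endpoint: str) -> str:
    """Return the endpoint URL with the value of any `token` query parameter masked."""
    parts = urlsplit(endpoint)
    query = [
        (key, "REDACTED" if key == "token" else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(redact_token("wss://api.xiaozhi.me/mcp/?token=example-token"))
```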
Part 2: Deploying Amazon Bedrock AgentCore Gateway
Amazon Bedrock AgentCore Gateway provides the aggregation layer that combines multiple MCP targets into a single endpoint.
Gateway Creation
The gateway can be created through AWS Console or CLI. The key configuration includes:
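# Create gateway (control-plane commands live under bedrock-agentcore-control,
# as with the credential provider commands later in this guide)
# The role ARN is a placeholder; --role-arn, --protocol-type, and --authorizer-type
# are assumed here from the CreateGateway API, so confirm them with
# `aws bedrock-agentcore-control create-gateway help`
aws bedrock-agentcore-control create-gateway \
  --name xiaozhi-gateway \
  --description "MCP aggregation gateway for Xiaozhi hardware" \
  --role-arn arn:aws:iam::YOUR-ACCOUNT-ID:role/YOUR-GATEWAY-SERVICE-ROLE \
  --protocol-type MCP \
  --authorizer-type AWS_IAM \
  --region us-west-2

# Note the gateway identifier returned (example format)
# Example output: YOUR-GATEWAY-ID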
IAM Permissions
The gateway requires specific IAM permissions to access credential providers and secrets:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GetWorkloadAccessToken",
      "Effect": "Allow",
      "Action": ["bedrock-agentcore:GetWorkloadAccessToken"],
      "Resource": "*"
    },
    {
      "Sid": "GetResourceApiKey",
      "Effect": "Allow",
      "Action": ["bedrock-agentcore:GetResourceApiKey"],
      "Resource": "*"
    },
    {
      "Sid": "GetCredentials",
      "Effect": "Allow",
      "Action": ["secretsmanager:GetSecretValue"],
      "Resource": ["arn:aws:secretsmanager:*:*:secret:bedrock-agentcore-identity!*"]
    }
  ]
}
Connecting via MCP Proxy for AWS
The mcp-proxy-for-aws package handles authentication and protocol translation:
# Install and run the proxy
uvx mcp-proxy-for-aws@latest \
  https://YOUR-GATEWAY-ID.gateway.bedrock-agentcore.us-west-2.amazonaws.com/mcp \
  --region us-west-2 \
  --service bedrock-agentcore \
  --log-level DEBUG
The proxy automatically:
- Uses EC2 instance profile or local AWS credentials
- Signs requests with AWS Signature Version 4 (SigV4)
- Translates MCP stdio protocol to HTTP/SSE
- Handles streaming responses from the gateway
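To make the SigV4 step less of a black box, here is a minimal standard-library sketch of how a SigV4 signing key and credential scope are derived. It illustrates only the key-derivation part of the algorithm and is not how the proxy is implemented (the proxy presumably relies on the AWS SDK for signing). The function names and the fake secret key are assumptions for illustration.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step of the SigV4 key-derivation chain."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the signing key by chaining HMACs over date, region, service, and 'aws4_request'."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def credential_scope(date_stamp: str, region: str, service: str) -> str:
    """Build the scope string that appears in the Authorization header."""
    return f"{date_stamp}/{region}/{service}/aws4_request"


if __name__ == "__main__":
    # Fake secret for demonstration only; never hard-code real credentials.
    key = derive_signing_key("FAKE-SECRET-KEY", "20250101", "us-west-2", "bedrock-agentcore")
    print(len(key), credential_scope("20250101", "us-west-2", "bedrock-agentcore"))
```

The scope ties each signature to one day, one region, and one service, which is why the proxy needs both `--region` and `--service` to match the gateway endpoint.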
Part 3: Adding Gateway Targets
Now let's add actual functionality by configuring gateway targets.
Example 1: Local Calculator MCP Server
A simple calculator tool demonstrates local MCP server integration:
from fastmcp import FastMCP
import math
import ast
import operator

mcp = FastMCP("Calculator")

# Safe operators for mathematical expressions
SAFE_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval_math(expression: str) -> float:
    """
    Safely evaluate mathematical expressions using AST parsing.
    Only allows basic arithmetic operations and whitelisted math functions.
    """
    tree = ast.parse(expression, mode='eval')

    safe_funcs = {
        'sqrt': math.sqrt,
        'sin': math.sin,
        'cos': math.cos,
        'tan': math.tan,
        'log': math.log,
        'abs': abs,
    }

    def eval_node(node):
        # ast.Num is deprecated; numeric literals are ast.Constant nodes
        if isinstance(node, ast.Constant):
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
            raise ValueError("Only numeric constants are allowed")
        elif isinstance(node, ast.BinOp):
            left = eval_node(node.left)
            right = eval_node(node.right)
            return SAFE_OPERATORS[type(node.op)](left, right)
        elif isinstance(node, ast.UnaryOp):
            operand = eval_node(node.operand)
            return SAFE_OPERATORS[type(node.op)](operand)
        elif isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):
                func_name = node.func.id
                if func_name in safe_funcs:
                    args = [eval_node(arg) for arg in node.args]
                    return safe_funcs[func_name](*args)
            raise ValueError("Function not allowed")
        else:
            raise ValueError("Unsupported operation")

    return eval_node(tree.body)

@mcp.tool()
def calculator(expression: str) -> dict:
    """
    Safely evaluate mathematical expressions.
    Supports: +, -, *, /, **, and math functions (sqrt, sin, cos, tan, log, abs)

    Examples:
    - "25 * 17"
    - "sqrt(144)"
    - "2 ** 10"
    """
    try:
        result = safe_eval_math(expression)
        return {"success": True, "result": result}
    except Exception as e:
        return {"success": False, "error": str(e)}

if __name__ == "__main__":
    mcp.run(transport="stdio")
Security Note: This implementation uses AST (Abstract Syntax Tree) parsing to safely evaluate mathematical expressions without the security risks of arbitrary code execution. It only permits whitelisted operations and functions, preventing code injection attacks.
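One gap worth noting: the whitelist stops code injection, but an unbounded power operator still lets an expression such as "9 ** 9 ** 9" tie up the process computing an enormous integer. A possible guard, sketched below, is a bounded replacement for `operator.pow`. The name `bounded_pow` and both limits are assumptions chosen for illustration, not part of the original sample.

```python
import operator

# Assumed limits; tune them for the expressions you expect from voice queries.
MAX_EXPONENT = 1000
MAX_BASE = 10**6


def bounded_pow(base, exponent):
    """Power operator that rejects operands large enough to stall the process."""
    if abs(exponent) > MAX_EXPONENT or abs(base) > MAX_BASE:
        raise ValueError("Operand too large for exponentiation")
    return operator.pow(base, exponent)


if __name__ == "__main__":
    print(bounded_pow(2, 10))
```

To use it, the `ast.Pow` entry in `SAFE_OPERATORS` would point at `bounded_pow` instead of `operator.pow`. Since `**` is right-associative, "9 ** 9 ** 9" first evaluates the inner `9 ** 9` and then fails the exponent check on the outer call.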
Usage Flow:
- User asks Xiaozhi: "What is 25 times 17?"
- Request flows: Xiaozhi → WebSocket → MCP Pipe → Calculator
- Calculator safely evaluates: safe_eval_math("25 * 17") → 425
- Response returns through the chain
- Xiaozhi responds: "The result is 425"
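On the wire, the tool invocation in this flow is an MCP `tools/call` request in JSON-RPC 2.0 form. The sketch below builds such a message for the calculator tool; the helper name `make_tool_call` and the request ID are illustrative assumptions, and the exact payload Xiaozhi sends may carry additional fields.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single-line JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # Compact separators keep the message on one line, as stdio transport expects.
    return json.dumps(message, separators=(",", ":"))


if __name__ == "__main__":
    print(make_tool_call(1, "calculator", {"expression": "25 * 17"}))
```

The single-line form matters because the bridge forwards each WebSocket message to the server's stdin as one newline-terminated line.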
Example 2: Football API via OpenAPI Target
For external APIs, use OpenAPI targets with credential providers.
Step 1: Create Credential Provider
# Create the provider and store the API key (the create call takes the key itself)
aws bedrock-agentcore-control create-api-key-credential-provider \
  --name FootballAPICredentialProvider \
  --api-key "YOUR_RAPIDAPI_KEY"

# Rotate the stored API key later
aws bedrock-agentcore-control update-api-key-credential-provider \
  --name FootballAPICredentialProvider \
  --api-key "YOUR_RAPIDAPI_KEY"
Step 2: Define OpenAPI Schema
Create football-api-openapi.yaml with essential league IDs embedded in descriptions to reduce API calls:
openapi: 3.0.3
info:
  title: Football API
  version: 1.0.0
  description: |
    Access live football data including fixtures, standings, and statistics.

    **Common League IDs** (use directly to avoid extra API calls):
    - Premier League: 39
    - La Liga: 140
    - Bundesliga: 78
    - Serie A: 135
    - Champions League: 2
    - Europa League: 3

servers:
  - url: https://api-football-v1.p.rapidapi.com/v3

paths:
  /standings:
    get:
      operationId: getStandings
      summary: Get league standings
      parameters:
        - name: league
          in: query
          required: true
          schema:
            type: integer
          description: League ID (e.g., 39 for Premier League)
        - name: season
          in: query
          required: true
          schema:
            type: integer
          description: Season year (e.g., 2025)
      responses:
        '200':
          description: Standings data

  /fixtures:
    get:
      operationId: getFixtures
      summary: Get match fixtures
      parameters:
        - name: league
          in: query
          schema:
            type: integer
        - name: season
          in: query
          required: true
          schema:
            type: integer
        - name: date
          in: query
          schema:
            type: string
          description: Date in YYYY-MM-DD format
      responses:
        '200':
          description: Fixtures data
Step 3: Configure Gateway Target
The gateway target configuration links the OpenAPI schema with credential injection:
{
  "gatewayIdentifier": "YOUR-GATEWAY-ID",
  "name": "FootballAPITarget",
  "targetConfiguration": {
    "mcp": {
      "openApiSchema": {
        "s3": {
          "uri": "s3://your-bucket/football-api-openapi.yaml"
        }
      }
    }
  },
  "credentialProviderConfigurations": [{
    "credentialProviderType": "API_KEY",
    "credentialProvider": {
      "apiKeyCredentialProvider": {
        "providerArn": "arn:aws:bedrock-agentcore:us-west-2:YOUR-ACCOUNT-ID:token-vault/default/apikeycredentialprovider/FootballAPICredentialProvider",
        "credentialLocation": "HEADER",
        "credentialParameterName": "x-rapidapi-key"
      }
    }
  }]
}
How It Works:
- LLM recognizes "Premier League" and league ID 39 from schema description
- Generates request: getStandings({ league: 39, season: 2025 })
- Gateway retrieves API key from credential provider
- Injects headers: x-rapidapi-key and x-rapidapi-host
- Proxies to: https://api-football-v1.p.rapidapi.com/v3/standings?league=39&season=2025
- Returns response through the chain to Xiaozhi
Embedding common league IDs directly in the schema description lets the LLM skip a separate league-lookup request, so a query about a listed league needs one API call instead of two (a 50% reduction).
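The proxied request in the steps above can be approximated with a few lines of standard-library code. This is only a sketch of the resulting URL and headers, not the gateway's internal logic; the helper name `build_request` is hypothetical, and deriving `x-rapidapi-host` from the server URL is an assumption.

```python
from urllib.parse import urlencode, urlsplit

# Taken from the `servers` entry of the OpenAPI schema.
BASE_URL = "https://api-football-v1.p.rapidapi.com/v3"


def build_request(path: str, params: dict, api_key: str) -> tuple[str, dict]:
    """Return the upstream URL and headers for one tool call."""
    query = urlencode(params)
    url = f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"
    headers = {
        # Header name matches credentialParameterName in the target configuration.
        "x-rapidapi-key": api_key,
        # Assumed to be the host portion of the server URL.
        "x-rapidapi-host": urlsplit(BASE_URL).netloc,
    }
    return url, headers


if __name__ == "__main__":
    url, headers = build_request("/standings", {"league": 39, "season": 2025}, "YOUR_RAPIDAPI_KEY")
    print(url)
    print(sorted(headers))
```

The key point is that the API key is attached on the gateway side, so it never appears in the tool call that the device or the LLM sees.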
Request Flow Walkthrough
Let's trace a complete request from voice query to spoken response:
sequenceDiagram
participant U as User
participant X as Xiaozhi Hardware
participant W as WebSocket
participant P as MCP Pipe
participant M as MCP Proxy
participant G as AgentCore Gateway
participant T as Target (API/Tool)
U->>X: "Show Premier League standings"
X->>W: JSON-RPC Request
W->>P: WebSocket Message
P->>M: stdio Message
M->>G: HTTPS + SigV4 Auth
G->>T: Proxied Request + API Key
T-->>G: API Response
G-->>M: JSON Response
M-->>P: stdio Response
P-->>W: WebSocket Message
W-->>X: JSON-RPC Response
X-->>U: Speaks standings
Step-by-Step:
- User speaks: "Show me Premier League standings"
- Xiaozhi processes: Converts speech to text, sends to LLM
- LLM determines tool: Recognizes need for getStandings with league ID 39
- Request propagates: Xiaozhi → WebSocket → MCP Pipe → MCP Proxy → Gateway
- Gateway routes: Identifies Football API target, retrieves API key
- API call: Gateway proxies request to RapidAPI with authentication
- Response flows back: API → Gateway → MCP Proxy → MCP Pipe → WebSocket → Xiaozhi
- Xiaozhi speaks: "Here are the Premier League standings: Manchester City is first with 65 points..."
Use Cases and Benefits
This distributed MCP architecture enables powerful use cases:
Use Cases
- Voice-Controlled Calculations: "What's the square root of 12,345?"
- Real-Time Sports Data: "When is Manchester United's next match?"
- Smart Home Integration: Control devices through natural language
- Personal Productivity: "Schedule a meeting for tomorrow at 3 PM"
- Information Retrieval: "What's the weather forecast for Tokyo?"
Key Benefits
1. Unified Interface
- Single WebSocket connection for all tools
- No complex client-side integration logic
- Consistent error handling and retry mechanisms
2. Scalability
- Add new tools without firmware updates
- Independent scaling of gateway and targets
- Parallel request processing
3. Security
- API keys stored in AWS Secrets Manager
- IAM-based authentication and authorization
- No sensitive data on edge devices
- Safe execution environments for tools
4. Resilience
- Automatic reconnection with exponential backoff
- Target-level health monitoring
- Graceful degradation if individual tools fail
5. Cost Optimization
- Embedding league IDs in the schema removes the lookup call for listed leagues, cutting those queries from two API calls to one
- Shared credential providers across gateways
- Pay-per-use pricing for gateway requests
Troubleshooting Tips
When issues arise, check these common areas:
- Connection Issues: Verify WebSocket connectivity, authentication tokens, and network firewall rules
- Gateway Errors: Ensure IAM permissions are correctly configured on the gateway service role
- Tool Failures: Validate input schemas, check API rate limits, and review credential provider settings
- Logs: Use docker compose logs -f mcp-pipe to monitor the bridge process in real time
Conclusion
By combining Xiaozhi hardware, MCP Pipe bridge, and Amazon Bedrock AgentCore Gateway, you've built a robust distributed AI architecture that brings cloud-scale capabilities to edge devices. This pattern demonstrates how Model Context Protocol can unify diverse tools and APIs into a single, manageable interface.
The architecture is extensible—you can add more MCP servers, integrate additional APIs through OpenAPI targets, or even connect multiple Xiaozhi devices to the same gateway. The gateway pattern provides a clean separation of concerns: hardware focuses on voice interaction, the bridge handles connectivity, and the gateway manages tool aggregation and authentication.
As MCP adoption grows, this architecture positions you to leverage new tools and services as they become available, all without touching your edge device firmware.
Resources
- Xiaozhi ESP32 Hardware Repository - Open-source AI voice assistant hardware based on ESP32
- MCP Calculator Sample with Pipe & Docker - Example implementation of MCP Pipe bridge and Docker deployment for Xiaozhi
- AgentCore Gateway Football API Target - CDK implementation for deploying Football API as an OpenAPI target to AgentCore Gateway
- MCP Proxy for AWS - Official AWS proxy for connecting MCP clients to Amazon Bedrock AgentCore Gateway with SigV4 authentication
- Amazon Bedrock AgentCore Gateway Quick Start - Official AWS documentation for getting started with Amazon Bedrock AgentCore Gateway
- Model Context Protocol Specification - Complete MCP protocol specification and documentation