Xiaozhi ESP32 MCP Gateway with Amazon Bedrock AgentCore

Overview

The Xiaozhi hardware is an impressive ESP32-based AI voice assistant capable of offline wake-up, multi-language support, and cloud connectivity. But what if you want your Xiaozhi device to access multiple AI tools, APIs, and services without managing complex integrations on the hardware side? This is where Amazon Bedrock AgentCore Gateway shines as a unified aggregation layer for Model Context Protocol (MCP) servers.

In this guide, I'll walk you through building a distributed MCP architecture that connects Xiaozhi hardware to multiple cloud services through a single WebSocket connection. AgentCore Gateway aggregates the tools, which range from a simple calculator to RESTful APIs such as real-time football data.

The Challenge: Connecting Edge Devices to Multiple AI Tools

Xiaozhi hardware excels at voice interaction and local control, but extending its capabilities to access dozens of cloud services presents several challenges:

  1. Connection Management: Each MCP server requires its own connection, protocol handling, and authentication
  2. Resource Constraints: ESP32 devices have limited memory and processing power for managing multiple connections
  3. API Key Security: Storing numerous API keys on edge devices poses security risks
  4. Scalability: Adding new tools requires firmware updates and device reconfiguration

The solution? Use a gateway pattern to aggregate all MCP servers into a single endpoint that your Xiaozhi device can access through one WebSocket connection.

Architecture Overview

Our architecture consists of five key components working together:

graph TD
    A[Xiaozhi ESP32 Hardware] -->|WebSocket| B[MCP Pipe Bridge]
    B -->|stdio| C[MCP Proxy for AWS]
    C -->|HTTPS + SigV4| D[AgentCore Gateway]
    D -->|Proxies| E1[Calculator MCP Server]
    D -->|Proxies| E2[Football API OpenAPI]
    D -->|Proxies| E3[Other MCP Tools]

    style A fill:#E37222,stroke:#d66820,stroke-width:3px,color:#fff
    style B fill:#4a5568,stroke:#3d4555,stroke-width:2px,color:#fff
    style C fill:#4a5568,stroke:#3d4555,stroke-width:2px,color:#fff
    style D fill:#232F3E,stroke:#00A4A6,stroke-width:3px,color:#fff
    style E1 fill:#3d4555,stroke:#545B64,stroke-width:2px,color:#fff
    style E2 fill:#3d4555,stroke:#545B64,stroke-width:2px,color:#fff
    style E3 fill:#3d4555,stroke:#545B64,stroke-width:2px,color:#fff

Component Breakdown:

  1. Xiaozhi Hardware: ESP32-based voice assistant that connects via WebSocket to the MCP endpoint
  2. MCP Pipe Bridge: A Python-based bidirectional bridge that translates between WebSocket and stdio protocols, managing multiple MCP server processes
  3. MCP Proxy for AWS: A specialized proxy that translates MCP stdio protocol to AWS HTTP/SSE with SigV4 authentication
  4. Amazon Bedrock AgentCore Gateway: Managed service that aggregates multiple MCP targets into a unified interface with IAM-based authentication
  5. Gateway Targets: Individual MCP servers (local tools) and OpenAPI endpoints (RESTful APIs)

Implementation Guide

Part 1: Setting Up the MCP Pipe Bridge

The MCP Pipe Bridge is the crucial component that connects your Xiaozhi hardware to the cloud-based gateway. It manages WebSocket connections and spawns MCP server processes.

Configuration File

Create mcp_config.json to define your MCP servers:

{
  "mcpServers": {
    "aws-proxy-gateway": {
      "type": "stdio",
      "command": "uvx",
      "args": [
        "mcp-proxy-for-aws@latest",
        "https://YOUR-GATEWAY-ID.gateway.bedrock-agentcore.us-west-2.amazonaws.com/mcp",
        "--region",
        "us-west-2",
        "--log-level",
        "DEBUG",
        "--service",
        "bedrock-agentcore"
      ]
    }
  }
}
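To make the role of this file concrete, here is a minimal sketch of how a bridge might turn the configuration into the argument list used to spawn each server. The helper name `build_commands` and the shortened example config are illustrative assumptions, not part of the original mcp_pipe.py.

```python
import json


def build_commands(config_text: str) -> dict:
    """Map each stdio server name to the argv list used to spawn it."""
    config = json.loads(config_text)
    commands = {}
    for name, server in config.get("mcpServers", {}).items():
        if server.get("type", "stdio") != "stdio":
            continue  # this sketch only handles stdio servers
        commands[name] = [server["command"], *server.get("args", [])]
    return commands


# Shortened copy of the config above, used only for illustration
EXAMPLE_CONFIG = """
{
  "mcpServers": {
    "aws-proxy-gateway": {
      "type": "stdio",
      "command": "uvx",
      "args": [
        "mcp-proxy-for-aws@latest",
        "https://YOUR-GATEWAY-ID.gateway.bedrock-agentcore.us-west-2.amazonaws.com/mcp",
        "--region",
        "us-west-2"
      ]
    }
  }
}
"""

if __name__ == "__main__":
    for server_name, argv in build_commands(EXAMPLE_CONFIG).items():
        print(server_name, "->", " ".join(argv))
```

Keeping the command and arguments in JSON means a new server can be added by editing the config rather than the bridge code.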

The Bridge Implementation

The MCP Pipe (mcp_pipe.py) handles bidirectional communication between WebSocket and stdio:

import asyncio
from typing import Dict

import websockets  # used when connecting to self.websocket_url (connection loop not shown)


class MCPPipe:
    def __init__(self, websocket_url: str, config: dict):
        self.websocket_url = websocket_url
        self.config = config
        self.processes: Dict[str, asyncio.subprocess.Process] = {}

    async def start_server(self, name: str, server_config: dict):
        """Launch an MCP server process"""
        # asyncio subprocesses expose non-blocking pipes, so reads never
        # stall the event loop. stderr is inherited so server logs reach
        # the console and an unread pipe cannot fill up and block the child.
        process = await asyncio.create_subprocess_exec(
            server_config['command'], *server_config.get('args', []),
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
        )
        self.processes[name] = process
        return process

    async def bridge_messages(self, websocket, process):
        """Bidirectional message forwarding"""
        async def ws_to_stdio():
            async for message in websocket:
                # Forward WebSocket messages to process stdin
                if isinstance(message, bytes):
                    message = message.decode('utf-8')
                process.stdin.write((message + '\n').encode('utf-8'))
                await process.stdin.drain()

        async def stdio_to_ws():
            while True:
                # Read from process stdout and send to WebSocket
                line = await process.stdout.readline()
                if not line:  # EOF: the server process has exited
                    break
                await websocket.send(line.decode('utf-8').strip())

        await asyncio.gather(ws_to_stdio(), stdio_to_ws())

Key Features:

  • Auto-reconnection: Exponential backoff (1s → 600s max) for resilience
  • Multi-server management: Spawns and monitors multiple child processes
  • Bidirectional streaming: Real-time message forwarding in both directions
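The excerpt above does not show the reconnection logic, so here is a minimal sketch of exponential backoff using the 1s to 600s range mentioned in the feature list. The names `backoff_delays` and `run_with_reconnect`, the doubling factor, and the attempt limit are assumptions for illustration; the real bridge may differ.

```python
import asyncio


def backoff_delays(initial: float = 1.0, maximum: float = 600.0, factor: float = 2.0):
    """Yield delays that grow by `factor` each attempt and cap at `maximum` seconds."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, maximum)


async def run_with_reconnect(connect_once, max_attempts: int, sleep=asyncio.sleep) -> bool:
    """Retry `connect_once()` with backoff; return True once a session ends cleanly."""
    delays = backoff_delays()
    for _ in range(max_attempts):
        try:
            await connect_once()
            return True
        except Exception as exc:  # a real bridge would catch narrower errors
            delay = next(delays)
            print(f"connection failed ({exc}); retrying in {delay:.0f}s")
            await sleep(delay)
    return False
```

Here `connect_once` stands for a coroutine that opens the WebSocket and runs the bridge until the connection drops. Passing `sleep` as a parameter keeps the loop testable without real waiting.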

Docker Deployment

For production deployment, use Docker with systemd auto-start:

FROM python:3.13-slim

# Install uv and uvx by copying the binaries from the official uv image.
# The slim base image ships without curl, and the install script would
# place uv under /root, outside the non-root user's PATH.
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Copy application files
WORKDIR /app
COPY requirements.txt mcp_pipe.py mcp_config.json ./
RUN pip install --no-cache-dir -r requirements.txt

# Run as non-root user
RUN useradd -m -u 1000 mcpuser
USER mcpuser

CMD ["python", "mcp_pipe.py"]

Docker Compose (docker-compose.yml):

services:
  mcp-pipe:
    build: .
    restart: always
    network_mode: "host"  # Access EC2 instance metadata
    env_file: .env
    environment:
      - AWS_REGION=us-west-2
      - MCP_ENDPOINT=wss://api.xiaozhi.me/mcp/?token=${XIAOZHI_TOKEN}
    volumes:
      - ./mcp_config.json:/app/mcp_config.json:ro
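Since the endpoint URL carries the Xiaozhi token, the bridge should validate it at startup and avoid printing it in logs. The sketch below shows one way to do that. The function names `load_endpoint` and `redact_token` are hypothetical, and it assumes the token is passed in a `token` query parameter as in the Compose file above.

```python
import os
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def load_endpoint(env=os.environ) -> str:
    """Read MCP_ENDPOINT and fail fast if it is missing or not a WebSocket URL."""
    endpoint = env.get("MCP_ENDPOINT", "")
    if not endpoint.startswith(("ws://", "wss://")):
        raise ValueError("MCP_ENDPOINT must be a ws:// or wss:// URL")
    return endpoint


def redact_token(url: str) -> str:
    """Return the URL with the `token` query value masked, for safe logging."""
    parts = urlsplit(url)
    query = [
        (key, "***" if key == "token" else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query, safe="*")))


if __name__ == "__main__":
    example = {"MCP_ENDPOINT": "wss://api.xiaozhi.me/mcp/?token=example-token"}
    print("Connecting to", redact_token(load_endpoint(example)))
```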

Part 2: Deploying Amazon Bedrock AgentCore Gateway

Amazon Bedrock AgentCore Gateway provides the aggregation layer that combines multiple MCP targets into a single endpoint.

Gateway Creation

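The gateway can be created through the AWS Console or the CLI. The key configuration includes:

# Create gateway (control-plane commands live under bedrock-agentcore-control)
aws bedrock-agentcore-control create-gateway \
  --name xiaozhi-gateway \
  --description "MCP aggregation gateway for Xiaozhi hardware" \
  --region us-west-2
# The command also expects a service role, protocol type, and authorizer settings;
# run `aws bedrock-agentcore-control create-gateway help` for the required flags

# Note the gateway identifier returned (example format)
# Example output: YOUR-GATEWAY-ID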

IAM Permissions

The gateway requires specific IAM permissions to access credential providers and secrets:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GetWorkloadAccessToken",
      "Effect": "Allow",
      "Action": ["bedrock-agentcore:GetWorkloadAccessToken"],
      "Resource": "*"
    },
    {
      "Sid": "GetResourceApiKey",
      "Effect": "Allow",
      "Action": ["bedrock-agentcore:GetResourceApiKey"],
      "Resource": "*"
    },
    {
      "Sid": "GetCredentials",
      "Effect": "Allow",
      "Action": ["secretsmanager:GetSecretValue"],
      "Resource": ["arn:aws:secretsmanager:*:*:secret:bedrock-agentcore-identity!*"]
    }
  ]
}

Connecting via MCP Proxy for AWS

The mcp-proxy-for-aws package handles authentication and protocol translation:

# Install and run the proxy
uvx mcp-proxy-for-aws@latest \
  https://YOUR-GATEWAY-ID.gateway.bedrock-agentcore.us-west-2.amazonaws.com/mcp \
  --region us-west-2 \
  --service bedrock-agentcore \
  --log-level DEBUG

The proxy automatically:

  • Uses EC2 instance profile or local AWS credentials
  • Signs requests with AWS Signature Version 4 (SigV4)
  • Translates MCP stdio protocol to HTTP/SSE
  • Handles streaming responses from the gateway
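To give a feel for what SigV4 signing involves, the sketch below derives the signing key from a secret key, date, region, and service using the standard HMAC-SHA256 chain. It is illustrative only: the proxy relies on the AWS SDK for this, and the credential values shown are placeholders rather than anything from the original setup.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Single HMAC-SHA256 step used throughout the SigV4 key derivation."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key: secret -> date -> region -> service -> aws4_request."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def credential_scope(date_stamp: str, region: str, service: str) -> str:
    """Scope string that accompanies the signature in the Authorization header."""
    return f"{date_stamp}/{region}/{service}/aws4_request"


if __name__ == "__main__":
    # Placeholder values for illustration; never hard-code real credentials
    key = derive_signing_key("EXAMPLE-SECRET", "20250101", "us-west-2", "bedrock-agentcore")
    print(credential_scope("20250101", "us-west-2", "bedrock-agentcore"), key.hex())
```

Because the key is scoped to a date, region, and service, the `--region` and `--service` flags passed to the proxy must match the gateway, or signature verification will fail.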

Part 3: Adding Gateway Targets

Now let's add actual functionality by configuring gateway targets.

Example 1: Local Calculator MCP Server

A simple calculator tool demonstrates local MCP server integration:

from fastmcp import FastMCP
import math
import ast
import operator

mcp = FastMCP("Calculator")

# Safe operators for mathematical expressions
SAFE_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval_math(expression: str) -> float:
    """
    Safely evaluate mathematical expressions using AST parsing.
    Only allows basic arithmetic operations and whitelisted math functions.
    """
    tree = ast.parse(expression, mode='eval')

    safe_funcs = {
        'sqrt': math.sqrt,
        'sin': math.sin,
        'cos': math.cos,
        'tan': math.tan,
        'log': math.log,
        'abs': abs,
    }

    def apply_operator(op, *operands):
        # Reject operators outside the whitelist with a clear error
        func = SAFE_OPERATORS.get(type(op))
        if func is None:
            raise ValueError("Operator not allowed")
        return func(*operands)

    def eval_node(node):
        # ast.Num was removed in Python 3.12; numeric literals are ast.Constant
        if isinstance(node, ast.Constant):
            if isinstance(node.value, bool) or not isinstance(node.value, (int, float)):
                raise ValueError("Only numeric constants are allowed")
            return node.value
        elif isinstance(node, ast.BinOp):
            left = eval_node(node.left)
            right = eval_node(node.right)
            return apply_operator(node.op, left, right)
        elif isinstance(node, ast.UnaryOp):
            operand = eval_node(node.operand)
            return apply_operator(node.op, operand)
        elif isinstance(node, ast.Call):
            # Keyword arguments are rejected so only positional math calls pass
            if isinstance(node.func, ast.Name) and not node.keywords:
                func_name = node.func.id
                if func_name in safe_funcs:
                    args = [eval_node(arg) for arg in node.args]
                    return safe_funcs[func_name](*args)
            raise ValueError("Function not allowed")
        else:
            raise ValueError("Unsupported operation")

    return eval_node(tree.body)

@mcp.tool()
def calculator(expression: str) -> dict:
    """
    Safely evaluate mathematical expressions.
    Supports: +, -, *, /, **, and math functions (sqrt, sin, cos, tan, log, abs)

    Examples:
    - "25 * 17"
    - "sqrt(144)"
    - "2 ** 10"
    """
    try:
        result = safe_eval_math(expression)
        return {"success": True, "result": result}
    except Exception as e:
        return {"success": False, "error": str(e)}

if __name__ == "__main__":
    mcp.run(transport="stdio")

Security Note: This implementation uses AST (Abstract Syntax Tree) parsing to safely evaluate mathematical expressions without the security risks of arbitrary code execution. It only permits whitelisted operations and functions, preventing code injection attacks.
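One gap is worth flagging: whitelisting AST nodes blocks code injection but does not bound computation cost, so an expression such as "9 ** 9 ** 9" could still tie up the process. A possible guard, sketched below, caps the exponent before delegating to operator.pow. The name `bounded_pow` and the limit of 1000 are assumptions chosen for illustration, and the function would replace operator.pow in the SAFE_OPERATORS table.

```python
import operator

MAX_EXPONENT = 1000  # assumed limit; tune to the largest power you expect to need


def bounded_pow(base, exponent):
    """Exponentiation that refuses exponents large enough to exhaust CPU or memory."""
    if abs(exponent) > MAX_EXPONENT:
        raise ValueError(f"Exponent magnitude exceeds {MAX_EXPONENT}")
    return operator.pow(base, exponent)


if __name__ == "__main__":
    print(bounded_pow(2, 10))
    try:
        bounded_pow(9, 9 ** 9)
    except ValueError as exc:
        print("rejected:", exc)
```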

Usage Flow:

  1. User asks Xiaozhi: "What is 25 times 17?"
  2. Request flows: Xiaozhi → WebSocket → MCP Pipe → Calculator
  3. Calculator safely evaluates: safe_eval_math("25 * 17") → 425
  4. Response returns through the chain
  5. Xiaozhi responds: "The result is 425"
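On the wire, step 2 is an MCP `tools/call` request in JSON-RPC 2.0 form. The sketch below builds such a message for the calculator tool. The helper name `make_tool_call` and the ID counter are illustrative, and a gateway may expose the tool under a different (for example, prefixed) name than the plain `calculator` assumed here.

```python
import itertools
import json

_request_ids = itertools.count(1)  # JSON-RPC requires a unique id per request


def make_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


if __name__ == "__main__":
    print(make_tool_call("calculator", {"expression": "25 * 17"}))
```

The MCP Pipe forwards this string unchanged, one message per line, to the server's stdin.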

Example 2: Football API via OpenAPI Target

For external APIs, use OpenAPI targets with credential providers.

Step 1: Create Credential Provider

# Create provider for API key storage
aws bedrock-agentcore-control create-api-key-credential-provider \
  --name FootballAPICredentialProvider \
  --description "RapidAPI Football API Key"

# Store the API key
aws bedrock-agentcore-control update-api-key-credential-provider \
  --name FootballAPICredentialProvider \
  --api-key "YOUR_RAPIDAPI_KEY"

Step 2: Define OpenAPI Schema

Create football-api-openapi.yaml with essential league IDs embedded in descriptions to reduce API calls:

openapi: 3.0.3
info:
  title: Football API
  version: 1.0.0
  description: |
    Access live football data including fixtures, standings, and statistics.

    **Common League IDs** (use directly to avoid extra API calls):
    - Premier League: 39
    - La Liga: 140
    - Bundesliga: 78
    - Serie A: 135
    - Champions League: 2
    - Europa League: 3

servers:
  - url: https://api-football-v1.p.rapidapi.com/v3

paths:
  /standings:
    get:
      operationId: getStandings
      summary: Get league standings
      parameters:
        - name: league
          in: query
          required: true
          schema:
            type: integer
          description: League ID (e.g., 39 for Premier League)
        - name: season
          in: query
          required: true
          schema:
            type: integer
          description: Season year (e.g., 2025)
      responses:
        '200':
          description: Standings data

  /fixtures:
    get:
      operationId: getFixtures
      summary: Get match fixtures
      parameters:
        - name: league
          in: query
          schema:
            type: integer
        - name: season
          in: query
          required: true
          schema:
            type: integer
        - name: date
          in: query
          schema:
            type: string
          description: Date in YYYY-MM-DD format
      responses:
        '200':
          description: Fixtures data

Step 3: Configure Gateway Target

The gateway target configuration links the OpenAPI schema with credential injection:

{
  "gatewayIdentifier": "YOUR-GATEWAY-ID",
  "name": "FootballAPITarget",
  "targetConfiguration": {
    "mcp": {
      "openApiSchema": {
        "s3": {
          "uri": "s3://your-bucket/football-api-openapi.yaml"
        }
      }
    }
  },
  "credentialProviderConfigurations": [{
    "credentialProviderType": "API_KEY",
    "credentialProvider": {
      "apiKeyCredentialProvider": {
        "providerArn": "arn:aws:bedrock-agentcore:us-west-2:YOUR-ACCOUNT-ID:token-vault/default/apikeycredentialprovider/FootballAPICredentialProvider",
        "credentialLocation": "HEADER",
        "credentialParameterName": "x-rapidapi-key"
      }
    }
  }]
}

How It Works:

  1. LLM recognizes "Premier League" and league ID 39 from schema description
  2. Generates request: getStandings({ league: 39, season: 2025 })
  3. Gateway retrieves API key from credential provider
  4. Injects the key into the request as the x-rapidapi-key header, as configured in the target
  5. Proxies to: https://api-football-v1.p.rapidapi.com/v3/standings?league=39&season=2025
  6. Returns response through the chain to Xiaozhi
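Steps 4 and 5 amount to combining the server URL and path from the OpenAPI schema with the tool arguments and the injected credential. The sketch below mimics that assembly to show what the outbound request looks like. It is not the gateway's actual implementation, and the function name `build_outbound_request` is hypothetical.

```python
from urllib.parse import urlencode

# Server URL taken from the OpenAPI schema above
BASE_URL = "https://api-football-v1.p.rapidapi.com/v3"


def build_outbound_request(path: str, query: dict, api_key: str):
    """Return the URL and headers for a proxied call with the API key injected."""
    url = f"{BASE_URL}{path}?{urlencode(query)}"
    headers = {"x-rapidapi-key": api_key}  # header name from credentialParameterName
    return url, headers


if __name__ == "__main__":
    url, headers = build_outbound_request(
        "/standings", {"league": 39, "season": 2025}, "YOUR_RAPIDAPI_KEY"
    )
    print(url)
    print(sorted(headers))
```

The key never appears in the tool arguments, so neither the LLM nor the Xiaozhi device ever sees it.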

Embedding common league IDs in the schema description lets the LLM skip a separate league-lookup request, so a standings query for a listed league takes one API call instead of two (a 50% reduction).

Request Flow Walkthrough

Let's trace a complete request from voice query to spoken response:

sequenceDiagram
    participant U as User
    participant X as Xiaozhi Hardware
    participant W as WebSocket
    participant P as MCP Pipe
    participant M as MCP Proxy
    participant G as AgentCore Gateway
    participant T as Target (API/Tool)

    U->>X: "Show Premier League standings"
    X->>W: JSON-RPC Request
    W->>P: WebSocket Message
    P->>M: stdio Message
    M->>G: HTTPS + SigV4 Auth
    G->>T: Proxied Request + API Key
    T-->>G: API Response
    G-->>M: JSON Response
    M-->>P: stdio Response
    P-->>W: WebSocket Message
    W-->>X: JSON-RPC Response
    X-->>U: Speaks standings

Step-by-Step:

  1. User speaks: "Show me Premier League standings"
  2. Xiaozhi processes: Converts speech to text, sends to LLM
  3. LLM determines tool: Recognizes need for getStandings with league ID 39
  4. Request propagates: Xiaozhi → WebSocket → MCP Pipe → MCP Proxy → Gateway
  5. Gateway routes: Identifies Football API target, retrieves API key
  6. API call: Gateway proxies request to RapidAPI with authentication
  7. Response flows back: API → Gateway → MCP Proxy → MCP Pipe → WebSocket → Xiaozhi
  8. Xiaozhi speaks: "Here are the Premier League standings: Manchester City is first with 65 points..."
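For step 7, the message that travels back is a JSON-RPC response whose result carries a list of content items. The sketch below pulls the text out of such a response, following the general MCP tool-result shape. The function name `extract_text` and the sample payload are illustrative assumptions, not output captured from a real gateway.

```python
import json


def extract_text(response_json: str) -> str:
    """Return the concatenated text content of an MCP tools/call response."""
    message = json.loads(response_json)
    if "error" in message:
        # Protocol-level failure (unknown method, bad params, and so on)
        raise RuntimeError(message["error"].get("message", "unknown JSON-RPC error"))
    result = message["result"]
    texts = [
        item["text"]
        for item in result.get("content", [])
        if item.get("type") == "text"
    ]
    if result.get("isError"):
        # Tool-level failure reported inside a successful JSON-RPC response
        raise RuntimeError("tool reported an error: " + " ".join(texts))
    return "\n".join(texts)


if __name__ == "__main__":
    sample = json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "result": {"content": [{"type": "text", "text": "standings payload"}], "isError": False},
    })
    print(extract_text(sample))
```

Distinguishing protocol errors from tool errors helps with troubleshooting: the former usually point at the bridge or gateway, the latter at the target API.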

Use Cases and Benefits

This distributed MCP architecture enables powerful use cases:

Use Cases

  1. Voice-Controlled Calculations: "What's the square root of 12,345?"
  2. Real-Time Sports Data: "When is Manchester United's next match?"
  3. Smart Home Integration: Control devices through natural language
  4. Personal Productivity: "Schedule a meeting for tomorrow at 3 PM"
  5. Information Retrieval: "What's the weather forecast for Tokyo?"

Key Benefits

1. Unified Interface

  • Single WebSocket connection for all tools
  • No complex client-side integration logic
  • Consistent error handling and retry mechanisms

2. Scalability

  • Add new tools without firmware updates
  • Independent scaling of gateway and targets
  • Parallel request processing

3. Security

  • API keys stored in AWS Secrets Manager
  • IAM-based authentication and authorization
  • No sensitive data on edge devices
  • Safe execution environments for tools

4. Resilience

  • Automatic reconnection with exponential backoff
  • Target-level health monitoring
  • Graceful degradation if individual tools fail

5. Cost Optimization

  • Embedding common IDs in the schema avoids lookup calls (one request instead of two for listed leagues)
  • Shared credential providers across gateways
  • Pay-per-use pricing for gateway requests

Troubleshooting Tips

When issues arise, check these common areas:

  • Connection Issues: Verify WebSocket connectivity, authentication tokens, and network firewall rules
  • Gateway Errors: Ensure IAM permissions are correctly configured on the gateway service role
  • Tool Failures: Validate input schemas, check API rate limits, and review credential provider settings
  • Logs: Use docker compose logs -f mcp-pipe to monitor the bridge process in real-time

Conclusion

By combining Xiaozhi hardware, MCP Pipe bridge, and Amazon Bedrock AgentCore Gateway, you've built a robust distributed AI architecture that brings cloud-scale capabilities to edge devices. This pattern demonstrates how Model Context Protocol can unify diverse tools and APIs into a single, manageable interface.

The architecture is extensible: you can add more MCP servers, integrate additional APIs through OpenAPI targets, or even connect multiple Xiaozhi devices to the same gateway. The gateway pattern provides a clean separation of concerns: hardware focuses on voice interaction, the bridge handles connectivity, and the gateway manages tool aggregation and authentication.

As MCP adoption grows, this architecture positions you to leverage new tools and services as they become available, all without touching your edge device firmware.
