
How to Deploy Production AI Agents Securely: Amazon Bedrock AgentCore Guide

When I started working with organizations on agentic AI deployments over the past year, I expected the conversation to center on which framework to choose or which foundation model performed best. Instead, I discovered that the real barrier wasn’t building agents; it was making them safe enough for production. In a recent architecture review for a financial services organization, the question came immediately: “Can we actually deploy this without creating the next Asana incident?” That question crystallized everything.

⚠️ Disclaimer

The views and opinions expressed in this post are my own and do not necessarily reflect the official policy or position of Amazon Web Services. This content is based on my personal experiences working with AgentCore. For official AWS guidance, please refer to the AWS AgentCore documentation and contact your AWS account team.

The Asana MCP data leakage incident in May 2025 wasn’t just another security story. A vulnerability present for 34 days between May 1 and June 4, 2025, potentially exposed data from roughly 1,000 organizations to users from other tenants due to flawed session isolation in the Model Context Protocol server. Asana took the server offline on June 5 for immediate remediation and restored service on June 17. The bug wasn’t sophisticated; it was a logic flaw in tenant isolation. But it revealed the fundamental challenge: AI agents need session boundaries that traditional application architectures never required.

An Agentic Infrastructure?

Amazon Bedrock AgentCore, which launched in preview on July 16, 2025, and reached general availability in October 2025, represents AWS’s response to these production challenges. The platform provides enterprise-grade infrastructure for deploying AI agents at scale while maintaining the security and compliance requirements that regulated industries demand.

This post examines Amazon Bedrock AgentCore’s architecture from the perspective of someone who deploys these systems with enterprise customers in the European market. If you’re an enterprise architect evaluating agentic AI for regulated environments, this is the production reality you need to understand.

The Production Problem Amazon Bedrock AgentCore Addresses

The gap between building an agent and deploying one reveals challenges that most teams discover too late. I’ve watched organizations spend three months building impressive demos, then another six months solving problems they didn’t know existed. These aren’t feature gaps; they’re fundamental infrastructure challenges that become blockers the moment you move beyond proof of concept.

Session isolation emerged as the primary concern after incidents like Asana’s MCP flaw demonstrated how cross-tenant data leakage can occur when agents lack proper isolation boundaries. Traditional stateless functions terminate after each request, sanitizing everything. AI agents maintain complex contextual state across multiple interactions (conversation history, tool permissions, intermediate computations), creating contamination risks that standard architectures don’t address.

💡 Key Insight

Each user session requires its own dedicated microVM with isolated compute, memory, and filesystem resources, terminated completely after session completion to eliminate cross-session contamination. This is what differentiates AgentCore from Lambda: true session isolation with deterministic security boundaries, supporting workloads up to 8 hours.

Identity complexity represents another production barrier that prototype systems abstract away. A single agent invocation might require OAuth authentication from the user, IAM roles for AWS resources, and API keys for third-party services, all while maintaining proper permission boundaries. I’ve seen teams struggle for weeks implementing secure token management that Amazon Bedrock AgentCore handles as a primitive.

Agentic Workflows Require Good Timing

The agentic workflow duration requirement caught most teams by surprise. Research agents analyzing competitive intelligence or processing regulatory documents can’t complete in Lambda’s 15-minute window. Amazon Bedrock AgentCore (AgentCore from now on) supports ephemeral compute sessions lasting up to 8 hours, enabling multi-step agentic workflows where multiple calls to the same environment build upon previous context. This fundamentally changes what agents can accomplish in production.
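The long-lived session model shows up directly at invocation time: every call carries a runtime session ID, calls that reuse an ID land in the same isolated environment, and different IDs get completely separate microVMs. Here is a minimal sketch of deriving stable per-user session IDs; the helper name is mine, and the 33-character minimum reflects the published API constraint, so verify against current docs:

```python
import hashlib

def session_id_for(user_id: str, conversation_id: str) -> str:
    """Derive a stable runtime session ID for one user conversation.

    AgentCore requires session IDs of at least 33 characters; a SHA-256
    hex digest (64 chars) satisfies that while staying deterministic,
    so repeated calls for the same conversation reuse the same microVM.
    """
    raw = f"{user_id}:{conversation_id}".encode()
    return hashlib.sha256(raw).hexdigest()

# Same conversation -> same session ID (context accumulates in one microVM)
first = session_id_for("alice", "conv-1")
again = session_id_for("alice", "conv-1")

# Different user -> different session ID (complete isolation)
other = session_id_for("bob", "conv-1")
```

Passing this value as the runtime session ID on each invocation gives you session continuity for one user and deterministic isolation between users, for up to the 8-hour session limit.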

Observability for non-deterministic systems represents perhaps the most underestimated challenge. When an agent produces unexpected results, you need to trace not just what happened, but why the foundation model made specific reasoning decisions across potentially dozens of tool invocations. Traditional application monitoring doesn’t capture this level of detail.

The Seven AgentCore Services

AgentCore provides a modular suite of services that can be used independently or together, working with any model inside or outside Amazon Bedrock and any open-source agent framework:

  • Runtime: Secure execution at scale. MicroVM isolation with complete session separation prevents one user’s agent from accessing another’s data.
  • Gateway: Tool integration complexity. Automatically converts APIs and Lambda functions into MCP-compatible tools without managing integrations.
  • Identity: Authentication across systems. OAuth-based identity and access management allows agents to act on behalf of users across systems like GitHub, Slack, or Salesforce.
  • Memory: Context persistence. Fully-managed memory infrastructure with both short-term and long-term storage that can be shared across agents and sessions within the same region.
  • Observability: Debugging agents. Built-in dashboards, debugging, and telemetry tools with support for OpenTelemetry, LangSmith, and Datadog.
  • Browser: Web automation. Managed browser instances with strict isolation in microVMs, each invocation securely torn down and terminated.
  • Code Interpreter: Safe code execution. Ephemeral Linux environment pre-loaded with pandas, NumPy, SciPy, and Matplotlib, supporting up to 25 concurrent sessions per account per region.

The architecture insight that changed how I explain AgentCore to organizations: it isn’t competing with LangChain, CrewAI, or LlamaIndex. AgentCore is the infrastructure that those frameworks run on. Think Kubernetes for AI agents: you bring your framework and model; AgentCore provides the production-grade runtime, security, and operational tooling. This concept should feel familiar if you’ve read my previous post on building web services on AWS, where infrastructure abstraction enables developers to focus on business logic rather than operational complexity.

Production Reality: What AWS Docs Don’t Tell You

Integration with Enterprise Systems

In a recent architecture review for a financial services organization running SAP, the question came immediately: “Can AgentCore connect to our SAP ECC 6.0 system?” The system had been running for 18 years with custom ABAP code and no REST APIs. This is the reality of enterprise integration; most production systems weren’t designed for modern API consumption.

AgentCore Gateway transforms existing APIs and AWS Lambda functions into agent-ready tools, providing unified access across protocols, including MCP. For legacy SAP systems, the pattern that works in production involves Lambda as middleware. The agent invokes a tool through AgentCore Gateway, which routes to a Lambda function that handles SAP RFC calls or BAPI invocations. The Lambda function becomes your translation layer between the agent’s expectations and SAP’s proprietary protocols.

Authentication Flow Blueprint

The production architecture for SAP integration demonstrates how AgentCore services orchestrate secure, multi-layered authentication while maintaining session isolation:

SAP Integration
Figure 1: Production architecture for integrating AgentCore with legacy SAP ECC 6.0 systems using Lambda as a secure translation layer with three-tier authentication (OAuth → IAM → SAP credentials). The Lambda function requires SAP NetWeaver RFC SDK and VPC deployment for on-premises connectivity.

This architecture handles the reality of 18-year-old SAP systems with custom ABAP code and proprietary RFC/BAPI protocols. The Lambda function acts as a translation layer, converting modern MCP tool calls into SAP-native function modules while managing the complexity of credential mapping and error handling.

# AgentCore Gateway tool definition for SAP integration
# (illustrative SDK interface; Gateway targets can also be created via the
# console or the control-plane APIs)
from agentcore import Gateway

sap_order_tool = Gateway.create_tool(
    name="check_order_status",
    description="Retrieve SAP order status using order number",
    lambda_function_arn="arn:aws:lambda:eu-central-1:123456:function:sap-rfc-connector",
    input_schema={
        "type": "object",
        "properties": {
            "order_number": {"type": "string", "description": "SAP sales order number"}
        },
        "required": ["order_number"]
    }
)

🎯 Expert Tip

The Lambda function requires the SAP NetWeaver RFC SDK (via node-rfc or PyRFC), VPC deployment for network connectivity to SAP, and careful error handling for RFC-specific error codes. OAuth token expiration during long-running sessions manifests as tool invocation failures after 60-90 minutes; implement token refresh logic in your Lambda middleware rather than relying on cached credentials.

What actually fails in production: network timeouts between Lambda and on-premises SAP systems, OAuth token refresh during long-running sessions, and SAP-specific error codes that agents need to interpret correctly. I’ve learned to build retry logic into the Lambda functions and provide agents with clear error descriptions rather than raw SAP messages.
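The middleware pattern above can be sketched as a Lambda handler in which the RFC call is injected, so the retry and error-translation logic stands on its own. Treat the details as illustrative: the error-code mapping is hypothetical, and a real deployment would wire `rfc_call` to PyRFC (`pyrfc.Connection`) against the SAP system:

```python
import time

# Illustrative mapping from raw SAP/RFC error codes to agent-friendly text
SAP_ERROR_MESSAGES = {
    "RFC_COMMUNICATION_FAILURE": "SAP system is unreachable; try again later.",
    "ORDER_NOT_FOUND": "No sales order exists with that number.",
}

def translate_sap_error(code: str) -> str:
    """Return a description the agent can act on instead of a raw SAP code."""
    return SAP_ERROR_MESSAGES.get(code, f"Unexpected SAP error: {code}")

def call_with_retry(rfc_call, attempts=3, delay=0.1):
    """Retry transient connection failures before surfacing a clean error."""
    for attempt in range(attempts):
        try:
            return {"ok": True, "data": rfc_call()}
        except ConnectionError:
            if attempt == attempts - 1:
                return {"ok": False,
                        "error": translate_sap_error("RFC_COMMUNICATION_FAILURE")}
            time.sleep(delay)

def handler(event, context=None, rfc_call=None):
    """Gateway routes the tool invocation here; rfc_call wraps the RFC/BAPI call."""
    order_number = event["order_number"]
    if rfc_call is None:
        # In production: pyrfc.Connection(...).call("BAPI_...", SALESDOCUMENT=order_number)
        raise RuntimeError("No RFC connection configured")
    return call_with_retry(rfc_call)

# Simulated RFC call for local testing
result = handler({"order_number": "4711"}, rfc_call=lambda: {"status": "DELIVERED"})
```

The key design choice is that the agent only ever sees the translated error text, never the raw SAP message.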

Security and GDPR for European Markets

When evaluating AgentCore for regulated industries like banking, compliance considerations emerge immediately. The GDPR question always surfaces first: “How do we ensure compliance when agents store conversation history?” This question revealed a critical production consideration that development teams often miss.

AgentCore Runtime provides complete session isolation through dedicated microVMs powered by Firecracker, with each session terminated and memory sanitized after completion. For regulated industries, this isolation model addresses the fundamental security requirement: one organization’s data cannot leak to another, even if the foundation model hallucinates or an agent behaves unexpectedly.

Data residency requirements are straightforward to address. AgentCore is available in the Frankfurt region (eu-central-1), ensuring that data processing and storage remain within Germany for GDPR compliance. The audit trail comes through CloudWatch integration, capturing every agent invocation, tool call, and decision point.

⚠️ Production Reality

The critical gotcha I’ve seen catch multiple organizations: AgentCore Memory supports both short-term event retention and long-term, insight-driven storage, with file uploads currently limited to 250MB per session. By default, long-term memory persists indefinitely; you must configure time-to-live policies to comply with GDPR’s right to erasure. I recommend starting with 90-day retention for short-term memory and explicit deletion workflows for long-term storage.
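The right-to-erasure requirement reduces to two controls: a TTL on short-term events and an explicit deletion sweep for long-term records. Here is a minimal sketch of the sweep logic, assuming you can list stored records with their creation timestamps; the record shape is mine, and the 90-day window mirrors the recommendation above rather than any official default:

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)  # short-term retention window recommended above

def records_to_erase(records, now=None):
    """Return IDs of records older than the retention window.

    Each record is a (record_id, created_at) pair; in production the
    listing would come from the Memory APIs and the returned IDs would
    feed an explicit deletion call.
    """
    now = now or datetime.now(timezone.utc)
    return [rid for rid, created in records if now - created > RETENTION]

now = datetime(2025, 10, 1, tzinfo=timezone.utc)
records = [
    ("evt-1", datetime(2025, 6, 1, tzinfo=timezone.utc)),   # 122 days old -> erase
    ("evt-2", datetime(2025, 9, 20, tzinfo=timezone.utc)),  # 11 days old -> keep
]
expired = records_to_erase(records, now=now)
```

Running a sweep like this on a schedule, plus honoring ad-hoc erasure requests by record ID, covers both the retention policy and the individual right to erasure.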

Cost Reality

When an organization in the Frankfurt region asked about costs, I worked backward from their expected usage: 1,000 conversations daily, averaging 5 messages each, with 3 tool calls per message. Using EU (Frankfurt) regional pricing, the monthly breakdown looked like this:

Runtime charges at $0.0895 per vCPU-hour and $0.00945 per GB-hour, billed per second throughout the session, including boot, initialization, active processing, and idle periods. For their usage pattern with 2 vCPU and 4GB sessions averaging 8 minutes:

  • 💰 Runtime: ~$4,200/month

Gateway charges $0.005 per 1,000 tool API invocations:

  • 🔧 Gateway: ~$225/month (15,000 tool calls daily)

Memory costs $0.25 per 1,000 short-term memory events and $0.50 per 1,000 retrievals:

  • 🧠 Memory: ~$375/month (5,000 events daily)

Observability through CloudWatch added approximately $100/month for their logging and metrics requirements:

  • 📊 Observability: ~$100/month

The total came to roughly $4,900 monthly, well below their initial budget concerns.
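The Runtime line item follows directly from the per-second billing model. A sketch of the per-session arithmetic using the rates quoted above; keep in mind that billed duration includes boot, initialization, and idle periods, so billed time in practice can substantially exceed the minutes of active processing you plan for:

```python
VCPU_RATE = 0.0895    # $ per vCPU-hour (EU Frankfurt rate quoted above)
GB_RATE = 0.00945     # $ per GB-hour

def session_cost(vcpus: float, memory_gb: float, billed_minutes: float) -> float:
    """Cost of one Runtime session, billed per second for its full duration."""
    hours = billed_minutes / 60
    return vcpus * hours * VCPU_RATE + memory_gb * hours * GB_RATE

# 2 vCPU, 4 GB, 8 billed minutes -> roughly $0.029 per session
cost = session_cost(vcpus=2, memory_gb=4, billed_minutes=8)
```

Plugging your own measured billed-minutes (active plus idle) into this formula is the fastest way to sanity-check a monthly estimate before a pilot.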

💰 Cost Analysis

The comparison that clarified the value: building equivalent infrastructure in-house would require a senior engineer (€90K annually = €7,500/month) for three months of development, plus ongoing maintenance. The break-even point hit at three months, with AgentCore becoming increasingly cost-effective as usage scaled. The question isn’t “Can we afford AgentCore?” It’s “Can we afford six months of engineering time building session isolation, memory management, and observability systems that AWS has already solved?”

Decision Framework

After helping multiple organizations evaluate AgentCore, I’ve developed a framework for determining when it makes sense.

When AgentCore Makes Sense

Multi-tenant applications where session isolation is critical. If your agent will serve multiple organizations or business units, AgentCore’s microVM isolation model prevents cross-tenant data contamination without complex engineering work.

Regulated industries with audit requirements. Financial services, healthcare, and public sector organizations need comprehensive logging of agent decisions. AgentCore Observability provides real-time visibility with comprehensive monitoring dashboards powered by Amazon CloudWatch, tracking token usage, latency, session duration, and error rates.

Complex integrations across multiple backend systems. If your agent needs to orchestrate actions across SAP, Salesforce, ServiceNow, and internal databases, AgentCore Gateway’s automatic conversion of APIs and Lambda functions into MCP-compatible tools eliminates months of integration work.

OAuth identity requirements where agents act on behalf of users. AgentCore Identity integrates seamlessly with identity providers such as Amazon Cognito, Microsoft Entra ID, and Okta.

When It Doesn’t

High-frequency, sub-100ms latency requirements. Firecracker microVMs can launch in as little as 125 milliseconds, but cold starts still add latency; typical cold start times range from 300 to 800 ms. If you need real-time responses for simple queries, consider Bedrock Agents or Lambda with Bedrock directly.

Simple automation tasks. If you’re building a chatbot that queries a single database and returns results, Lambda with Bedrock provides better economics. AgentCore’s value emerges with complexity.

Budget constraints below $5K monthly. The consumption-based model means you pay for what you use, but the minimum viable deployment for production workloads typically runs $3-5K monthly once you factor in runtime, memory, and gateway usage.

⚠️ Production Reality

Requirements for complete infrastructure control. AgentCore is a managed service; you trade control for operational simplicity. If you need to modify how session isolation works or implement custom memory backends, you’ll need to build your own infrastructure. Don’t choose AgentCore if you need complete infrastructure control.

The “Start Simple” Approach

AgentCore is free to try until September 16, 2025. I recommend this phased approach based on successful deployments:

  1. Prototype in free tier (2-3 weeks): Build your agent using your preferred framework, deploy to AgentCore Runtime, integrate 2-3 tools through Gateway
  2. Pilot with limited users (100-500, 4-6 weeks): Monitor costs, observability, and user feedback; refine tool integrations
  3. Production rollout (gradual scale): Start with one use case, expand based on ROI; implement memory strategies and optimization
  4. Cost optimization (ongoing): Review CloudWatch metrics to identify expensive operations; implement caching where appropriate

The organizations that succeeded followed this pattern. The ones who struggled tried to migrate entire application portfolios at once without understanding the cost implications.

Regional Deployment Strategy

Before implementing AgentCore, you need to determine your regional deployment strategy. The decision impacts latency, compliance, and operational complexity. This decision framework helps architects evaluate the right approach for their organization:

AgentCore Regional Deployment
Figure 2: Strategic decision framework for AgentCore regional deployment. AgentCore is available in 9 AWS regions as of October 2025 (GA): US East (N. Virginia, Ohio), US West (Oregon), Europe (Frankfurt, Ireland), Asia Pacific (Mumbai, Singapore, Sydney, Tokyo). Start single-region; add multi-region only when disaster recovery justifies the operational complexity. Important: AgentCore Memory is regional with no native cross-region replication—multi-region deployments require custom memory synchronization architecture using DynamoDB Global Tables, Aurora Global Database, or S3 Cross-Region Replication.

For most European organizations I work with, the Frankfurt region provides optimal latency and GDPR compliance. Multi-region deployments add complexity and require custom memory replication strategies. Only pursue this when disaster recovery requirements justify the additional engineering work.

Implementation Quickstart

For architects evaluating AgentCore, here’s the minimal example that demonstrates the core concepts:

# Minimal agent using the AgentCore Python SDK and Strands Agents
# (APIs current as of GA; verify names against the latest docs)
# pip install bedrock-agentcore strands-agents

from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands import Agent, tool

app = BedrockAgentCoreApp()

# Define a simple tool
@tool
def get_weather(location: str) -> str:
    """Get current weather for a location"""
    # In production, call an actual weather API
    return f"Weather in {location}: Sunny, 72°F"

# Create the agent
agent = Agent(
    model="anthropic.claude-sonnet-4-20250514-v1:0",  # model ID is illustrative
    tools=[get_weather],
    system_prompt="You help users check weather conditions."
)

# AgentCore Runtime calls this entrypoint for every invocation
@app.entrypoint
def invoke(payload):
    result = agent(payload.get("prompt", ""))
    return {"result": str(result)}

if __name__ == "__main__":
    app.run()

Once the agent is containerized and deployed (for example, with the agentcore CLI from the starter toolkit), any AWS SDK client can invoke it:

import json
import boto3

client = boto3.client("bedrock-agentcore", region_name="eu-central-1")
response = client.invoke_agent_runtime(
    agentRuntimeArn="arn:aws:bedrock-agentcore:eu-central-1:123456789012:runtime/weather-agent-runtime",
    runtimeSessionId="user-12345-0000000000000000000000",  # minimum 33 characters; one ID per user session
    payload=json.dumps({"prompt": "What's the weather in Munich?"}).encode()
)
print(response["response"].read())

The Dockerfile for deployment is straightforward; note that AgentCore Runtime runs linux/arm64 containers, so build with docker buildx build --platform linux/arm64:

FROM public.ecr.aws/amazonlinux/amazonlinux:2023

RUN yum install -y python3.11 python3.11-pip
WORKDIR /app

COPY requirements.txt .
RUN pip3.11 install -r requirements.txt

COPY agent.py .
CMD ["python3.11", "agent.py"]

This minimal example shows it’s actually simple to get started. The complexity emerges when you add multiple tools, implement proper error handling, and integrate with enterprise identity providers, but AgentCore provides the primitives to solve those problems systematically.

Common Production Issues and Solutions

The most valuable debugging pattern I’ve established: structured logging within agent code that captures decision points, tool selections, and reasoning steps. AgentCore supports OpenTelemetry, enabling integration with Datadog, Dynatrace, or LangSmith for detailed trace analysis.
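The structured-logging pattern is mostly a matter of emitting one JSON event per decision point so traces can be correlated later in CloudWatch or an OpenTelemetry backend. A minimal stdlib sketch; the event fields are illustrative, not a prescribed schema:

```python
import json
import logging

logger = logging.getLogger("agent.trace")

def log_decision(session_id: str, step: str, **fields) -> str:
    """Emit one structured trace event per agent decision point."""
    event = {"session_id": session_id, "step": step, **fields}
    line = json.dumps(event, sort_keys=True)
    logger.info(line)
    return line

# One event per tool selection makes "why" reconstructible after the fact
line = log_decision(
    "user-12345", "tool_selected",
    tool="check_order_status", reason="user asked for order state"
)
```

Because each line is valid JSON keyed by session ID and step, CloudWatch Logs Insights or any trace backend can filter and join the decision history for a single session.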

Common production issues follow predictable patterns. OAuth token expiration during long-running sessions manifests as tool invocation failures after 60-90 minutes. The solution: implement token refresh logic in your Lambda middleware rather than relying on cached credentials. Memory retrieval returning irrelevant context often indicates poorly structured memory insertion; agents need explicit guidance on what information deserves long-term storage versus transient session state.
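The token-expiration failure mode is avoidable with a refresh-ahead cache in the middleware: fetch a new token some margin before the current one expires instead of waiting for a 401. A sketch with an injected clock and fetcher for testability; the 60-90 minute failure window above corresponds to typical one-hour OAuth token lifetimes:

```python
import time

class TokenCache:
    """Refresh-ahead OAuth token cache for long-running agent sessions."""

    def __init__(self, fetch_token, margin_seconds=300, clock=time.time):
        self._fetch = fetch_token        # returns (token, expires_in_seconds)
        self._margin = margin_seconds    # refresh this long before expiry
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Refresh when we are inside the safety margin of expiry
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, expires_in = self._fetch()
            self._expires_at = self._clock() + expires_in
        return self._token

# Fake clock and fetcher to demonstrate the refresh behavior
now = [0.0]

def fake_fetch():
    return f"token-{int(now[0])}", 3600  # one-hour lifetime

cache = TokenCache(fake_fetch, clock=lambda: now[0])
first = cache.get()   # initial fetch at t=0
now[0] = 3000         # 50 minutes later: still outside the 5-minute margin
same = cache.get()    # cached token reused
now[0] = 3400         # inside the margin -> refreshed proactively
fresh = cache.get()
```

The margin means a token is never handed to a tool call with less than five minutes of validity left, which is what prevents the mid-invocation failures described above.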

Cold start latency can still impact the user experience for interactive agents, even though AgentCore launches microVMs in as little as 125 milliseconds. Implementing keep-alive mechanisms that maintain warm sessions during peak usage hours reduces perceived latency for users. This costs more but delivers a better experience for high-priority use cases.
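The keep-alive mechanism is essentially a periodic sweep that re-pings sessions whose last activity exceeds some idle threshold during peak hours. A sketch of the selection logic only; the threshold value is a tunable assumption, and the ping itself would be a lightweight runtime invocation:

```python
WARM_THRESHOLD_SECONDS = 240  # re-ping sessions idle longer than 4 minutes (tunable)

def sessions_to_ping(last_activity, now):
    """Pick warm sessions whose idle time exceeds the threshold.

    last_activity maps session_id -> last-use timestamp (epoch seconds);
    each selected session would receive a lightweight keep-alive invocation
    to keep its microVM warm.
    """
    return sorted(
        sid for sid, ts in last_activity.items()
        if now - ts > WARM_THRESHOLD_SECONDS
    )

# s-1 has been idle for 900 s, s-2 for only 100 s
pings = sessions_to_ping({"s-1": 100.0, "s-2": 900.0}, now=1000.0)
```

Because idle time is billed, restrict the sweep to peak hours and to sessions that justify the cost.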

Looking Forward

Throughout my infrastructure career spanning development to solution architecture, I’ve moved from building enterprise systems to now helping organizations deploy AI at scale. I’ve seen few technology launches as significant as AgentCore, and the timing matters: AgentCore launched just months ago, with relatively few in-depth technical analyses available. This represents a first-mover opportunity for organizations willing to invest in understanding the technology before the market saturates.

The shift AgentCore enables isn’t incremental. It changes the timeline for deploying production agents from six months of infrastructure work to six weeks of focused development. For organizations running SAP and other mission-critical systems, it means they can build agents that safely interact with core business processes without spending a year building security infrastructure. This mirrors the journey I described in my post on building production-ready healthcare NLP infrastructure, where the gap between research and production reveals fundamental engineering challenges that teams must solve systematically.

Key Takeaways

Three key takeaways define AgentCore’s production reality:

First, session isolation isn’t optional for multi-tenant agents. The Asana incident demonstrated what happens when isolation fails. AgentCore’s microVM isolation model provides deterministic security boundaries regardless of agent execution patterns, delivering the predictable security properties that enterprise deployments require.

Second, integration complexity compounds quickly. Every additional backend system adds authentication layers, error handling, and monitoring requirements. AgentCore Gateway’s automatic conversion of APIs and Lambda functions into agent-compatible tools eliminates months of undifferentiated engineering work.

Third, production agents require production infrastructure. Memory management, observability, and identity controls aren’t features you can add later; they’re foundational requirements that determine whether your agent succeeds or creates security incidents.

AgentCore isn’t for everyone. If you’re building simple automation or need sub-second latency, alternatives exist. But if you’re deploying agents in regulated industries with complex system integration requirements, AgentCore provides infrastructure that would take a year to build in-house.

The question for enterprise architects isn’t whether agentic AI will transform their operations; it’s whether they’ll build the infrastructure themselves or leverage managed services that solve these problems systematically. Based on what I’ve seen in production deployments across European markets, AgentCore represents the practical path from prototype to production.


Resources


Learning & Community

  • AgentCore Deep Dive Workshop – Comprehensive hands-on workshop with production scenarios, security patterns, and enterprise integration labs
  • AWS Discord Community – Join #bedrock-agentcore for community support and direct access to AWS AgentCore engineers


Abraham Arellano Tavara
Senior Strategic Solutions Architect, AWS Munich
Connect on LinkedIn