Three days. That’s all it took to go from building a deep research agent to creating sophisticated AWS Strands multi-server agents orchestrating three specialized MCP servers. Using SAP Generative AI Hub with Amazon Bedrock, we built a financial intelligence demo that cuts analysis time by roughly 30%. Not because the technology is simple, but because the right architectural decisions, such as combining AWS Strands agent orchestration with enterprise platforms, make complex scenarios tractable.
When Michelle Mei-Li Pfister and I sat down to create our Devtoberfest session on building multi-tool research agents, we wanted to showcase something real. Not a toy example destined to gather dust. We wanted to demonstrate how SAP’s Generative AI Hub integration with AWS Bedrock, combined with the AWS Strands SDK, enables enterprise developers to build production-grade agentic systems, fast.
The SAP Generative AI Hub in SAP AI Core provides enterprise-grade access to leading foundation models. By integrating with Amazon Bedrock to deliver models like Anthropic’s Claude 3.5 and Amazon Titan through a unified interface, it centrally enforces content filtering, SAP-specific risk mitigation, and safety guardrails across the SAP ecosystem.
Financial Analysis Time Reduction: Our demo system can reduce comprehensive financial analysis and report generation across multiple systems by approximately 30%, with individual stock analysis showing 10-20% efficiency gains.
Orchestration Efficiency: AWS Strands SDK’s automatic metrics tracking enables monitoring of token usage, latency, tool execution times, and success rates—essential for production readiness assessment.
The real story isn’t just the tech stack. It’s what happens when you combine proven enterprise platforms with modern agent orchestration patterns.
Watch the Full Session
Special thanks to Michelle Mei-Li Pfister for co-creating this session, and to Nora von Thenen for championing this work and making this Devtoberfest session possible.
The Foundation: Deep Research with Tavily and AWS Strands
Our first notebook demonstrates a research agent that can search, extract, crawl, and synthesize information from the web. By leveraging the Tavily API for web intelligence and AWS Strands Agents SDK, we created an agent that orchestrates multiple capabilities:
```python
deep_researcher_agent = Agent(
    model=bedrock_model,
    system_prompt=SYSTEM_PROMPT,
    tools=[
        web_search,
        web_extract,
        web_crawl,
        format_research_response,
    ],
)
```

AWS Strands takes a model-driven approach to building AI agents. Rather than defining complex workflows, you provide a model, system prompt, and tools. The framework embraces state-of-the-art model capabilities to plan, chain thoughts, call tools, and reflect. This shifts complexity from code into the LLM’s weights.
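To make the model-driven loop concrete, here is a minimal stand-in in plain Python. This is not the Strands internals; `fake_model`, the tool bodies, and the loop structure are all illustrative stand-ins showing how the model, rather than hardcoded workflow logic, decides which tool runs next:

```python
# Illustrative sketch of a model-driven agent loop (NOT Strands internals).
# The "model" picks a tool by name; the loop executes it and feeds the
# result back until the model decides to produce the final answer.

def web_search(query: str) -> str:
    # Stand-in for the real Tavily-backed search tool.
    return f"results for: {query}"

def format_research_response(text: str) -> str:
    # Stand-in for the final formatting tool.
    return f"REPORT: {text}"

TOOLS = {
    "web_search": web_search,
    "format_research_response": format_research_response,
}

def fake_model(history):
    """Pretend LLM: plans one search, then formats the answer."""
    if not history:
        return "web_search", "SAP GenAI Hub"
    return "format_research_response", history[-1][1]

def run_agent(max_cycles: int = 5) -> str:
    history = []
    for _ in range(max_cycles):          # the "event loop cycles" Strands tracks
        name, arg = fake_model(history)  # model chooses the next tool
        result = TOOLS[name](arg)        # framework executes it
        history.append((name, result))   # result goes back to the model
        if name == "format_research_response":
            return result
    return "budget exhausted"

print(run_agent())
```

The point of the sketch: no branching logic encodes the research strategy. Swap in a different model and the same loop produces different tool sequences.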
Multiple teams at AWS use Strands for production AI agents, including Amazon Q Developer, AWS Glue, and VPC Reachability Analyzer. When internal AWS teams trust a framework for production workloads, that’s a signal worth noting.
The Research Agent Architecture
The system prompt guides behavior without constraining it. Like the architectural patterns in building web services on AWS, we define clear interfaces while allowing flexibility in implementation.
What makes this interesting? The agent reasons about which tools to use and when. It adapts its research strategy based on what it discovers. This emergent behavior comes from the model’s reasoning capabilities, not from explicit programming.
AWS Strands Built-In Observability
One of AWS Strands’ production-ready features is comprehensive observability using OpenTelemetry standards. The framework automatically tracks:
| Metric Category | What It Tracks | Production Value |
|---|---|---|
| Token Usage | Input tokens, output tokens, and total token consumption | Cost optimization and budget control |
| Performance Metrics | Latency and execution time measurements | SLA compliance monitoring |
| Tool Usage | Call counts, success rates, and execution times | Reliability and failure detection |
| Event Loop Cycles | Number of reasoning cycles and their durations | Agent efficiency optimization |
This telemetry integrates with AWS X-Ray for distributed tracing and Amazon CloudWatch for real-time monitoring. You can set up CloudWatch alarms on metrics like tool error rates or latency per agent call to alert operations teams of anomalies.
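As a rough mental model of what this telemetry captures, here is a hand-rolled per-tool metrics sketch. This is not the Strands or OpenTelemetry API; the class and field names are hypothetical, chosen to mirror the metric categories in the table above:

```python
from dataclasses import dataclass

@dataclass
class ToolMetrics:
    """Per-tool counters of the kind Strands reports automatically."""
    calls: int = 0
    failures: int = 0
    total_latency_ms: float = 0.0

    def record(self, latency_ms: float, ok: bool) -> None:
        # One entry per tool invocation: count, outcome, and latency.
        self.calls += 1
        if not ok:
            self.failures += 1
        self.total_latency_ms += latency_ms

    @property
    def error_rate(self) -> float:
        # The kind of ratio you would alarm on in CloudWatch.
        return self.failures / self.calls if self.calls else 0.0

m = ToolMetrics()
m.record(120.0, ok=True)
m.record(300.0, ok=False)
print(m.error_rate)         # fraction of failed calls
print(m.total_latency_ms)   # cumulative latency across calls
```

In production you would not hand-roll this; Strands emits these measurements automatically, and the sketch only shows what "tool error rate" and "latency per agent call" mean as alarm inputs.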
The Innovation: Multi-Server Financial Intelligence
After building the research agent, we had a working foundation. But financial analysis requires coordinating multiple specialized systems. That’s where the Model Context Protocol (MCP) enters the picture.
Architecture in Action: Four-Layer Intelligence

The diagram above illustrates our production-ready architecture. Enterprise users connect through an AWS Strands Agent powered by SAP GenAI Hub and Anthropic Claude, which orchestrates requests through the AWS Strands MCP Client. The MCP Session Manager maintains persistent connections to three specialized financial MCP servers (real-time data, document analysis, and risk analytics) before delivering comprehensive outputs, including investment reports, risk matrices, and sentiment analysis, to end users.
Understanding MCP: The Missing Standard
Anthropic open-sourced the Model Context Protocol in November 2024 to address a fundamental challenge: connecting AI assistants to data systems. Before MCP, each data source required custom integration, creating an “N×M” problem: every new model needed a connector to every data source.
MCP provides a universal standard. One protocol, any model, any data source. Major AI providers including OpenAI and Google DeepMind adopted it within months of the announcement.
Just as USB-C standardized device connectivity, MCP standardizes AI-data integration. Rather than building N×M custom connectors, you implement the protocol once and connect to any MCP-compatible system.
MCP uses a client-server architecture built on JSON-RPC 2.0. Servers expose three primitives: Tools (executable functions), Resources (structured data), and Prompts (instruction templates). Clients coordinate access to these capabilities on behalf of AI agents.
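On the wire, this is plain JSON-RPC 2.0. The `tools/call` method name comes from the MCP specification; the tool name and arguments below are hypothetical examples in the spirit of our demo’s financial data server:

```python
import json

# A JSON-RPC 2.0 request asking an MCP server to execute a tool.
# "tools/call" is the MCP-specified method; "get_stock_quote" and its
# arguments are hypothetical demo values.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_stock_quote",
        "arguments": {"symbol": "SAP"},
    },
}
payload = json.dumps(request)

# The server replies with a result (or error) carrying the same id.
# MCP tool results wrap their payload in a list of typed content blocks.
response = json.loads(
    '{"jsonrpc": "2.0", "id": 1,'
    ' "result": {"content": [{"type": "text", "text": "SAP: 210.55"}]}}'
)
assert response["id"] == request["id"]  # correlate reply to request
print(response["result"]["content"][0]["text"])
```

Any MCP client that can build this envelope can talk to any MCP server, which is exactly what dissolves the N×M problem.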
Three Specialized MCP Servers
For our demo, we built three specialized MCP servers, each handling distinct aspects of financial intelligence:
| Server | Port | Implementation | Purpose | Key Tools |
|---|---|---|---|---|
| Financial Data | 8001 | FastAPI (Manual JSON-RPC) | Real-time market data | Stock quotes, company fundamentals, health scoring |
| Document Analysis | 8002 | FastMCP Framework | Sentiment analysis | PDF parsing, report analysis, metric extraction |
| Analytics & Reporting | 8003 | FastMCP Framework | Advanced analytics | Comparison charts, risk assessment, trend analysis |
Why Two Implementation Approaches?
We deliberately chose different implementations for our demo to demonstrate both approaches:
- FastAPI (Manual): Full control over the JSON-RPC protocol. Perfect for learning MCP fundamentals or when you need precise control over message handling.
- FastMCP Framework: Streamlined development with automatic protocol handling. Better for rapid prototyping and when you want to focus on tool logic rather than protocol details.
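To see why the manual route costs more code, here is the skeleton of a hand-rolled JSON-RPC dispatcher in framework-free Python. Our actual server wraps this pattern in FastAPI, and `get_stock_quote` is a hypothetical stand-in for a real market-data tool:

```python
import json

def get_stock_quote(symbol: str) -> dict:
    # Hypothetical demo tool; a real server would call a market-data API.
    return {"symbol": symbol, "price": 210.55}

TOOLS = {"get_stock_quote": get_stock_quote}

def handle(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 message — the part FastMCP automates."""
    req = json.loads(raw)
    try:
        if req["method"] == "tools/call":
            tool = TOOLS[req["params"]["name"]]
            result = tool(**req["params"]["arguments"])
            body = {"jsonrpc": "2.0", "id": req["id"], "result": result}
        else:
            # Standard JSON-RPC error code for an unknown method.
            body = {"jsonrpc": "2.0", "id": req["id"],
                    "error": {"code": -32601, "message": "Method not found"}}
    except Exception as exc:
        body = {"jsonrpc": "2.0", "id": req.get("id"),
                "error": {"code": -32603, "message": str(exc)}}
    return json.dumps(body)

reply = handle('{"jsonrpc": "2.0", "id": 7, "method": "tools/call", '
               '"params": {"name": "get_stock_quote", '
               '"arguments": {"symbol": "SAP"}}}')
print(reply)
```

Every error code, envelope field, and dispatch branch here is something FastMCP generates for you from a decorated function, which is where the line-count difference in the table below comes from.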
Let’s dive deeper into these two open-source approaches:
| Aspect | Manual FastAPI (Port 8001) | FastMCP Framework (Ports 8002/8003) |
|---|---|---|
| Implementation Complexity | Full JSON-RPC 2.0 protocol implementation required | Automatic protocol handling via decorators |
| Code Lines for Basic Server | ~150-200 lines | ~50-75 lines |
| Development Speed | Slower, with explicit control | Roughly 3-4x faster to a working server |
| Protocol Understanding | Deep MCP protocol knowledge required | Abstracted away, focus on business logic |
| Best For | Learning MCP fundamentals, custom requirements | Production deployments, rapid prototyping |
| Production Readiness | Requires additional error handling and logging | Built-in production features |
Both approaches are production-viable. The choice depends on your specific requirements around control versus development velocity.
For a full end-to-end platform for developing and orchestrating MCP servers at scale, see my AgentCore Guide.
The Session Manager Pattern
Managing connections to three MCP servers presented an interesting challenge. We needed persistent connections that work seamlessly across Jupyter cells without context manager complexity.
The solution? A custom MCPSessionManager that uses Python’s ExitStack to maintain persistent client connections:
```python
mcp_manager = MCPSessionManager()

# Establish persistent connections to all servers
mcp_manager.start_sessions({
    "financial_data": "http://127.0.0.1:8001/mcp",
    "document_analysis": "http://127.0.0.1:8002/mcp",
    "analytics_reporting": "http://127.0.0.1:8003/mcp",
})

# Aggregate tools from all servers
all_financial_tools = mcp_manager.get_all_tools()

# Create unified agent with cross-server capabilities
financial_intelligence_agent = Agent(
    model=financial_model,
    tools=all_financial_tools,
    system_prompt=financial_expert_prompt,
)
```

The session manager eliminates connection boilerplate while maintaining enterprise requirements for connection pooling, error recovery, and audit logging. This pattern scales from demo notebooks to production deployments.
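The core of the pattern fits in a few lines of standard-library Python. This is a simplified stand-in for our `MCPSessionManager`: the real one enters Strands MCP client contexts, while the `connect` context manager here is a dummy used purely to show the `ExitStack` mechanics:

```python
from contextlib import ExitStack, contextmanager

@contextmanager
def connect(url: str):
    # Dummy stand-in for an MCP client context manager.
    session = {"url": url, "open": True}
    try:
        yield session
    finally:
        session["open"] = False

class MCPSessionManager:
    """Keeps client sessions alive across notebook cells via ExitStack."""

    def __init__(self):
        self._stack = ExitStack()
        self.sessions = {}

    def start_sessions(self, servers: dict) -> None:
        for name, url in servers.items():
            # enter_context keeps each session open until close() is called,
            # so no `with` block has to span multiple Jupyter cells.
            self.sessions[name] = self._stack.enter_context(connect(url))

    def close(self) -> None:
        self._stack.close()  # exits every session in reverse order

mgr = MCPSessionManager()
mgr.start_sessions({"financial_data": "http://127.0.0.1:8001/mcp"})
print(mgr.sessions["financial_data"]["open"])   # session stays open
mgr.close()
print(mgr.sessions["financial_data"]["open"])   # all closed together
```

`ExitStack` is what lets one `close()` call tear down all three server connections cleanly, including when one of them fails mid-setup.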
Cross-Server Orchestration in Action
The real power emerges when a single query triggers coordination across all three servers. When asked for comprehensive investment analysis, the agent automatically:
- Fetches current stock data (Financial Data Server)
- Analyzes sentiment from recent reports (Document Analysis Server)
- Calculates risk metrics and generates visualizations (Analytics Server)
- Synthesizes findings into an executive-ready report
No explicit orchestration logic. No hardcoded workflows. The agent reasons about which tools to use and coordinates across servers automatically.
Secure Enterprise Integration with SAP GenAI Hub
Throughout our demo, the SAP Generative AI Hub handles critical security and governance requirements. This isn’t just API access; it’s a comprehensive orchestration layer that:
- Enforces Content Filtering: Screens both input prompts and LLM outputs for inappropriate content
- Implements Data Masking: Anonymizes personal and confidential information before sending to models
- Provides Centralized Governance: Maintains consistent policies across the SAP ecosystem
- Enables Compliance: Supports regulatory requirements through built-in safeguards
The Generative AI Hub sits between your application and Amazon Bedrock. This ensures data protection, regulatory compliance, and responsible AI usage across all SAP integrations. But never forget the basics of security at scale.
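Conceptually, the data-masking step works like the sketch below. This is illustrative only: SAP’s actual masking module is policy-driven and far more thorough than two regular expressions, and the patterns here are hypothetical examples:

```python
import re

# Illustrative only: mask obvious personal identifiers before a prompt
# leaves the trust boundary. SAP GenAI Hub's real masking is
# configuration-driven, not this code.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}

def mask(prompt: str) -> str:
    # Replace each match with a typed placeholder the model can reason
    # about without ever seeing the underlying value.
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"<{label}>", prompt)
    return prompt

masked = mask("Contact jane.doe@example.com about account DE89370400440532013000")
print(masked)
```

The key design point is that masking happens in the orchestration layer, so every application behind the hub inherits the same protection without implementing it itself.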
When to Use This Architecture
Consider this architecture when you:
- Coordinate 3+ specialized systems or data sources
- Need rapid prototyping with a clear path to production
- Value maintainability and model-driven flexibility
- Have diverse tool capabilities that benefit from specialization
- Want to leverage standard protocols (MCP) for future extensibility
- Need built-in observability for production monitoring
What’s Next
Our demo system demonstrates cross-server orchestration at scale. The real power emerges when you consider enterprise possibilities:
- SAP Integration: Connect MCP servers directly to SAP business processes
- Multi-Tenant Deployments: Serve multiple organizations from shared MCP infrastructure
- Hybrid Architectures: Combine on-premises SAP systems with cloud-native AI services
- Domain-Specific Agents: Build specialized agents for procurement, finance, and HR, each orchestrating its own MCP ecosystem
Try It Yourself
Both notebooks are available in our GitHub repository. The progression from research agent to multi-server orchestration provides a practical learning path for enterprise developers.
Key takeaways:
- Start Simple: Build single-agent systems first (notebook 05)
- Learn the Protocol: Implement manual MCP to understand fundamentals
- Scale Thoughtfully: Use FastMCP and session managers for production (notebook 06)
- Secure by Design: Implement proper authentication, authorization, and audit logging
- Monitor Everything: Leverage AWS Strands’ built-in observability for production readiness
Frequently Asked Questions
What is AWS Strands Agents SDK?
AWS Strands is an open-source SDK that takes a model-driven approach to building AI agents. Rather than requiring complex workflow definitions, you provide a model, system prompt, and tools. The framework leverages modern LLM capabilities for planning, reasoning, and tool coordination. AWS teams including Amazon Q Developer and AWS Glue use Strands in production.
How does MCP protocol enable agent orchestration?
MCP (Model Context Protocol) provides a standard for connecting AI assistants to data systems. Instead of building custom integrations for each data source, you implement the MCP protocol once. The protocol uses JSON-RPC 2.0 with a client-server architecture, enabling any MCP client to communicate with any MCP server. This eliminates the N×M integration problem.
What are the production requirements for multi-server agents?
Production multi-server agent systems require: comprehensive observability (traces, metrics, logs); persistent session management with error recovery; proper authentication and authorization across all servers; rate limiting and cost controls; audit logging for compliance; and integration with enterprise monitoring systems like CloudWatch or X-Ray. AWS Strands provides built-in observability using OpenTelemetry standards.
How does SAP GenAI Hub improve security?
SAP Generative AI Hub sits between your application and foundation models, providing centralized governance. It enforces content filtering on inputs and outputs, implements data masking for sensitive information, maintains consistent policies across the SAP ecosystem, and supports regulatory compliance requirements. This orchestration layer ensures responsible AI usage across all SAP integrations.
The future of enterprise AI isn’t monolithic models doing everything. It’s distributed intelligence: specialized agents working together, orchestrated through open protocols, secured by enterprise platforms.
Three days proved it’s possible. Now it’s your turn.
Abraham Arellano Tavara is a Senior Solutions Architect at AWS, specializing in enterprise AI solutions and SAP integrations. Connect on LinkedIn for more insights on building production-grade agentic systems.