Complete Architecture Structure
1. Intent Layer
2. Planning Layer
3. Execution Layer
4. Tool Layer
5. Memory Layer
6. Knowledge Layer
7. Policy & Guardrail Layer
8. Observability Layer
9. Persistence Layer
10. Governance Layer
1. First Layer: Intent Router
Router needs to determine whether to invoke Tools
Router Prompt:
prompt = '
You are a task classifier.
User query: {query}
Classify the user query into one of:
1. simple_qa
2. retrieval
3. tool_call
4. multi_step
Return JSON:
{
"type": "..."
}
’
2. Second Layer: ReAct Tool Core
1. simple_qa → LLM direct answer
2. retrieval → call vector database for direct retrieval / call Browser retrieval
3. tool_call → enter ReAct loop
4. multi_step → enter ReAct loop
ReAct Loop Architecture:
LLM decides next action
↓
Call one tool
↓
Append tool result
↓
Repeat
Sample Code:
while True:
response = llm(messages, tools=tool_schema)
if response.tool_calls:
result = execute_tool(...)
messages.append(tool_result)
else:
break
3. Third Layer: Structured Tool Layer
Here we adopt ChatGPT’s Function Calling concept, connecting LLM and Tools invocation by defining “Tools Schema”.
Enterprise-grade AIAgent: ReAct + Function calling: https://strictfrog.com/en/2026-02-17-enterprise-level-ai-agent-react-function-calling/
4. Fourth Layer: Guardrails & Policies
Maximum step limit:
max_steps = 15
Disallow sending sensitive emails:
if tool = send_mail:
require confirmation
5. Fifth Layer: Memory Layer
Unified Memory system (long-term + short-term)
1️⃣ Short-term memory (Session Memory)
• current task state
• tool invocation history
• intermediate decision result
2️⃣ Long-term memory (Persistent Memory)
• user preferences history
• historical task records
• decision result archives
• email sending logs
Otherwise the system cannot:
• track back
• analyze
• audit
• learn and optimize
7. Seventh Layer: Policy & Guardrail Layer
Policy & Safety Layer (Security Controls)
Now it can send emails automatically.
Risky issues:
• prompt injection?
• user sending mass emails?
• sensitive info leakage?
• calling dangerous APIs?
A mature system must have:
Guardrails
• tool invocation whitelist
• parameter validation
• risk scoring
• human confirmation checkpoints
• rate limiting
8. Eighth Layer: Observability Layer
Observability
Must be able to answer:
• Which tool fails the most?
• Average steps per task execution?
• LLM token consumption?
• Email success rate?
• Which step takes the most time?
This requires:
• structured logging
• step-level tracing
• metrics
• visual dashboards
Otherwise optimization is impossible.
9. Ninth Layer: Persistence Layer
Task Persistence
Must support:
• task interruption recovery
• asynchronous execution
• multi-user concurrency
• failure retry
• delayed execution
Requires:
• Redis / DB to store state
• unique ID for each task
• state machine persistence
Otherwise it's just a script, not a system.
Tool Governance
When tools exceed 10, issues appear:
• tool conflicts
• parameter confusion
• incorrect routing
• LLM picking wrong tools
Mature systems require:
• tool registry
• tool capability descriptions
• tool prioritization
• tool invocation logs
Error Recovery Strategy
Currently:
Error → Program crash
Mature systems must have:
• automatic retry
• backup models
• fallback strategy
• LLM self-healing prompts
Cost Control Layer
Mature systems must monitor:
• token usage
• tool invocation cost
• API fees
• model selection strategy (small models first)
Otherwise costs are uncontrollable.
Human-in-the-loop
When encountering:
• high-risk operations
• low confidence
• multiple ambiguous candidates
It must:
Agent → request human confirmation
Otherwise it cannot be used in real business.
Mature System Architecture Diagram
┌──────────────────┐
│ API Gateway │
└──────────────────┘
↓
┌──────────────────┐
│ Intent Router │
└──────────────────┘
↓
┌──────────────────┐
│ ReAct Engine │
└──────────────────┘
↓
┌───────────────┼────────────────┐
↓ ↓ ↓
Tool Layer Knowledge Layer Memory Layer
↓ ↓ ↓
┌──────────────────┐
│ Policy Layer │
└──────────────────┘
↓
┌──────────────────┐
│ Persistence DB │
└──────────────────┘
↓
┌──────────────────┐
│ Observability │
└──────────────────┘
Summary
Signs of a mature system
• Serving 100+ users concurrently
• Supporting long-chain tasks
• Interruption recovery
• Automatic failure repair
• Complete audit logging
• Cost accounting
• Security policies
• Human approval checkpoints