# Agent Lifecycle
Understanding how agents process requests helps you build more effective AI workers.
## Request Flow
When a message reaches an agent, here's what happens:
```
   User Message
         │
         ▼
┌─────────────────┐
│   1. RECEIVE    │  Validate input, check permissions
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   2. CONTEXT    │  Load memory, fetch relevant context
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   3. PROCESS    │  AI model generates response
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   4. TOOLS      │  Execute any tool calls (may loop)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   5. RESPOND    │  Return final response to user
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   6. PERSIST    │  Save memory, log interaction
└─────────────────┘
```
## Stage Details

### 1. Receive
The agent receives and validates the request.
What happens:
- Authenticate the request (API key, session)
- Validate input format and length
- Check rate limits
- Verify agent is active
Possible outcomes:
- ✅ Proceed to Context
- ❌ Return 401/403 (auth error)
- ❌ Return 429 (rate limited)
- ❌ Return 400 (invalid input)
```javascript
// Request validation
{
  agentId: 'agent-123',     // Must exist and be active
  message: 'Hello',         // Required, max 32K chars
  conversationId: 'conv-1', // Optional, for context
  userId: 'user-456'        // Optional, for memory
}
```
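The checks above can be thought of as one gate that either admits the request or maps it to an error status. A minimal sketch, assuming a hypothetical `ctx` object for per-caller state (API key set, rate counter, agent registry); none of these names are the real platform API:

```javascript
// Illustrative Receive stage: returns either { ok: request } or { error: <status> }.
// All helper names and the ctx shape are assumptions for the sketch.
const MAX_MESSAGE_CHARS = 32_000;

function receive(request, ctx) {
  if (!ctx.apiKeys.has(request.apiKey)) {
    return { error: 401, reason: 'invalid API key' };
  }
  if (ctx.requestsThisMinute >= ctx.rateLimit) {
    return { error: 429, reason: 'rate limited' };
  }
  if (typeof request.message !== 'string' ||
      request.message.length === 0 ||
      request.message.length > MAX_MESSAGE_CHARS) {
    return { error: 400, reason: 'invalid input' };
  }
  const agent = ctx.agents.get(request.agentId);
  if (!agent || agent.state !== 'active') {
    return { error: 400, reason: 'agent missing or inactive' };
  }
  return { ok: request }; // proceed to the Context stage
}
```

Ordering matters: authentication runs before rate limiting so that unauthenticated traffic cannot consume a caller's quota.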
### 2. Context
The agent gathers relevant context for the request.
What's loaded:
- Conversation history (if continuing)
- User-specific memory
- Team/app-level knowledge
- System prompt
Context assembly:
```
┌─────────────────────────────────────────┐
│              CONTEXT WINDOW             │
├─────────────────────────────────────────┤
│ System Prompt                           │
│ ─────────────────────────────────────── │
│ You are an IT Helpdesk assistant...     │
├─────────────────────────────────────────┤
│ Memory / Knowledge                      │
│ ─────────────────────────────────────── │
│ User's laptop: ThinkPad X1 Carbon       │
│ Previous issue: VPN connection (solved) │
├─────────────────────────────────────────┤
│ Conversation History                    │
│ ─────────────────────────────────────── │
│ User: My email isn't working            │
│ Agent: I can help with that...          │
│ User: It says "connection failed"       │
├─────────────────────────────────────────┤
│ Current Message                         │
│ ─────────────────────────────────────── │
│ User: Is Google down?                   │
└─────────────────────────────────────────┘
```
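The assembly step can be sketched as joining the sections in context-window order. This is a simplified model (the field names are illustrative); a real implementation also trims each section to fit the model's token budget:

```javascript
// Illustrative context assembly: sections are concatenated in the order
// shown above (system prompt, memory, history, current message).
function assembleContext({ systemPrompt, memory, history, currentMessage }) {
  const sections = [
    systemPrompt,
    memory.join('\n'),
    history.map(m => `${m.role}: ${m.content}`).join('\n'),
    `User: ${currentMessage}`,
  ];
  // Drop empty sections (e.g. a brand-new conversation has no history).
  return sections.filter(Boolean).join('\n---\n');
}
```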
### 3. Process
The AI model generates a response.
What happens:
- Full context sent to model
- Model decides: respond directly OR use tools
- If using tools, generate tool call(s)
- If responding, generate text response
Model decision flow:
```
            Context
               │
               ▼
      ┌────────────────┐
      │    AI Model    │
      └────────┬───────┘
               │
       ┌───────┴───────┐
       │               │
       ▼               ▼
┌─────────────┐ ┌─────────────┐
│ Text Reply  │ │  Tool Call  │
└─────────────┘ └─────────────┘
```
### 4. Tools (Loop)

If the model requests tool calls, the agent executes them and feeds the results back to the model, which may then request more tools or produce a final answer.
Tool execution loop:
```
Model: "I should check Google's status"
                     │
                     ▼
┌─────────────────────────────────────────┐
│ Tool: http_request                      │
│ Input: { url: "status.google.com" }     │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│ Tool Result                             │
│ { status: "all_services_normal" }       │
└────────────────────┬────────────────────┘
                     │
                     ▼
              ┌─────────────┐
              │  AI Model   │ ◄── Decides: more tools or respond?
              └──────┬──────┘
                     │
         ┌───────────┴───────────┐
         │                       │
         ▼                       ▼
 Another tool call        Generate response
 (loop continues)         (exit loop)
```
Tool limits:
- Max 10 tool calls per turn (configurable)
- Total timeout: 30 seconds
- Individual tool timeout: 10 seconds
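Putting the loop and the call limit together, the Tools stage behaves roughly like this. The `model` and `tools` objects are mocks standing in for the real components, and timeouts are omitted for brevity:

```javascript
// Illustrative tool-execution loop with the max-calls limit above.
// model.generate() returns either { text } or { toolCall: { tool, input } }.
const MAX_TOOL_CALLS = 10;

function runToolLoop(model, tools, context) {
  let turn = model.generate(context);
  let calls = 0;
  while (turn.toolCall) {
    if (++calls > MAX_TOOL_CALLS) {
      throw new Error('ToolError: exceeded max tool calls per turn');
    }
    const { tool, input } = turn.toolCall;
    const output = tools[tool](input); // each call also gets its own timeout in practice
    // Feed the result back so the model can decide: more tools, or respond.
    turn = model.generate(context, { tool, input, output });
  }
  return { content: turn.text, toolCallCount: calls };
}
```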
### 5. Respond
The final response is returned to the user.
Response structure:
```javascript
{
  id: 'msg-789',
  agentId: 'agent-123',
  conversationId: 'conv-456',
  content: 'Google services are currently operational...',
  toolCalls: [
    {
      tool: 'http_request',
      input: { url: 'status.google.com' },
      output: { status: 'all_services_normal' }
    }
  ],
  usage: {
    inputTokens: 1250,
    outputTokens: 89,
    totalTokens: 1339
  },
  createdAt: '2024-01-15T10:30:00Z'
}
```
### 6. Persist
After responding, data is saved.
What's persisted:
- Conversation history (user message + agent response)
- Memory updates (if any)
- Tool results (for debugging)
- Analytics/metrics
Async operations:
- Webhook notifications
- Analytics processing
- Memory indexing
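Because the response has already been returned, persistence can run the writes concurrently and count failures for logging rather than surfacing them to the user. A sketch, where `ops` is a hypothetical storage facade (not the real platform API):

```javascript
// Illustrative Persist stage: all writes run concurrently; a failed write
// is counted (for logging/alerting) instead of failing the request.
async function persist(turn, ops) {
  const results = await Promise.allSettled([
    ops.saveHistory(turn),    // user message + agent response
    ops.updateMemory(turn),   // memory updates, if any
    ops.logToolResults(turn), // for debugging
    ops.recordMetrics(turn),  // analytics
  ]);
  return results.filter(r => r.status === 'rejected').length;
}
```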
## Agent States
Agents can be in different states:
| State | Description | Can Receive Requests? |
|---|---|---|
| Active | Normal operation | ✅ Yes |
| Paused | Temporarily disabled | ❌ No |
| Maintenance | Being updated | ❌ No |
| Archived | Soft deleted | ❌ No |
```javascript
// Check agent state
const agent = await deeployd.agents.get('agent-123');
console.log(agent.state); // 'active'

// Pause an agent
await deeployd.agents.pause('agent-123');

// Resume an agent
await deeployd.agents.resume('agent-123');
```
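The table's receive rule reduces to a small lookup that the Receive stage can consult. A sketch (the state names come from the table; the function itself is illustrative):

```javascript
// Which states may receive requests, per the Agent States table.
const AGENT_STATES = {
  active:      { canReceive: true },
  paused:      { canReceive: false },
  maintenance: { canReceive: false },
  archived:    { canReceive: false },
};

function canReceive(agent) {
  const entry = AGENT_STATES[agent.state];
  if (!entry) throw new Error(`unknown agent state: ${agent.state}`);
  return entry.canReceive;
}
```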
## Conversation Lifecycle
Conversations also have a lifecycle:
```
┌─────────┐    ┌─────────┐    ┌─────────┐    ┌──────────┐
│   New   │───▶│ Active  │───▶│  Idle   │───▶│ Archived │
└─────────┘    └─────────┘    └─────────┘    └──────────┘
                    │              │
                    │              │
                    └──────────────┘
                     (new message)
```
| State | Description | Duration |
|---|---|---|
| New | Just created | Until first message |
| Active | Currently in use | During conversation |
| Idle | No recent activity | After 30 min of inactivity |
| Archived | Stored for reference | After 90 days |
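Given the durations in the table, a conversation's lifecycle state can be derived from its timestamps. A minimal sketch, assuming a `lastMessageAt` epoch-millisecond field (the field name is an assumption):

```javascript
// Derive a conversation's state from its last-activity timestamp,
// using the durations above: idle after 30 min, archived after 90 days.
const IDLE_AFTER_MS = 30 * 60 * 1000;
const ARCHIVE_AFTER_MS = 90 * 24 * 60 * 60 * 1000;

function conversationState(conv, now = Date.now()) {
  if (!conv.lastMessageAt) return 'new'; // no messages yet
  const quiet = now - conv.lastMessageAt;
  if (quiet >= ARCHIVE_AFTER_MS) return 'archived';
  if (quiet >= IDLE_AFTER_MS) return 'idle';
  return 'active';
}
```

A new message simply resets `lastMessageAt`, which is the "(new message)" transition from Idle back to Active in the diagram.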
## Error Handling
Errors can occur at any stage:
### Common Errors
| Error | Stage | Cause | Resolution |
|---|---|---|---|
| `AuthenticationError` | Receive | Invalid API key | Check credentials |
| `RateLimitError` | Receive | Too many requests | Wait or upgrade |
| `ContextTooLong` | Context | History exceeds limit | Summarize or start a new conversation |
| `ModelError` | Process | AI model failure | Retry or fall back |
| `ToolError` | Tools | Tool execution failed | Check tool config |
| `TimeoutError` | Any | Request took too long | Optimize or increase the limit |
### Error Recovery
```javascript
try {
  const response = await deeployd.agents.chat({
    agentId: 'agent-123',
    message: 'Hello'
  });
} catch (error) {
  if (error.code === 'RATE_LIMITED') {
    // Wait and retry
    await sleep(error.retryAfter);
    return retry();
  }
  if (error.code === 'CONTEXT_TOO_LONG') {
    // Start a fresh conversation
    return deeployd.agents.chat({
      agentId: 'agent-123',
      message: 'Hello',
      conversationId: null // New conversation
    });
  }
  throw error;
}
```
## Performance Considerations

### Latency Breakdown
Typical request latency:
| Stage | Typical Time | Range |
|---|---|---|
| Receive | 10ms | 5-50ms |
| Context | 50ms | 20-200ms |
| Process | 500ms | 200ms-5s |
| Tools | 0-2000ms | Per tool call |
| Respond | 10ms | 5-50ms |
| Persist | Async | N/A |
Total: roughly 500 ms to 7 s, depending on complexity.
### Optimization Tips
- **Reduce context size**: summarize long conversations instead of replaying them in full
- **Limit tools**: enable only the tools the agent actually needs
- **Use streaming**: deliver the first tokens sooner instead of waiting for the complete response
- **Cache when possible**: avoid redundant context lookups
Next: Learn about the Execution Model to understand how tasks and workflows run.