Agent Assist Workbench Tutorial

When the intelligent customer service system decides that escalation to a human agent is needed, the session enters the pending state and joins the agent's pending queue. The agent assist workbench exposes 8 endpoints that support the full closed loop from "view pending → accept → communicate → assisted lookup → resolve → solution consolidation".

Prerequisites

All endpoints share the prefix /api/v1/agent and authenticate with the X-API-Key header
Escalation has already been triggered via the chat endpoint (escalate_to_human=true), and the session's agent_status is pending
The escalation context card (EscalationCard) has been generated and cached in the session

Closed-loop Overview

After an escalation occurs, the agent side forms a complete handling loop through 8 endpoints:

flowchart LR
    A[Chat endpoint triggers escalation<br/>agent_status=pending] --> B[1. GET sessions/pending<br/>view pending list]
    B --> C[2. GET sessions/:id<br/>view session details + EscalationCard]
    C --> D[3. POST sessions/:id/accept<br/>accept CAS]
    D --> E[4. POST sessions/:id/messages<br/>agent sends message]
    E --> F[5. POST sessions/:id/knowledge-recommend<br/>knowledge recommendation]
    E --> G[6. POST sessions/:id/business-assist<br/>business assist]
    F --> H[7. POST sessions/:id/resolve<br/>mark resolved]
    G --> H
    H --> I[8. POST sessions/:id/solution<br/>consolidate solution back to KB]
    I --> J[New FAQ added to KB<br/>next time the bot can match it]

State Machine

Session state transitions on the agent side are strictly guarded by CAS (Compare-And-Swap) to prevent concurrent accepts or duplicate operations:

stateDiagram-v2
    [*] --> None: Not escalated
    None --> pending: Escalation triggered
    pending --> assigned: Agent accepts (CAS)
    pending --> pending: Other agent's accept fails (409)
    assigned --> assigned: Send message / assist query
    assigned --> resolved: Mark resolved (CAS)
    pending --> resolved: Not allowed (409)
    resolved --> [*]

State	Meaning	Allowed Operations
`None`	Not escalated; pure bot session	Chat endpoint only
`pending`	Escalated; waiting for an agent	View details, knowledge recommendation, accept
`assigned`	Agent has accepted	Send message, knowledge recommendation, business assist, mark resolved
`resolved`	Resolved	Solution consolidation

State transition constraints

pending cannot resolve directly; you must accept first
Only assigned can send messages; sending from pending/resolved returns 409
When multiple agents concurrently accept the same session, only one succeeds; the rest get 409
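
The constraints above can be modelled in a few lines. This is a minimal in-memory sketch, not the service's implementation: the `Session` class, the `Conflict` exception (standing in for HTTP 409) and the lock-based compare-and-swap are all assumptions made for illustration. A real deployment would more likely rely on an atomic operation in its session store.

```python
import threading


class Conflict(Exception):
    """Stands in for the HTTP 409 returned on an illegal transition."""


class Session:
    """Hypothetical in-memory session used only to illustrate the guards."""

    def __init__(self) -> None:
        self.agent_status = "pending"
        self.assigned_agent_id = None
        self.history = []
        self._lock = threading.Lock()

    def _cas(self, expected: str, new: str) -> None:
        # Caller must hold self._lock so compare and swap happen as one step.
        if self.agent_status != expected:
            raise Conflict(f"expected {expected}, found {self.agent_status}")
        self.agent_status = new

    def accept(self, agent_id: str = "agent-default") -> None:
        with self._lock:
            self._cas("pending", "assigned")
            self.assigned_agent_id = agent_id

    def send_message(self, content: str) -> None:
        with self._lock:
            if self.agent_status != "assigned":
                raise Conflict("messages require the assigned state")
            self.history.append(content)

    def resolve(self) -> None:
        with self._lock:
            self._cas("assigned", "resolved")


def race_to_accept(session: Session, agent_ids: list[str]) -> list[str]:
    """Let several agents accept at once; return the IDs that succeeded."""
    winners = []

    def attempt(agent_id: str) -> None:
        try:
            session.accept(agent_id)
            winners.append(agent_id)
        except Conflict:
            pass  # the losing agents would see a 409

    threads = [threading.Thread(target=attempt, args=(a,)) for a in agent_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

Calling `race_to_accept(Session(), ["agent-001", "agent-002"])` returns a list with exactly one agent ID, whichever thread reached the lock first. Calling `resolve()` on a session that is still pending raises `Conflict`.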

EscalationCard Structure

The context card generated at escalation lets the agent quickly understand the user's request and the solutions the bot has already tried before accepting, avoiding repeated questions.

{
  "session_id": "sess-9f3c2a1b",
  "user_id": "u_10086",
  "member_level": "gold",
  "history_ticket_count": 3,
  "turn_count": 4,
  "conversation_summary": "User asked about the shipment status of order ORD-001. The bot failed to query it and the user became agitated",
  "attempted_solutions": [
    "Suggested logging in to the My Orders page to check",
    "Provided the customer service number 400-xxx"
  ],
  "escalate_reason": "Consecutive failures reached the threshold and the user was agitated",
  "priority": "high"
}

Field	Description
`member_level`	Membership tier; VIP users get priority
`history_ticket_count`	Historical ticket count, reflecting the user's past requests
`conversation_summary`	Conversation summary: core problem and current status
`attempted_solutions`	List of suggestions the bot already gave, to avoid repeating solutions
`escalate_reason`	Reason for this escalation
`priority`	Priority: `highest/high/medium/low/info`
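
On the client side, the card can be loaded into a typed structure so that missing or unexpected fields surface early. The sketch below assumes the card contains exactly the fields in the sample above. The `EscalationCard` dataclass, `from_json` and `briefing` are illustrative names, not part of the service.

```python
import json
from dataclasses import dataclass

# Priority values listed in the field table above.
KNOWN_PRIORITIES = {"highest", "high", "medium", "low", "info"}


@dataclass(frozen=True)
class EscalationCard:
    session_id: str
    user_id: str
    member_level: str
    history_ticket_count: int
    turn_count: int
    conversation_summary: str
    attempted_solutions: list
    escalate_reason: str
    priority: str

    @classmethod
    def from_json(cls, raw: str) -> "EscalationCard":
        """Parse a card; raises TypeError on missing or extra keys."""
        data = json.loads(raw)
        if data.get("priority") not in KNOWN_PRIORITIES:
            raise ValueError(f"unknown priority: {data.get('priority')}")
        return cls(**data)

    def briefing(self) -> str:
        """One-line summary an agent could read before accepting."""
        tried = "; ".join(self.attempted_solutions) or "none"
        return f"[{self.priority}] {self.conversation_summary} | already tried: {tried}"
```

Because `cls(**data)` rejects unknown keys, a stricter or looser parser may be preferable if the service adds fields over time.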

Endpoint Details

1. Pending List: GET /api/v1/agent/sessions/pending

Lists all pending sessions, sorted by EscalationPriority descending, for the workbench's first-screen queue display.

curl http://localhost:8000/api/v1/agent/sessions/pending \
  -H "X-API-Key: ${API_KEY}"

[
  {
    "session_id": "sess-9f3c2a1b",
    "user_id": "u_10086",
    "priority": "highest",
    "escalate_reason": "User explicitly requested transfer to a human",
    "turn_count": 4,
    "created_at": "2026-07-03T10:00:00Z",
    "agent_status": "pending",
    "assigned_agent_id": null
  }
]

Returns summary fields only

The list endpoint does not return the full history to avoid slowing down the first screen with large payloads. After clicking an entry, call the session details endpoint for the full context.

2. Session Details: GET /api/v1/agent/sessions/{session_id}

Returns full session information, including the EscalationCard and history. If the card cache is missing, it is rebuilt on the fly and written back to the cache.

curl http://localhost:8000/api/v1/agent/sessions/sess-9f3c2a1b \
  -H "X-API-Key: ${API_KEY}"

3. Accept Session: POST /api/v1/agent/sessions/{session_id}/accept

The agent accepts the session. CAS guards pending → assigned. With concurrent accepts from multiple agents, only one succeeds.

curl -X POST http://localhost:8000/api/v1/agent/sessions/sess-9f3c2a1b/accept \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ${API_KEY}" \
  -d '{"agent_id": "agent-001"}'

{
  "session_id": "sess-9f3c2a1b",
  "agent_status": "assigned",
  "assigned_agent_id": "agent-001",
  "turn_count": 4,
  "escalation_card": {...},
  "history": [...]
}

agent_id is optional

agent_id defaults to agent-default, suitable when no agent identity system exists. In production, pass the real agent ID for ticket attribution and performance statistics.

4. Send Message: POST /api/v1/agent/sessions/{session_id}/messages

The agent sends a message in the original session context, appended to history. Only allowed in assigned state.

curl -X POST http://localhost:8000/api/v1/agent/sessions/sess-9f3c2a1b/messages \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ${API_KEY}" \
  -d '{"content": "Hello, I am agent Li. I will help you check the shipment of order ORD-001"}'

{
  "message_id": "msg-uuid-xxx",
  "timestamp": "2026-07-03T10:05:00Z",
  "role": "assistant"
}

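5. Knowledge Recommendation: POST /api/v1/agent/sessions/{session_id}/knowledge-recommend

The agent enters a query to quickly retrieve related knowledge chunks. The endpoint reuses HybridRetriever.retrieve (vector + BM25 hybrid retrieval). On a miss it returns an empty list rather than an error.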

curl -X POST http://localhost:8000/api/v1/agent/sessions/sess-9f3c2a1b/knowledge-recommend \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ${API_KEY}" \
  -d '{"query": "Order shipment query process", "top_k": 5}'

{
  "chunks": [
    {
      "content": "Shipment can be checked via the order details page...",
      "score": 0.92,
      "source": "operation_manual.md"
    }
  ],
  "total": 1
}

Can be called before accepting

Knowledge recommendation does not require the assigned state. The agent can preview knowledge before accepting, making it easier to prepare solutions in advance.
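
Since a miss yields an empty list rather than an error, the workbench can treat both cases uniformly. The helper below is a small sketch of that handling. The `top_chunks` name and the `min_score` cutoff are assumptions chosen for illustration, and the response shape is taken from the sample above.

```python
def top_chunks(response: dict, min_score: float = 0.5) -> list[str]:
    """Return 'source: content' lines for chunks scoring at least min_score.

    An empty or missing "chunks" list simply produces an empty result.
    """
    chunks = response.get("chunks", [])
    kept = [c for c in chunks if c["score"] >= min_score]
    kept.sort(key=lambda c: c["score"], reverse=True)  # best match first
    return [f"{c['source']}: {c['content']}" for c in kept]
```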

6. Business Assist: POST /api/v1/agent/sessions/{session_id}/business-assist

Reuses BusinessAgent.execute (with data masking) to let the agent query business systems in natural language. Business exceptions do not raise 5xx; they degrade to a result.error field so the workbench is never interrupted.

curl -X POST http://localhost:8000/api/v1/agent/sessions/sess-9f3c2a1b/business-assist \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ${API_KEY}" \
  -d '{"query": "Check the status of order ORD-001"}'

{
  "result": {
    "reply": "Order ORD-001 current status: shipped, estimated delivery July 5",
    "data": {"order_id": "ORD-001", "status": "shipped", "phone_masked": "138****1234"},
    "error": null,
    "need_confirmation": false,
    "scene": "order"
  },
  "masked_fields": ["phone_masked"]
}

Masked field indicator

masked_fields lists the names of masked fields (such as phone_masked); the frontend uses this to flag a "masked" hint. Write operations return need_confirmation=true and require a second confirmation from the agent.
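
A frontend might turn this response into a small view model along the following lines. This is a sketch under assumptions: `describe_assist` and the `kind` values are hypothetical, and `result.error` is assumed to be either null or a human-readable string, which the source does not specify.

```python
def describe_assist(response: dict) -> dict:
    """Reduce a business-assist response to what the workbench needs to render."""
    result = response["result"]
    masked = set(response.get("masked_fields", []))

    # Business failures arrive as result.error with a normal HTTP status.
    if result.get("error"):
        return {"kind": "error", "text": str(result["error"]), "masked": []}

    # Write operations ask the agent for a second confirmation.
    kind = "confirm" if result.get("need_confirmation") else "info"
    data = result.get("data") or {}
    return {
        "kind": kind,
        "text": result["reply"],
        # Only flag fields that are actually present in the data payload.
        "masked": sorted(name for name in data if name in masked),
    }
```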

7. Mark Resolved: POST /api/v1/agent/sessions/{session_id}/resolve

Marks the session as resolved. CAS guards assigned → resolved. Only assigned sessions can be marked.

curl -X POST http://localhost:8000/api/v1/agent/sessions/sess-9f3c2a1b/resolve \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ${API_KEY}" \
  -d '{"note": "Checked the shipment and informed the user of the estimated delivery time"}'

{
  "session_id": "sess-9f3c2a1b",
  "agent_status": "resolved",
  "resolved_at": "2026-07-03T10:15:00Z"
}

8. Solution Consolidation: POST /api/v1/agent/sessions/{session_id}/solution

Records the human agent's solution as a FAQ candidate. The candidate enters the pending review queue and is ingested as a FAQ once approved. The next time the same question comes up, the bot can retrieve and match it, which closes the loop of "human handles → consolidate → bot answers next time".

curl -X POST http://localhost:8000/api/v1/agent/sessions/sess-9f3c2a1b/solution \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ${API_KEY}" \
  -d '{
    "question": "Where is the shipment of order ORD-001?",
    "solution": "The order has shipped. Tracking number SF1234567890, estimated delivery July 5. You can check real-time shipment on the SF Express website.",
    "intent": "business_query"
  }'

{
  "solution_id": "sol-uuid-xxx",
  "session_id": "sess-9f3c2a1b",
  "question": "Where is the shipment of order ORD-001?",
  "solution": "The order has shipped...",
  "intent": "business_query",
  "status": "pending"
}

intent is optional

When intent is omitted, the system recognizes it automatically. Manual annotation by the agent is recommended for accuracy and easier downstream categorization and retrieval.

Complete Workflow Example

The following Python example walks through the full flow, from viewing the pending list to submitting a solution:

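import os

import httpx

BASE = "http://localhost:8000"
# Read the key from the environment, matching ${API_KEY} in the curl examples
HEADERS = {"Content-Type": "application/json", "X-API-Key": os.environ.get("API_KEY", "")}
AGENT_ID = "agent-001"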

def agent_workflow():
    """Agent complete workflow: from viewing pending to entering a solution."""
    # 1. View the pending list, sorted by priority descending
    pending = httpx.get(f"{BASE}/api/v1/agent/sessions/pending", headers=HEADERS).json()
    if not pending:
        print("No pending sessions")
        return
    session = pending[0]  # take the highest priority
    session_id = session["session_id"]
    print(f"Accepting session: {session_id} priority={session['priority']}")

    # 2. View session details to understand the user's request and the bot's attempted solutions
    detail = httpx.get(f"{BASE}/api/v1/agent/sessions/{session_id}", headers=HEADERS).json()
    card = detail["escalation_card"]
    print(f"User request: {card['conversation_summary']}")
    print(f"Attempted solutions: {card['attempted_solutions']}")

    # 3. Accept the session (CAS; only one concurrent accept succeeds)
    accept = httpx.post(
        f"{BASE}/api/v1/agent/sessions/{session_id}/accept",
        headers=HEADERS, json={"agent_id": AGENT_ID},
    )
    if accept.status_code == 409:
        print("Already accepted by another agent")
        return
    accept.raise_for_status()  # surface any other failure instead of continuing
    print(f"Accept succeeded: {accept.json()['agent_status']}")

    # 4. Send a message to reassure the user
    httpx.post(f"{BASE}/api/v1/agent/sessions/{session_id}/messages",
               headers=HEADERS, json={"content": "Hello, I will help you handle this issue"})

    # 5. Knowledge recommendation: retrieve related knowledge to assist the reply
    knowledge = httpx.post(
        f"{BASE}/api/v1/agent/sessions/{session_id}/knowledge-recommend",
        headers=HEADERS, json={"query": card['conversation_summary'], "top_k": 3},
    ).json()
    print(f"Recommended knowledge: {knowledge['total']} entries")

    # 6. Business assist: query the business system for real-time data
    business = httpx.post(
        f"{BASE}/api/v1/agent/sessions/{session_id}/business-assist",
        headers=HEADERS, json={"query": "Check the user's order status"},
    ).json()
    print(f"Business result: {business['result']['reply']}")

    # 7. Mark resolved
    httpx.post(f"{BASE}/api/v1/agent/sessions/{session_id}/resolve",
               headers=HEADERS, json={"note": "Checked and informed the user"})
    print("Session marked resolved")

    # 8. Consolidate the solution back to the KB; the bot can match it next time
    httpx.post(f"{BASE}/api/v1/agent/sessions/{session_id}/solution",
               headers=HEADERS, json={
                   "question": card['conversation_summary'],
                   "solution": business['result']['reply'],
                   "intent": "business_query",
               })
    print("Solution consolidated; awaiting review and ingestion")

if __name__ == "__main__":
    agent_workflow()

FAQ

What happens when multiple agents concurrently accept the same session?

The accept endpoint has built-in CAS (Compare-And-Swap) protection: only the first request that transitions the state from pending to assigned succeeds; the rest return 409. On receiving a 409, the frontend should refresh the pending list and prompt "This session has been accepted by another agent".
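
The 409 handling described here could look roughly like this on the client. The sketch takes the HTTP call as an injected `post` function (an assumption, so the logic can be exercised without a running service) that returns a `(status_code, json_body)` pair. `try_accept` is an illustrative name.

```python
from typing import Callable, Optional, Tuple

# Hypothetical transport: takes (path, json_body), returns (status_code, json_body).
PostFn = Callable[[str, dict], Tuple[int, dict]]

ALREADY_TAKEN = "This session has been accepted by another agent"


def try_accept(post: PostFn, session_id: str, agent_id: str) -> Tuple[Optional[dict], Optional[str]]:
    """Attempt to accept a session.

    Returns (session, None) on success or (None, notice) on failure. On a
    notice, the caller is expected to refresh the pending list.
    """
    path = f"/api/v1/agent/sessions/{session_id}/accept"
    status, body = post(path, {"agent_id": agent_id})
    if status == 409:
        return None, ALREADY_TAKEN
    if status >= 400:
        return None, f"Accept failed with HTTP {status}"
    return body, None
```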

Can I query knowledge before accepting?

Yes. knowledge-recommend does not require the assigned state, so the agent can preview knowledge before accepting and prepare a solution in advance. Messages, however, can only be sent once the session is assigned.

How long after consolidation can a solution be retrieved?

A solution first enters the pending review queue. It is only ingested as a FAQ after passing review via POST /api/v1/escalation/solutions/{solution_id}/approve. After ingestion, you must call POST /api/v1/performance/cache/invalidate to clear the hot cache before the new solution can be retrieved and matched by the chat endpoint. See the Escalation Tutorial.
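
The approve-then-invalidate sequence can be sketched as a single helper. The two paths come from the text above. Everything else is an assumption: the injected `post` function returning `(status_code, json_body)`, the `publish_solution` name, and the empty request bodies, since the required payloads are not described here.

```python
from typing import Callable, Tuple

# Hypothetical transport: takes (path, json_body), returns (status_code, json_body).
PostFn = Callable[[str, dict], Tuple[int, dict]]


def publish_solution(post: PostFn, solution_id: str) -> bool:
    """Approve a pending solution, then clear the hot cache.

    Returns True only if both calls succeed. The cache is left untouched
    when approval fails, because nothing new has been ingested.
    """
    status, _ = post(f"/api/v1/escalation/solutions/{solution_id}/approve", {})
    if status >= 400:
        return False
    status, _ = post("/api/v1/performance/cache/invalidate", {})
    return status < 400
```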

Next Steps

Escalation Tutorial: escalation triggers and solution review/ingestion
Business System Integration Tutorial: masking and adapter details for the business assist endpoint
Chat Endpoint Tutorial: how escalation is triggered