API Reference
All external interfaces of this system use the /api/v1/ prefix and follow RESTful conventions. Request and response bodies are JSON (file uploads use multipart). This document lists all endpoints grouped by module.
Transport conventions
- All endpoints except the health check and the ops endpoints (performance, observability, monitoring, operations) are authenticated via the verify_api_key dependency.
- Request header Content-Type: application/json (except file uploads, which use multipart/form-data).
- Time fields use ISO 8601 (UTC) format, e.g. 2026-07-03T08:30:00+00:00.
Authentication
The system authenticates via the X-API-Key request header. The logic is implemented by verify_api_key in app/core/security.py.
# When API_KEY is empty, the system enters no-auth mode; no X-API-Key header required
curl -H "Content-Type: application/json" \
http://localhost:8000/api/v1/chat \
-d '{"message": "Hello"}'
Security note
No-auth (development) mode is for local debugging only. Production deployments must set a non-empty API_KEY; otherwise any caller can access the system.
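When API_KEY is set, the same request has to carry the X-API-Key header. The following is a minimal client-side sketch using only the Python standard library; the helper name build_headers and the key value are hypothetical and not part of the project.

```python
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers; X-API-Key is added only when a key is given."""
    headers = {"Content-Type": "application/json"}
    if api_key:  # an empty or missing key corresponds to no-auth mode
        headers["X-API-Key"] = api_key
    return headers


print(build_headers("my-secret-key"))
```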
Unified Error Response
Beyond HTTP status codes, all error responses follow a unified structure:
| Status Code | Meaning | Trigger Scenario |
|---|---|---|
| 200 | Success | Successful GET / POST / PUT |
| 201 | Created | Resource-creating endpoints such as creating an experiment |
| 204 | No Content | Endpoints with no response body, such as recording experiment metrics |
| 400 | Bad Request | Invalid parameter value (e.g., alert level not in enum) |
| 401 | Unauthorized | X-API-Key missing or invalid |
| 404 | Not Found | session_id / doc_id / report_id / trace_id does not exist |
| 409 | Conflict | State transition conflicts such as duplicate agent acceptance or sending messages to a non-assigned session |
| 422 | Validation Failed | Pydantic validation failure (e.g., min_length=1 constraint) |
| 500 | Internal Error | Uncaught server-side exception |
Endpoint List Overview
Chat Module
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/chat | POST | Synchronous chat; returns a full ChatResponse | ✅ |
| /api/v1/chat/stream | POST | SSE streaming chat; event sequence meta → token → done | ✅ |
| /api/v1/gateway | POST | Unified multi-channel access gateway | ✅ |
Agent Assist
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/agent/sessions/pending | GET | Pending session list (sorted by priority descending) | ✅ |
| /api/v1/agent/sessions/{session_id} | GET | Session details (including EscalationCard + history) | ✅ |
| /api/v1/agent/sessions/{session_id}/accept | POST | Agent accepts the session (CAS pending → assigned) | ✅ |
| /api/v1/agent/sessions/{session_id}/messages | POST | Agent sends a message appended to history | ✅ |
| /api/v1/agent/sessions/{session_id}/knowledge-recommend | POST | Knowledge recommendation assist | ✅ |
| /api/v1/agent/sessions/{session_id}/business-assist | POST | Business query assist (with masking) | ✅ |
| /api/v1/agent/sessions/{session_id}/resolve | POST | Mark resolved (CAS assigned → resolved) | ✅ |
| /api/v1/agent/sessions/{session_id}/solution | POST | Consolidate a solution back to the knowledge base | ✅ |
Knowledge Base Management
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/knowledge/ingest | POST | Document ingestion (multipart) | ✅ |
| /api/v1/knowledge/stats | GET | Knowledge base statistics | ✅ |
| /api/v1/knowledge/documents | GET | Paginated document list | ✅ |
| /api/v1/knowledge/documents/{doc_id} | GET | Document details (with version history) | ✅ |
| /api/v1/knowledge/documents/{doc_id} | DELETE | Delete a document (removes all chunks) | ✅ |
| /api/v1/knowledge/quality/check | POST | Batch quality inspection | ✅ |
| /api/v1/knowledge/documents/{doc_id}/rollback | POST | Roll back to a specified version | ✅ |
| /api/v1/knowledge/canary/ingest | POST | Write to the canary collection | ✅ |
| /api/v1/knowledge/canary/compare | POST | Compare main collection vs canary collection | ✅ |
Document Updates
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/update/full | POST | Full update (monthly) | ✅ |
| /api/v1/update/incremental | POST | Incremental update (weekly) | ✅ |
| /api/v1/update/file | POST | Single-file real-time update | ✅ |
| /api/v1/update/status | GET | Result of the most recent update | ✅ |
Ticket Mining and Retrieval Tuning
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/mining/tickets | POST | Trigger historical ticket knowledge mining | ✅ |
| /api/v1/mining/status | GET | Most recent mining report | ✅ |
| /api/v1/tuner/params | GET | Query current tuning parameters | ✅ |
| /api/v1/tuner/params | PUT | Update tuning parameters (takes effect immediately) | ✅ |
| /api/v1/tuner/reset | POST | Reset to default parameters | ✅ |
Retrieval Evaluation
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/evaluation/run | POST | Trigger an evaluation run | ✅ |
| /api/v1/evaluation/reports | GET | Historical report summary list | ✅ |
| /api/v1/evaluation/reports/{report_id} | GET | Single report details | ✅ |
Performance Monitoring
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/performance/metrics | GET | Comprehensive performance metrics | ❌ |
| /api/v1/performance/cache/stats | GET | Hot cache statistics | ❌ |
| /api/v1/performance/cache/invalidate | POST | Clear the hot cache | ❌ |
Performance endpoints are unauthenticated
Performance monitoring endpoints are not authenticated, so ops dashboards and knowledge base update pipelines can call them without credentials. In production, we recommend adding IP allowlists or authentication at the reverse proxy layer.
Observability
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/observability/circuit-breakers | GET | List all circuit breaker states | ❌ |
| /api/v1/observability/circuit-breakers/{name}/reset | POST | Manually reset a circuit breaker | ❌ |
| /api/v1/observability/alerts | GET | Query alert list (supports filtering) | ❌ |
| /api/v1/observability/health | GET | Health check report | ❌ |
| /api/v1/observability/token-usage | GET | Token usage statistics | ❌ |
Monitoring (Agent dimension)
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/monitor/overview | GET | System overview (total traces, success rate, active sessions) | ❌ |
| /api/v1/monitor/traces | GET | Recent trace list summary | ❌ |
| /api/v1/monitor/traces/{trace_id} | GET | Single trace details (including steps) | ❌ |
| /api/v1/monitor/agents | GET | Each Agent's current state | ❌ |
| /api/v1/monitor/sessions | GET | Active session list | ❌ |
Operations Management
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/operations/experiments | POST | Create an experiment (overwrites if it exists) | ❌ |
| /api/v1/operations/experiments | GET | List all experiments | ❌ |
| /api/v1/operations/experiments/{name}/results | GET | Query experiment results | ❌ |
| /api/v1/operations/experiments/{name}/metrics | POST | Record one metric | ❌ |
| /api/v1/operations/dashboard | GET | Operations dashboard (30s cache) | ❌ |
| /api/v1/operations/release-checklist | GET | Go-live checklist report | ❌ |
Escalation
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/escalation/solution | POST | Human-entered solution (pending review) | ✅ |
| /api/v1/escalation/solutions/pending | GET | List pending solutions | ✅ |
| /api/v1/escalation/solutions/{solution_id}/approve | POST | Approve and ingest as a FAQ | ✅ |
Health Check
| Endpoint | Method | Description | Auth |
|---|---|---|---|
| /api/v1/health | GET | Service health status (liveness probe) | ❌ |
Endpoint Details
Chat Module
POST /api/v1/chat
Synchronous chat endpoint; returns a full ChatResponse after multi-agent orchestration.
Request body (ChatRequest):
| Field | Type | Required | Description |
|---|---|---|---|
| message | string | ✅ | User message |
| session_id | string | ❌ | Session ID; auto-created when empty |
| channel | string | ❌ | Channel: web/app/wechat/dingtalk/api; default web |
| user_id | string | ❌ | User ID |
Response body (ChatResponse):
{
"session_id": "sess-abc123",
"reply": "Hello, I have found the relevant content for you...",
"status": "ok",
"data": {
"intent": "knowledge_qa",
"sources": [{"source": "product_manual.md", "score": 0.92}],
"escalate_to_human": false,
"escalation_card": null,
"turn_count": 1,
"failed_attempts": 0,
"emotion_score": 0.8,
"sub_tasks": []
}
}
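A request matching the ChatRequest table can be assembled as in the sketch below. It builds the request without sending it, using only the standard library; the function name build_chat_request, the base URL, and the key value are illustrative assumptions.

```python
import json
import urllib.request

ALLOWED_CHANNELS = {"web", "app", "wechat", "dingtalk", "api"}  # from the ChatRequest table


def build_chat_request(base_url, message, api_key=None, session_id=None, channel="web"):
    """Build (but do not send) a POST request for /api/v1/chat."""
    if not message:
        raise ValueError("message is required")
    if channel not in ALLOWED_CHANNELS:
        raise ValueError(f"unsupported channel: {channel}")
    payload = {"message": message, "channel": channel}
    if session_id:  # omit it to let the server create a session
        payload["session_id"] = session_id
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["X-API-Key"] = api_key
    return urllib.request.Request(
        f"{base_url}/api/v1/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_chat_request("http://localhost:8000", "Hello", api_key="my-secret-key")
print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen and decoding the body with json.load would then yield the ChatResponse structure.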
POST /api/v1/chat/stream
SSE streaming chat; event sequence meta → token (multiple) → done.
Event format: event: <type>\ndata: <json>\n\n
| Event Type | data Fields | Description |
|---|---|---|
| meta | intent, sources, escalate? | Metadata; the first event |
| token | content | Streaming token chunk |
| done | turn_count, escalate, answer | Full answer and escalation flag |
| error | message | Error event (HTTP still 200) |
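A client has to split the stream on blank lines and check for the in-band error event, since the HTTP status stays 200. The sketch below parses the event format described above from an iterable of text lines; parse_sse is a hypothetical helper, and the sample payload values are made up for illustration.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_type, data) pairs from an iterable of SSE text lines."""
    event_type, data_parts = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "" and event_type is not None:
            # A blank line terminates the current event.
            yield event_type, json.loads("\n".join(data_parts))
            event_type, data_parts = None, []


sample = (
    'event: meta\ndata: {"intent": "knowledge_qa"}\n\n'
    'event: token\ndata: {"content": "Hel"}\n\n'
    'event: token\ndata: {"content": "lo"}\n\n'
    'event: done\ndata: {"turn_count": 1, "answer": "Hello"}\n\n'
)
answer = ""
for event, data in parse_sse(sample.splitlines()):
    if event == "token":
        answer += data["content"]
    elif event == "error":  # errors arrive in-band because HTTP stays 200
        raise RuntimeError(data["message"])
print(answer)
```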
Nginx buffering configuration
The streaming endpoint already sets the X-Accel-Buffering: no response header. If responses are still buffered behind Nginx or a CDN, disable buffering explicitly for this location in the Nginx config with the proxy_buffering off; directive.
Agent Assist
GET /api/v1/agent/sessions/pending
Returns all pending sessions, sorted by EscalationPriority descending.
Response body: List[AgentSessionSummary]
[
{
"session_id": "sess-abc",
"user_id": "u-001",
"channel": "web",
"agent_status": "pending",
"priority": "urgent",
"turn_count": 3,
"created_at": "2026-07-03T08:00:00+00:00"
}
]
GET /api/v1/agent/sessions/{session_id}
Returns session details, including the EscalationCard and full history. On a cache miss, the details are rebuilt on the fly.
Path parameters:
| Parameter | Type | Description |
|---|---|---|
| session_id | string | Session ID |
Status codes:
| Status Code | Description |
|---|---|
| 200 | Success |
| 404 | Session does not exist |
POST /api/v1/agent/sessions/{session_id}/accept
Agent accepts the session. CAS ensures only one concurrent acceptance succeeds.
Request body (AcceptRequest, optional):
| Field | Type | Description |
|---|---|---|
| agent_id | string | Agent ID |
Status codes:
| Status Code | Description |
|---|---|
| 200 | Acceptance succeeded |
| 404 | Session does not exist |
| 409 | Session is already assigned/resolved; cannot accept |
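The compare-and-set (CAS) behaviour can be pictured with the in-memory sketch below. It is not the project's implementation: SessionStore and its lock-based storage are assumptions chosen to show why exactly one of two concurrent acceptances gets 200 and the other gets 409.

```python
import threading


class SessionStore:
    """In-memory stand-in for the session state store (hypothetical)."""

    def __init__(self):
        self._status = {}
        self._lock = threading.Lock()

    def create(self, session_id):
        with self._lock:
            self._status[session_id] = "pending"

    def compare_and_set(self, session_id, expected, new):
        """Atomically move expected -> new; return an HTTP-style status code."""
        with self._lock:
            current = self._status.get(session_id)
            if current is None:
                return 404  # session does not exist
            if current != expected:
                return 409  # another caller already changed the state
            self._status[session_id] = new
            return 200


store = SessionStore()
store.create("sess-abc")
print(store.compare_and_set("sess-abc", "pending", "assigned"))  # 200, first agent wins
print(store.compare_and_set("sess-abc", "pending", "assigned"))  # 409, already assigned
```

The same helper covers the resolve transition by passing "assigned" as the expected state and "resolved" as the new one.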
POST /api/v1/agent/sessions/{session_id}/messages
Agent sends a message, appended to history.
Request body (AgentMessageRequest):
| Field | Type | Required | Description |
|---|---|---|---|
| content | string | ✅ | Message content |
Response body (AgentMessageResponse):
Status codes:
| Status Code | Description |
|---|---|
| 200 | Sent successfully |
| 404 | Session does not exist |
| 409 | Current state is not assigned; cannot send |
POST /api/v1/agent/sessions/{session_id}/knowledge-recommend
Knowledge recommendation assist; reuses HybridRetriever.retrieve.
Request body (KnowledgeRecommendRequest):
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | ✅ | - | Query text |
| top_k | int | ❌ | 5 | Number of chunks to return |
Response body (KnowledgeRecommendResponse):
{
"chunks": [
{"content": "Return process...", "score": 0.92, "source": "return_policy.md"}
],
"total": 1
}
Retrieval failure fallback
If retrieval raises an exception, the endpoint degrades to an empty chunks list instead of returning an error, so the agent workbench is not interrupted.
POST /api/v1/agent/sessions/{session_id}/business-assist
Business query assist; reuses BusinessAgent.execute (with masking).
Request body (BusinessAssistRequest):
| Field | Type | Required | Description |
|---|---|---|---|
| query | string | ✅ | Business query text |
Response body (BusinessAssistResponse):
{
"result": {
"reply": "Your order #12345 has shipped...",
"data": {"order_id": "12345", "phone_masked": "138****8888"},
"error": null,
"need_confirmation": false,
"scene": "order_query"
},
"masked_fields": ["phone_masked"]
}
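The phone_masked value in the response suggests a mask that keeps the first three and last four digits. The sketch below reproduces that shape; the actual masking rules inside BusinessAgent are not documented here, so mask_phone and its short-input fallback are assumptions.

```python
def mask_phone(phone: str) -> str:
    """Keep the first 3 and last 4 characters; replace the middle with asterisks."""
    if len(phone) < 8:
        return "*" * len(phone)  # too short to reveal any part safely
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]


print(mask_phone("13800008888"))  # 138****8888
```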
POST /api/v1/agent/sessions/{session_id}/resolve
Marks the session as resolved. CAS guards assigned → resolved.
Request body (ResolveRequest, optional):
| Field | Type | Description |
|---|---|---|
| note | string | Resolution note |
Status codes:
| Status Code | Description |
|---|---|
| 200 | Marked successfully |
| 404 | Session does not exist |
| 409 | Current state is not assigned |
POST /api/v1/agent/sessions/{session_id}/solution
Records a human solution, consolidated as a FAQ candidate (enters the pending review queue).
Request body (SolutionSubmitRequest):
| Field | Type | Required | Description |
|---|---|---|---|
| question | string | ✅ | User question (min_length=1) |
| solution | string | ✅ | Solution (min_length=1) |
| intent | string | ❌ | Intent label; recognized by the system when empty |
Response body (HumanSolutionRecord):
{
"solution_id": "sol-uuid",
"session_id": "sess-abc",
"question": "How do I return a product?",
"solution": "Please click Return on the order page...",
"intent": "knowledge_qa",
"status": "pending",
"created_at": "2026-07-03T08:30:00+00:00"
}
Status codes:
| Status Code | Description |
|---|---|
| 200 | Recorded successfully |
| 404 | Session does not exist |
| 422 | Empty question/solution triggers validation failure |
Knowledge Base Management
POST /api/v1/knowledge/ingest
Upload a document for ingestion (multipart).
Request body (form-data):
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| file | file | ✅ | - | File to ingest |
| product_category | string | ❌ | unknown | Product category |
| applicable_version | string | ❌ | latest | Applicable version |
| knowledge_type | string | ❌ | - | faq/policy/doc/tutorial/ticket |
| published_at | string | ❌ | - | Publish time (ISO 8601) |
| register | bool | ❌ | false | Whether to register to the document registry |
| validate_quality | bool | ❌ | false | Whether to run quality checks at ingestion |
Response body (IngestResult):
{
"doc_id": "doc-uuid",
"source": "return_policy.md",
"chunk_count": 12,
"status": "success",
"errors": []
}
GET /api/v1/knowledge/stats
Returns knowledge base statistics.
Response body (KnowledgeStats):
{
"total_chunks": 1280,
"total_documents": 15,
"collection_name": "knowledge_base",
"embedding_dim": 1024
}
GET /api/v1/knowledge/documents
Paginated list of registered documents.
Query parameters:
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| limit | int | 20 | 1–200 | Page size |
| offset | int | 0 | ≥0 | Starting offset |
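The paging parameters can be validated on the client before a request is made. The sketch below checks the documented ranges and builds the query string; documents_query and page_offsets are hypothetical helpers, and the total used in the example is an invented number.

```python
from urllib.parse import urlencode


def documents_query(limit: int = 20, offset: int = 0) -> str:
    """Validate paging values against the documented ranges and build the URL path."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if offset < 0:
        raise ValueError("offset must be >= 0")
    return "/api/v1/knowledge/documents?" + urlencode({"limit": limit, "offset": offset})


def page_offsets(total: int, limit: int = 20):
    """Offsets needed to walk `total` documents page by page."""
    return list(range(0, total, limit))


print(documents_query(limit=50, offset=100))
print(page_offsets(45))
```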
GET /api/v1/knowledge/documents/{doc_id}
Query a single document's details, including full version history.
Response body (DocumentDetail):
{
"doc_id": "doc-uuid",
"source": "return_policy.md",
"current_version": "v2",
"status": "active",
"created_at": "2026-06-01T00:00:00+00:00",
"updated_at": "2026-07-01T00:00:00+00:00",
"versions": [
{"version": "v1", "doc_hash": "abc", "status": "superseded", "chunk_count": 10, "created_at": "..."},
{"version": "v2", "doc_hash": "def", "status": "active", "chunk_count": 12, "created_at": "..."}
]
}
Document not found
When the document does not exist, an empty detail with status: "not_found" is returned; HTTP remains 200, and the frontend prompts accordingly.
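Because the HTTP status stays 200, a client cannot rely on the status code here and has to inspect the body. A minimal sketch, with unwrap_document as a hypothetical helper name:

```python
from typing import Optional


def unwrap_document(detail: dict) -> Optional[dict]:
    """Return the document detail, or None when the API reports not_found."""
    if detail.get("status") == "not_found":
        return None
    return detail


print(unwrap_document({"doc_id": "doc-uuid", "status": "not_found"}))
print(unwrap_document({"doc_id": "doc-uuid", "status": "active"}))
```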
DELETE /api/v1/knowledge/documents/{doc_id}
Delete a document by doc_id; removes all chunks of the document from the vector store.
Response body (DeleteResult):
POST /api/v1/knowledge/quality/check
Run a batch quality inspection on ingested content.
Request body (QualityCheckRequest):
| Field | Type | Description |
|---|---|---|
| source | string | Filter by source (optional) |
| doc_id | string | Filter by doc_id (optional) |
Response body (QualityReport): includes inspection results such as duplicate chunks, terminology hit rate, and sensitive word hits.
POST /api/v1/knowledge/documents/{doc_id}/rollback
Roll back a document to a specified version.
Request body (RollbackRequest):
| Field | Type | Description |
|---|---|---|
| target_version | string | Target version number |
POST /api/v1/knowledge/canary/ingest and /api/v1/knowledge/canary/compare
Two canary verification endpoints; both use CanaryRequest as the request body:
| Field | Type | Description |
|---|---|---|
| doc_id | string | Document ID |
| version | string | Target version |
| sample_queries | list[string] | Sample queries for comparison |
Document Updates
POST /api/v1/update/full
Trigger a full update (monthly).
Request body (UpdateRequest):
| Field | Type | Description |
|---|---|---|
| dir_path | string | Directory to scan |
| extensions | list[string] | File extension filter |
POST /api/v1/update/incremental
Trigger an incremental update (weekly); processes only new files or files whose hash changed.
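The selection rule (new files, or files whose hash changed) can be sketched as below. SHA-256 and the in-memory dictionaries are assumptions for illustration; the text does not say which hash function or registry the updater actually uses.

```python
import hashlib
from typing import Dict, List


def content_hash(data: bytes) -> str:
    """Hex digest of the file content (SHA-256 assumed for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def files_to_process(current: Dict[str, bytes], known_hashes: Dict[str, str]) -> List[str]:
    """Return paths that are new or whose content hash differs from the last run."""
    changed = []
    for path, data in sorted(current.items()):
        if known_hashes.get(path) != content_hash(data):
            changed.append(path)
    return changed


known = {"faq.md": content_hash(b"old faq"), "policy.md": content_hash(b"policy")}
current = {"faq.md": b"new faq", "policy.md": b"policy", "guide.md": b"guide"}
print(files_to_process(current, known))  # ['faq.md', 'guide.md']
```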
POST /api/v1/update/file
Single-file real-time update.
Request body (UpdateSingleFileRequest):
| Field | Type | Description |
|---|---|---|
| file_path | string | Absolute file path |
| metadata | dict | Metadata overrides |
GET /api/v1/update/status
Returns the most recent update result; empty if never run.
Retrieval Evaluation
POST /api/v1/evaluation/run
Trigger an evaluation run.
Request body (EvaluationRunRequest):
| Field | Type | Default | Description |
|---|---|---|---|
| testset_path | string | Built-in 30 cases | External test set path |
| top_k | int | rerank_top_k | Retrieval top_k |
Response body (EvaluationReport): includes report_id, total_queries, recall@5, hit_rate, and other metrics.
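For readers unfamiliar with the two named metrics, the sketch below uses their common textbook definitions. Whether EvaluationReport computes them exactly this way is an assumption, and the function names and sample data are illustrative.

```python
from typing import List, Sequence, Set, Tuple


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of relevant items found in the top-k retrieved results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def hit_rate(runs: List[Tuple[Sequence[str], Set[str]]], k: int = 5) -> float:
    """Fraction of queries with at least one relevant item in the top k."""
    if not runs:
        return 0.0
    hits = sum(1 for retrieved, relevant in runs if set(retrieved[:k]) & relevant)
    return hits / len(runs)


runs = [
    (["a", "b", "c"], {"a", "z"}),  # one of two relevant items retrieved
    (["x", "y"], {"q"}),            # no relevant item retrieved
]
print(recall_at_k(*runs[0]))  # 0.5
print(hit_rate(runs))         # 0.5
```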
GET /api/v1/evaluation/reports
List historical report summaries, sorted by time descending.
GET /api/v1/evaluation/reports/{report_id}
Query a single report's details; returns 404 if it does not exist.
Performance Monitoring
GET /api/v1/performance/metrics
Returns comprehensive performance metrics.
Response body (MetricsResponse):
{
"metrics": {
"cache_hit_rate": 0.65,
"concurrent_requests": 3,
"model_router_stats": {"small_llm": 120, "main_llm": 30},
"avg_response_ms": 1850
}
}
GET /api/v1/performance/cache/stats
Returns hot cache statistics: hit/miss counts, hit rate, current entry count, LRU evictions.
POST /api/v1/performance/cache/invalidate
Clear the hot cache (call after knowledge base updates).
Response body (InvalidateResult):
Must call after knowledge base updates
After ingesting/deleting/rolling back knowledge base entries, you must call this endpoint to clear the cache; otherwise HotQueryCache will return stale replies.
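The reason for the rule is easiest to see with a toy LRU cache. The sketch below is not HotQueryCache itself: HotCacheSketch, its capacity, and the sample query are assumptions that only mirror the statistics named above (hits, misses, evictions).

```python
from collections import OrderedDict


class HotCacheSketch:
    """Tiny LRU cache that tracks hits, misses, and evictions."""

    def __init__(self, capacity: int = 2):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = self.misses = self.evictions = 0

    def get(self, query):
        if query in self.entries:
            self.entries.move_to_end(query)  # mark as most recently used
            self.hits += 1
            return self.entries[query]
        self.misses += 1
        return None

    def put(self, query, reply):
        self.entries[query] = reply
        self.entries.move_to_end(query)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
            self.evictions += 1

    def invalidate(self) -> int:
        """Drop every entry and return how many were removed."""
        removed = len(self.entries)
        self.entries.clear()
        return removed


cache = HotCacheSketch()
cache.put("return policy", "old answer")
print(cache.get("return policy"))  # served from cache, possibly stale
cache.invalidate()                 # what the endpoint does after an update
print(cache.get("return policy"))  # None: a miss, so the reply is recomputed
```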
Observability
GET /api/v1/observability/circuit-breakers
Returns a dict of name → CircuitBreakerStats, including state, failure count, and last failure time.
POST /api/v1/observability/circuit-breakers/{name}/reset
Manually reset a specified circuit breaker to CLOSED. Returns 404 if the breaker does not exist.
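A simplified picture of the breaker state and the manual reset is sketched below. The class, the failure threshold, and the breaker name "llm" are all hypothetical; the real CircuitBreakerStats may carry more fields and states.

```python
import time


class CircuitBreakerSketch:
    """Opens after a number of failures; can be reset manually."""

    def __init__(self, failure_threshold: int = 3):
        self.failure_threshold = failure_threshold
        self.state = "CLOSED"
        self.failure_count = 0
        self.last_failure_time = None

    def record_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()
        if self.failure_count >= self.failure_threshold:
            self.state = "OPEN"  # stop calling the failing dependency

    def reset(self):
        """Manual reset: back to CLOSED with a clean failure count."""
        self.state = "CLOSED"
        self.failure_count = 0

    def stats(self):
        return {"state": self.state, "failure_count": self.failure_count,
                "last_failure_time": self.last_failure_time}


breakers = {"llm": CircuitBreakerSketch()}


def reset_breaker(name: str) -> int:
    """Return an HTTP-style status: 404 for unknown breakers, else 200."""
    breaker = breakers.get(name)
    if breaker is None:
        return 404
    breaker.reset()
    return 200
```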
GET /api/v1/observability/alerts
Query the alert list; supports level, source, and since filters.
Query parameters:
| Parameter | Type | Description |
|---|---|---|
| level | string | info/warn/error/critical |
| source | string | Source, e.g. token_usage |
| since | string | ISO 8601 start time |
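The filters can be combined into a query string as in the sketch below, which also shows one way to produce a timestamp in the UTC format used throughout this document. alerts_query is a hypothetical helper; the level check mirrors the 400 response listed for values outside the enum.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode

ALERT_LEVELS = {"info", "warn", "error", "critical"}


def alerts_query(level: Optional[str] = None, source: Optional[str] = None,
                 since: Optional[datetime] = None) -> str:
    """Build the query string for GET /api/v1/observability/alerts."""
    params = {}
    if level is not None:
        if level not in ALERT_LEVELS:
            raise ValueError(f"unknown alert level: {level}")  # the server would answer 400
        params["level"] = level
    if source is not None:
        params["source"] = source
    if since is not None:
        # Normalize to UTC so the value matches the documented time format.
        params["since"] = since.astimezone(timezone.utc).isoformat()
    return urlencode(params)


since = datetime(2026, 7, 3, 8, 30, tzinfo=timezone.utc)
print(alerts_query(level="error", source="token_usage", since=since))
```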
GET /api/v1/observability/health
Run all health checks and return an aggregate report.
GET /api/v1/observability/token-usage
Returns token usage statistics for the specified window.
Query parameters:
| Parameter | Type | Default | Values |
|---|---|---|---|
| window | string | hour | minute/hour/day |
Monitoring (Agent dimension)
GET /api/v1/monitor/overview
Returns a system overview: total traces, success rate, average duration, active sessions.
GET /api/v1/monitor/traces
Returns recent trace list summary (without steps).
Query parameters:
| Parameter | Type | Default | Range |
|---|---|---|---|
| limit | int | 50 | 1–200 |
GET /api/v1/monitor/traces/{trace_id}
Returns a single trace's details (including steps and sub_tasks). Returns 404 if not found.
GET /api/v1/monitor/agents
Returns each Agent's current state (including uncalled agents with a call count of 0).
GET /api/v1/monitor/sessions
Returns the active session list, sorted by last-active time descending.
Operations Management
POST /api/v1/operations/experiments
Create an experiment; if it already exists, it is overwritten and historical metrics are cleared. Returns 201.
Request body (CreateExperimentRequest): includes experiment name, variant definitions, and traffic split.
GET /api/v1/operations/experiments
List all experiments.
GET /api/v1/operations/experiments/{name}/results
Query experiment results; returns 404 if the experiment does not exist.
POST /api/v1/operations/experiments/{name}/metrics
Record one experiment metric; returns 204. Recording is allowed even if the experiment does not exist, for replay.
GET /api/v1/operations/dashboard
Returns aggregated dashboard data, cached for 30 seconds. force_refresh=true bypasses the cache.
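The caching behaviour amounts to a time-to-live cache with a bypass flag. The sketch below illustrates the idea under stated assumptions: DashboardCache, the injectable clock, and the counter used as the computed value are hypothetical.

```python
import time


class DashboardCache:
    """Cache one computed value for ttl seconds (30 s in the description above)."""

    def __init__(self, compute, ttl: float = 30.0, clock=time.monotonic):
        self.compute, self.ttl, self.clock = compute, ttl, clock
        self._value = None
        self._expires_at = None

    def get(self, force_refresh: bool = False):
        now = self.clock()
        if force_refresh or self._expires_at is None or now >= self._expires_at:
            self._value = self.compute()
            self._expires_at = now + self.ttl
        return self._value


calls = []
cache = DashboardCache(lambda: calls.append(1) or len(calls))
print(cache.get())                    # 1: computed
print(cache.get())                    # 1: served from cache
print(cache.get(force_refresh=True))  # 2: recomputed despite a fresh cache
```

Passing the clock as a parameter keeps the expiry logic testable without sleeping.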
GET /api/v1/operations/release-checklist
Runs the go-live checklist and returns a report. Each check runs independently; a failure does not interrupt the others.
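Isolating each check usually means catching exceptions per check and recording them in the report. A minimal sketch of that pattern follows; run_checklist, the check names, and the report shape are invented for illustration.

```python
from typing import Callable, Dict


def run_checklist(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run every check; an exception in one is recorded, not propagated."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = {"passed": bool(check()), "error": None}
        except Exception as exc:  # isolate failures so later checks still run
            results[name] = {"passed": False, "error": str(exc)}
    return {"passed": all(r["passed"] for r in results.values()), "checks": results}


def broken_check() -> bool:
    raise RuntimeError("vector store unreachable")


report = run_checklist({
    "api_key_set": lambda: True,
    "vector_store": broken_check,
    "eval_report_exists": lambda: True,
})
print(report["passed"], sorted(report["checks"]))
```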
Escalation
POST /api/v1/escalation/solution
A human agent enters a solution (pending review).
Request body (HumanSolutionRequest):
| Field | Type | Required | Description |
|---|---|---|---|
| session_id | string | ✅ | Session ID |
| question | string | ✅ | User question |
| solution | string | ✅ | Solution |
| intent | string | ❌ | Intent label |
GET /api/v1/escalation/solutions/pending
List all pending solutions.
POST /api/v1/escalation/solutions/{solution_id}/approve
Approve a solution and ingest it as a FAQ. Returns 404 if the solution does not exist or ingestion fails.
Closed-loop value
After approval, the next time the bot retrieves a similar question, it can match this solution, forming the closed loop of "human handles → consolidate → bot answers next time".
Health Check
GET /api/v1/health
Liveness probe endpoint; not authenticated.
Response body (HealthResponse):
{
"status": "ok",
"app": "Intelligent Customer Service System",
"version": "0.1.0",
"timestamp": "2026-07-03T08:30:00+00:00"
}
Appendix
Endpoint Authentication Summary
flowchart LR
A[Request] --> B{API_KEY non-empty?}
B -- No --> C[Development mode: no auth]
B -- Yes --> D{Endpoint requires auth?}
D -- No --> E[Pass through directly<br/>Performance/Observability/Monitor/Operations/Health]
D -- Yes --> F{X-API-Key matches?}
F -- Yes --> E
F -- No --> G[401 Unauthorized]
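The decision flow in the chart can be written as a small function. This is a sketch of the logic only, not the code of verify_api_key: the function name authorize and the use of hmac.compare_digest for a constant-time comparison are assumptions.

```python
import hmac
from typing import Optional


def authorize(configured_key: str, endpoint_requires_auth: bool,
              provided_key: Optional[str]) -> int:
    """Return 200 when the request may proceed, 401 otherwise."""
    if not configured_key:
        return 200  # development mode: no auth
    if not endpoint_requires_auth:
        return 200  # performance/observability/monitor/operations/health
    if provided_key is not None and hmac.compare_digest(
            provided_key.encode(), configured_key.encode()):
        return 200
    return 401


print(authorize("secret", True, "secret"))  # 200
print(authorize("secret", True, None))      # 401
```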
OpenAPI Documentation
FastAPI auto-generated interactive documentation:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc