
Intelligent Customer Service System Integrated with Enterprise Knowledge Base

An enterprise-grade intelligent customer service system powered by dual engines: "Multi-Agent Collaboration + RAG Knowledge Enhancement"

Multi-Agent Collaboration · Hybrid Retrieval + RAG · Avg Response ≤ 3s · Agent Assist Workbench · Langfuse Observability



Core Capabilities

Multi-Agent Collaboration

Built on LangGraph with a "1+5" architecture: one orchestration Agent coordinates five specialized Agents (Knowledge Retrieval / Business Query / Sentiment Analysis / Ticket Processing / Dialog Generation). When LangGraph is unavailable, the system automatically falls back to synchronous orchestration.
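The fallback from graph-based to synchronous orchestration can be sketched as follows. This is a minimal illustration, not the project's implementation: the agent functions, the keyword-based routing rule, and the `graph_runner` parameter are all hypothetical stand-ins, and no LangGraph API is called.

```python
from typing import Callable, Dict, Optional

State = Dict[str, str]
Agent = Callable[[State], State]

# Hypothetical stand-ins for the specialized Agents.
def sentiment_agent(state: State) -> State:
    negative = any(w in state["query"].lower() for w in ("angry", "terrible"))
    return {**state, "sentiment": "negative" if negative else "neutral"}

def knowledge_agent(state: State) -> State:
    return {**state, "context": "kb:" + state["query"]}

def business_agent(state: State) -> State:
    return {**state, "context": "order-status:shipped"}

def ticket_agent(state: State) -> State:
    return {**state, "ticket": "created"}

def dialog_agent(state: State) -> State:
    return {**state, "answer": f"Based on {state['context']}"}

def run_sync(state: State) -> State:
    """Synchronous orchestration: call the agents one after another."""
    state = sentiment_agent(state)
    # Assumed routing rule: order questions go to the business agent.
    worker = business_agent if "order" in state["query"].lower() else knowledge_agent
    state = worker(state)
    if state["sentiment"] == "negative":
        state = ticket_agent(state)
    return dialog_agent(state)

def orchestrate(state: State, graph_runner: Optional[Agent] = None) -> State:
    """Prefer the graph runner; fall back to the synchronous path on any failure."""
    if graph_runner is not None:
        try:
            return graph_runner(state)
        except Exception:
            pass  # graph engine unavailable or failed: degrade instead of erroring
    return run_sync(state)
```

Passing `graph_runner=None` (or a runner that raises) exercises the synchronous path, so the caller gets an answer either way.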

Learn the Architecture

Hybrid Retrieval + RAG

Query rewriting → vector retrieval + BM25 dual-path recall → RRF fusion → Reranker reranking → LLM generation. When similarity falls below the threshold, the system declines to answer rather than forcing one. Measured Recall@5 = 1.0.
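As an illustration of the fusion step, here is a minimal sketch of Reciprocal Rank Fusion (RRF) over two ranked lists, followed by a threshold check. The constant `k = 60` is the value commonly used for RRF; the `should_answer` helper and its `0.5` threshold are assumptions for illustration, not values taken from this project.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank)) over the lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

def should_answer(rerank_scores: Sequence[float], threshold: float = 0.5) -> bool:
    """Hypothetical guard: answer only if the best reranked score clears the threshold."""
    return bool(rerank_scores) and max(rerank_scores) >= threshold

vector_hits = ["d1", "d2", "d3"]  # dense retrieval order
bm25_hits = ["d3", "d1", "d4"]    # lexical retrieval order
fused = rrf_fuse([vector_hits, bm25_hits])
```

Documents that rank well in both lists (`d1`, `d3`) rise above those found by only one path, which is the point of dual-path recall.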

View Retrieval Pipeline

Agent Assist Workbench

After escalation, human agents can take over a session, with support for context continuation, knowledge- and business-assisted queries, and archiving solutions back into the knowledge base. Eight API endpoints close the human-AI collaboration loop.

View Agent Endpoints

Performance Optimization

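Four optimizations reduce latency: HotQueryCache caches answers on the hot path, ModelRouter routes requests between large and small models, IntentCache reuses results for repeated intents, and a fast path shortcuts intent recognition. First-token latency for knowledge Q&A is under 1s.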
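The sketch below shows one way a hot-query cache and a large/small model router could work. Only the class names `HotQueryCache` and `ModelRouter` come from this page; the TTL eviction, the query normalization, and the word-count routing rule are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple

class HotQueryCache:
    """Assumed design: normalized query -> answer, with a time-to-live."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def _key(query: str) -> str:
        # Collapse case and whitespace so trivially different queries share an entry.
        return " ".join(query.lower().split())

    def get(self, query: str) -> Optional[str]:
        key = self._key(query)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, answer = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return answer

    def put(self, query: str, answer: str) -> None:
        self._store[self._key(query)] = (self._clock() + self._ttl, answer)

class ModelRouter:
    """Assumed rule: short queries go to the small model, longer ones to the large model."""

    def __init__(self, max_small_words: int = 12) -> None:
        self.max_small_words = max_small_words

    def route(self, query: str) -> str:
        return "small" if len(query.split()) <= self.max_small_words else "large"
```

A cache hit skips retrieval and generation entirely, while the router keeps the large model for queries that are more likely to need it.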

View Performance Optimization

Langfuse Observability

All 11 LLM call points are tagged with a prompt name and version. Trace visualization covers the full chain. Token usage, cost, and latency are reported automatically, and the system degrades automatically when Langfuse is not configured.
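A minimal sketch of tagging a call point and degrading when tracing is not configured might look like this. It does not call the Langfuse SDK: the `sink` list is a hypothetical stand-in for a tracing client, and the `traced` decorator and `tracing_enabled` helper are illustrative names. The environment variable names `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` are the ones Langfuse documents for its credentials.

```python
import os
import time
from functools import wraps
from typing import Any, Callable, Dict, List, Mapping

def tracing_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Tracing is on only when both Langfuse credentials are present."""
    return bool(env.get("LANGFUSE_PUBLIC_KEY") and env.get("LANGFUSE_SECRET_KEY"))

def traced(prompt_name: str, prompt_version: str,
           sink: List[Dict[str, Any]], enabled: bool) -> Callable:
    """Tag a call point with prompt name/version and record its latency."""
    def decorator(fn: Callable) -> Callable:
        if not enabled:
            return fn  # degrade: leave the function untouched, no tracing overhead

        @wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                sink.append({
                    "prompt_name": prompt_name,
                    "prompt_version": prompt_version,
                    "latency_s": time.perf_counter() - start,
                })
        return wrapper
    return decorator
```

Because the disabled branch returns the original function, the main path behaves identically with or without tracing.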

View Observability

Fallback Strategy

Seven-layer fallback for LLM / BGE / LangGraph / Redis / Business API / Langfuse ensures availability. The main path never blocks.
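The general pattern behind layered fallback can be sketched as a chain of attempts with a final default. The helper name `first_success` and the example steps are hypothetical; the project's actual layers and their order are not shown on this page.

```python
import logging
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
logger = logging.getLogger("fallback")

def first_success(steps: Sequence[Callable[[], T]], default: T) -> T:
    """Try each step in order; return the first result, or `default` if all fail."""
    for step in steps:
        try:
            return step()
        except Exception as exc:
            # Log and move on so a failing dependency never blocks the main path.
            logger.warning("step %s failed: %s", getattr(step, "__name__", step), exc)
    return default

def primary_llm() -> str:
    raise TimeoutError("primary model timed out")  # simulated outage

def cached_answer() -> str:
    return "answer from cache"

reply = first_success([primary_llm, cached_answer], default="Please hold for an agent.")
```

Ordering the steps from best to cheapest means quality degrades gradually, and the default guarantees the caller always receives a response.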

View Fallback Strategy


Performance Metrics

Validated under a real DeepSeek LLM + BGE embedding environment:

| Metric | Target | Actual | Pass |
|---|---|---|---|
| Recall@5 | ≥ 0.85 | 1.0 | Yes |
| Hit Rate | ≥ 0.90 | 0.9333 | Yes |
| Hallucination Rate | ≤ 0.10 | 0.0 | Yes |
| Independent Resolution Rate | ≥ 60% | 80% | Yes |
| Avg Response Time | ≤ 3s | 2.27s | Yes |
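For readers unfamiliar with the retrieval metrics, the sketch below computes Recall@k and hit rate using their common definitions. The project's exact evaluation code and dataset are not shown here, so treat the function names and definitions as assumptions.

```python
from typing import List, Sequence, Tuple

def recall_at_k(retrieved: Sequence[str], relevant: Sequence[str], k: int = 5) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    return len(set(retrieved[:k]) & relevant_set) / len(relevant_set)

def hit_rate(results: List[Tuple[Sequence[str], Sequence[str]]], k: int = 5) -> float:
    """Fraction of queries whose top-k results contain at least one relevant document."""
    if not results:
        return 0.0
    hits = sum(1 for retrieved, relevant in results
               if set(retrieved[:k]) & set(relevant))
    return hits / len(results)
```

Under these definitions, a hit rate of 0.9333 would correspond to, for example, 14 of 15 queries having at least one relevant document in their top-k results.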

Project Structure Overview

app/
├── api/v1/              # Access Layer: REST API endpoints
│   ├── chat.py          # Chat endpoint (sync + SSE streaming)
│   ├── agent.py         # Agent assist endpoints (8)
│   └── ...              # Knowledge base / evaluation / performance / observability / operations
├── agents/              # Agent collaboration layer
│   ├── orchestrator.py  # Orchestration Agent
│   ├── graph.py         # LangGraph state machine orchestration
│   └── ...              # 5 specialized Agents + LLMClient
├── core/                # Core infrastructure
│   ├── session.py       # Session management (incl. agent state)
│   ├── performance.py   # HotQueryCache / ModelRouter
│   └── langfuse_client.py   # Langfuse tracing
├── knowledge/           # Knowledge and data layer
│   ├── hybrid_retriever.py  # Hybrid retrieval
│   └── ...              # Reranker / version management / quality validation
└── schemas/             # Pydantic data models

tests/                   # 668+ test cases

Next Steps