How It Works

From Question to Verified Answer

For Decision Makers

Most AI tools match your question to similar text. That works for simple lookups — but falls apart on complex questions, multi-document reasoning, or anything requiring verification.

Courdx runs three retrieval strategies in parallel, validates every result, and cites every fact to the exact source sentence. You get answers you can trust — not answers you have to double-check.

For Technical Teams

The entire pipeline is orchestrated by 15+ LangGraph agentic workflows — not a monolithic prompt chain. Each stage is independently configurable, observable, and tuneable from the admin panel.

Semantic search (6 embedding models), Memgraph knowledge graph traversal, BM25 keyword matching, cross-encoder reranking, Corrective RAG, and RAGAS evaluation — all running in a production-grade pipeline.

The RAG Pipeline

Every Query. Five Stages, Nine Steps.

Nine steps grouped into five intelligent stages. Three retrieval strategies branch in parallel, then converge through fusion and validation.

Retrieval → Validation

01 Question: Natural language
02 Decompose: Break into sub-queries
03 Vector Search: Semantic similarity
04 Graph Traversal: Entity relationships
05 BM25 Keyword: Exact term match
06 RRF Fusion: Rank & merge results
07 Cross-Encoder: Deep relevance scoring
08 Validate: Self-correction
09 Cite & Respond: Verified answer

9-step pipeline orchestrated by LangGraph · 15+ agentic workflows

Data Flow

From Raw Document to Searchable Knowledge

Every document goes through ingestion, parsing, chunking, entity extraction, embedding, and indexing — fully automated.

Document Lifecycle:

1. Ingest: Documents, APIs, Databases
2. Parse & Chunk: Smart segmentation
3. Embed & Index: Vector + Graph storage
4. Retrieve: Multi-strategy search
5. Validate: Check & cite sources
6. Respond: Verified answers

Three Pillars of Intelligent Retrieval

Each pillar solves a fundamental problem that basic retrieval systems ignore.

Intelligent Retrieval

We Don't Just Search. We Understand.

Five layers of intelligence between your question and your answer: query decomposition breaks complex questions apart, HyDE predicts what ideal answers look like, multi-strategy search retrieves from vectors, graphs, and keywords simultaneously, cross-encoder reranking scores true relevance, and Corrective RAG self-corrects when confidence drops.
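The five layers can be pictured as a chain of plain functions. This is an illustrative sketch only: every name here (`decompose`, `hyde`, `multi_strategy_search`, `rerank`) is invented for the example, not Courdx's actual API, and the retrieval and reranking bodies are stubs.

```python
# Illustrative sketch of the five retrieval layers. All names are
# hypothetical -- this is not Courdx's actual API.

def decompose(question: str) -> list[str]:
    # Split a compound question into focused sub-queries (toy rule).
    return [q.strip() + "?" for q in question.rstrip("?").split(" and ")]

def hyde(sub_query: str) -> str:
    # HyDE: draft a hypothetical ideal answer and search with *it*
    # instead of the raw query (stubbed as a template here).
    return f"A document that answers: {sub_query}"

def multi_strategy_search(probe: str) -> list[str]:
    # Stand-in for parallel vector + graph + BM25 retrieval.
    return [f"doc matching '{probe}'"]

def rerank(query: str, docs: list[str]) -> list[str]:
    # Stand-in for cross-encoder relevance scoring.
    return sorted(docs)

def answer(question: str) -> list[str]:
    results = []
    for sub in decompose(question):
        docs = multi_strategy_search(hyde(sub))
        results.extend(rerank(sub, docs))
    return results

print(answer("What drove Q3 revenue and which contracts closed?"))
```

The point of the shape, not the stubs: each layer is a separate, swappable step, which is what makes the stages independently configurable.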

Query Decomposition · Predictive Embeddings (HyDE) · Multi-Strategy Search · Cross-Encoder Reranking

Knowledge Graph

See the Connections Others Miss

Courdx automatically extracts entities and relationships from your documents and stores them in a Memgraph knowledge graph. Community detection algorithms cluster related entities into groups — so you can ask big-picture questions like "what themes recur across our compliance reports?" that vector search can't touch.
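The clustering idea can be illustrated with a toy stdlib version. Production GraphRAG systems typically run community algorithms such as Leiden over the graph store; connected components over entity co-occurrence edges is a much cruder stand-in, used here only to show the grouping principle. The entities and edges below are invented examples.

```python
from collections import defaultdict

# Toy sketch of grouping related entities via connected components.
# Real community detection (e.g. Leiden) is more sophisticated; the
# entity names and edges here are invented for illustration.

edges = [
    ("GDPR", "Data Retention Policy"),
    ("Data Retention Policy", "Audit 2023"),
    ("Q3 Revenue", "Enterprise Contracts"),
]

def communities(edges):
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for node in graph:
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:                     # depth-first flood fill
            n = stack.pop()
            if n in group:
                continue
            group.add(n)
            stack.extend(graph[n] - group)
        seen |= group
        groups.append(group)
    return groups

for g in communities(edges):
    print(sorted(g))
```

Once entities are grouped, a big-picture question is answered against community summaries rather than individual chunks.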

Entity Extraction · Relationship Mapping · Community Detection · Cross-Document Reasoning

Trust & Citations

Every Fact. Every Source. Verified.

Every claim cites the exact sentence in the source document — not just "found in Document 3." Confidence scores tell you how strong each citation is. Input and output guardrails block prompt injection, PII leaks, and hallucinated content. Full audit trail for compliance.

Sentence-Level Citations · Confidence Scores · Input/Output Guardrails · Audit Trails

Multi-Dimensional Retrieval

Not Just Vector Search. Multi-Strategy Intelligence.

A single retrieval method always has blind spots. Vector search misses exact terms. Keyword search misses synonyms. Neither understands entity relationships.

Courdx runs all three simultaneously and fuses the results with Reciprocal Rank Fusion (RRF) — so you get the best of every approach in one ranked list.
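RRF itself is a small formula: each document scores the sum of 1/(k + rank) over every ranked list that returned it, with k conventionally set to 60. A minimal sketch, using toy document IDs:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1/(k + rank).
    k=60 is the constant from the original RRF formulation."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy result lists from the three strategies:
vector = ["doc_a", "doc_b", "doc_c"]
graph  = ["doc_b", "doc_d"]
bm25   = ["doc_c", "doc_b", "doc_a"]

print(rrf_fuse([vector, graph, bm25]))  # doc_b ranks first: all three lists returned it
```

Because RRF only looks at ranks, it fuses lists whose raw scores live on incompatible scales (cosine similarity, graph hops, BM25 weights) without any score normalization.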

Semantic Understanding

Vector embeddings capture meaning beyond keywords

Graph Reasoning

Memgraph knowledge graph reveals entity relationships

Keyword Precision

BM25 matching for codes, identifiers, and exact phrases

Query Intelligence

HyDE prediction + decomposition + smart routing

[Radar chart comparing Courdx Multi-Strategy with Traditional Search across six axes: Semantic Understanding, Keyword Matching, Entity Relationships, Multi-hop Reasoning, Context Retention, and Source Verification]

Step by Step

The Complete Journey

From natural language question to verified, cited answer — the five stages that power every query.

01

Your Question

Natural language query — plain English, no syntax required

02

Query Decomposition

Complex questions are broken into focused sub-queries. Each part gets its own retrieval pass.

  • Predicts ideal answer shape using HyDE (Hypothetical Document Embeddings)
  • Expands acronyms, synonyms, and domain-specific terms
  • Routes each sub-query to the optimal retrieval strategy
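The routing step in the last bullet can be sketched with simple heuristics. The rules below are invented examples for illustration, not Courdx's actual routing logic: quoted phrases and ticket-style identifiers go to BM25, explicit relationship wording goes to the graph, and everything else defaults to vector search.

```python
import re

# Hypothetical sub-query router -- the rules are invented heuristics.

def route(sub_query: str) -> str:
    # Quoted phrases or ID-like tokens (e.g. "ABC-123") need exact match.
    if re.search(r'"[^"]+"|\b[A-Z]{2,}-\d+\b', sub_query):
        return "bm25"
    # Explicit relationship questions suit graph traversal.
    if re.search(r"\b(relationship|between|connected|related)\b", sub_query, re.I):
        return "graph"
    # Default: semantic similarity.
    return "vector"

print(route('Find the clause containing "force majeure"'))   # bm25
print(route("How is the vendor related to the 2023 audit?")) # graph
print(route("Summarize our retention policy"))               # vector
```

A production router would typically be an LLM or classifier rather than regexes, but the contract is the same: one strategy label per sub-query.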
03

Multi-Strategy Retrieval

Three search strategies run in parallel — each catching what the others miss.

  • Vector search: semantic similarity via 6 embedding models including Qwen3, Nomic, MxBai, and Snowflake Arctic
  • Knowledge graph: Memgraph traversal for entity relationships across documents
  • BM25 keyword: precision matching for codes, names, identifiers, and exact phrases
  • Reciprocal Rank Fusion (RRF) merges all results into one optimized ranking
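For the BM25 leg, the scoring function is standard and small enough to sketch in stdlib Python. The parameters k1 and b are the common defaults; the corpus and query are toy examples, and a real deployment would use the search engine's built-in implementation rather than this.

```python
import math
from collections import Counter

# Minimal BM25 scorer for illustration; k1/b are the usual defaults.

def bm25_scores(query, docs, k1=1.5, b=0.75):
    toks = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in toks) / len(toks)
    n = len(docs)
    # document frequency: how many docs contain each term
    df = Counter(term for t in toks for term in set(t))
    scores = []
    for t in toks:
        tf = Counter(t)
        s = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(t) / avgdl))
        scores.append(s)
    return scores

docs = ["invoice INV-4417 overdue",
        "quarterly revenue summary",
        "INV-4417 payment received"]
print(bm25_scores("INV-4417", docs))  # only the docs containing the ID score > 0
```

This is exactly the blind spot BM25 covers: an identifier like `INV-4417` has no useful embedding neighborhood, but term matching finds it precisely.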
04

Intelligent Validation

Every result is scored, reranked, and validated before it reaches the LLM.

  • Cross-encoder model scores each query-document pair for true relevance (not just similarity)
  • Low confidence triggers Corrective RAG — automatic query refinement and retry
  • Falls back to web search if internal sources are insufficient
  • RAGAS evaluation framework measures faithfulness, relevance, and answer quality
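The Corrective RAG control flow described above, reduced to its skeleton: score the retrieved set, refine the query and retry when the best score falls below a threshold, and fall back to web search when retries run out. Every component here is a stub and every name is illustrative; the threshold and retry count are invented values.

```python
# Skeleton of a Corrective RAG loop. All components are stubbed;
# the threshold and retry budget are illustrative, not Courdx's.

CONFIDENCE_THRESHOLD = 0.7
MAX_RETRIES = 2

def corrective_retrieve(query, retrieve, score, refine, web_search):
    for _ in range(MAX_RETRIES + 1):
        docs = retrieve(query)
        scored = sorted(((score(query, d), d) for d in docs), reverse=True)
        if scored and scored[0][0] >= CONFIDENCE_THRESHOLD:
            return [d for _, d in scored]
        query = refine(query)        # low confidence: rewrite and retry
    return web_search(query)         # internal sources insufficient

# Toy stubs to demonstrate the control flow:
attempts = []
def retrieve(q): attempts.append(q); return ["doc"]
def score(q, d): return 0.9 if "revenue" in q else 0.2
def refine(q): return q + " revenue"
def web_search(q): return ["web result"]

print(corrective_retrieve("Q3 numbers", retrieve, score, refine, web_search))
```

In the toy run, the first pass scores 0.2, the refined query scores 0.9, and the loop returns after the second attempt without touching the web fallback.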
05

Cited, Verified Response

The final answer is synthesized from validated sources with mandatory citations.

  • Every fact cited to the exact sentence in the source document
  • Confidence score per citation — not just per answer
  • Output guardrails check for hallucinations, PII leaks, and toxic content
  • Full audit trail: who asked, what was retrieved, what was generated
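Sentence-level citation can be approximated with token overlap. Real systems score the match with embeddings or an entailment model; this stdlib sketch, with invented example text, only shows the shape of the mechanism: map each generated claim to the single best source sentence and report the match strength as a per-citation confidence.

```python
import re

# Toy sentence-level citation matcher. Overlap is a crude proxy for
# the embedding/NLI scoring a production system would use.

def cite(claim, source_text):
    sentences = re.split(r"(?<=[.!?])\s+", source_text)
    claim_toks = set(claim.lower().split())
    best, best_conf = None, 0.0
    for s in sentences:
        toks = set(s.lower().split())
        conf = len(claim_toks & toks) / len(claim_toks)
        if conf > best_conf:
            best, best_conf = s, conf
    return best, round(best_conf, 2)

source = ("Revenue reached $2.4M in Q3. "
          "Headcount grew by 12. "
          "Margins held steady.")
print(cite("Q3 revenue was $2.4M", source))
```

The key property is granularity: the citation points at one sentence with its own confidence, not at a whole document.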
Retrieval accuracy, each measured against the enterprise target:

  • Keyword Only: 45%
  • Vector Search: 68%
  • Graph + Vector: 78%
  • Courdx Multi-Strategy: 95%
Based on internal benchmarks on enterprise document retrieval tasks

Measurable Results

Up to 95% Accuracy. Measured.

Each retrieval strategy catches what the others miss. Combined with cross-encoder reranking and Corrective RAG, the multi-strategy approach closes the gaps that make basic AI search unreliable.

Up to 95%
Retrieval Accuracy
+40%
vs. Basic AI Search
3-stage
Retrieval Pipeline
100%
Cited Answers

Verifiable Answers

Every Fact. Every Source. Verified.

Click any citation to see the exact source sentence highlighted in context. No more "the AI said so."

Courdx Response:

"The company achieved $2.4M in Q3 revenue [1], representing 23% YoY growth [2]. The main drivers were the new enterprise contracts signed in July and August [3]."

Source Documents:

  • [1] Q3 Financial Report, p. 4: "$2.4M in Q3 revenue" (98% match)
  • [2] Q3 Financial Report, p. 4: "23% YoY growth" (95% match)
  • [3] Sales Pipeline Update, Slide 12: "enterprise contracts signed in July and August" (92% match)

Every fact in Courdx responses is traced back to the exact sentence in your source documents

Observable Pipeline

Watch It Work. In Real Time.

Courdx isn't a black box. The admin panel includes 15 dedicated health pages showing real-time system metrics, retrieval quality, and cost analytics. You see exactly what the system is doing — and where to tune it.

Real-time CPU, memory, and queue monitoring
Per-query latency and retrieval quality tracking
Failed document detection with automatic retry
Vespa cluster health and 20,000+ chunk indexing status
LLM token cost analytics per model and per query
[System Monitoring Dashboard: real-time CPU usage, memory consumption, API performance metrics, and error logs]

Research-Backed

Built on Peer-Reviewed Research

Not prompt engineering. Real retrieval science, implemented in production.

01

GraphRAG Community Detection

Clusters related entities for big-picture questions

02

Corrective RAG

Self-healing retrieval that refines when confidence drops

03

Cross-Encoder Reranking

Deep relevance scoring, not just cosine similarity

04

Reciprocal Rank Fusion

Merges multi-strategy results into one optimal ranking

05

HyDE Predictive Embeddings

Generates ideal answer shape, then finds matching docs

06

Query Decomposition

Breaks complex questions into precisely answerable parts

07

LangGraph Orchestration

15+ agentic workflows coordinating retrieval and validation

08

RAGAS Evaluation

Automated quality measurement: faithfulness, relevance, accuracy

See the Pipeline In Action

Bring your hardest question and your real documents. We'll trace the entire retrieval pipeline live.