RAG-Chat
RAG-Chat is a retrieval-augmented generation application built around asynchronous document ingestion, OpenAI embeddings, Qdrant vector search, BullMQ workers, and citation-backed responses.
RAG-Chat was built to exercise the full document-to-answer pipeline of a retrieval-augmented system rather than stopping at a basic chat UI. Users upload files, which are processed in the background and embedded into a vector index; they can then ask questions that return grounded answers tied back to source chunks.
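The "grounded answers" step boils down to nearest-neighbor search over embeddings. In RAG-Chat that search is Qdrant's job; the sketch below is only an illustration of the underlying idea, using a hypothetical in-memory cosine-similarity top-k over stored chunk vectors (the `Point` type and function names are assumptions, not the project's API).

```typescript
// Illustrative retrieval step: rank stored chunk embeddings by cosine
// similarity to the query embedding and keep the top k. In the real app
// this happens inside Qdrant; the math is the same idea.
type Point = { id: string; vector: number[]; text: string };

function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(query: number[], points: Point[], k = 3): Point[] {
  return [...points]
    .sort((x, y) => cosine(query, y.vector) - cosine(query, x.vector))
    .slice(0, k);
}
```

The retrieved points carry their source text and IDs, which is what lets the chat route attach citations to the generated answer.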
The architecture separates ingestion from serving: upload routes enqueue work into BullMQ, background workers extract and chunk text with format-specific parsers, embeddings are stored in Qdrant, and chat routes retrieve relevant context before generating responses. The result is a more realistic document workflow with processing status, semantic search, and citation-backed answers.
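The chunking step the workers perform can be sketched as a sliding window with overlap, so that a sentence split at a chunk boundary still appears whole in the neighboring chunk. This is a minimal sketch under assumed parameters (fixed character counts, a hypothetical `chunkText` helper); the project's actual parsers are format-specific.

```typescript
// Hypothetical chunker: fixed-size character windows with overlap.
// Overlapping windows keep boundary sentences intact in at least one chunk,
// which improves retrieval recall at the edges.
function chunkText(text: string, size = 500, overlap = 50): string[] {
  if (size <= overlap) throw new Error("size must exceed overlap");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

Each resulting chunk would then be embedded and upserted into Qdrant along with its source document ID, so answers can cite the exact chunk they came from.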