Data Science · One build, frame by frame
FreeRAG
Open RAG, no API tax.
A retrieval-augmented generation system built to run without paid inference APIs — local embeddings, vector retrieval, and grounded responses over your own documents.
01 — The problem
RAG demos almost always assume a paid LLM/embedding API, which makes them expensive to run and impossible to self-host privately.
02 — What I built
Wired local Hugging Face embeddings into a vector store with a retrieval + grounding pipeline behind a FastAPI service, keeping every component swappable and offline-capable.
03 — What changed
A fully self-hostable RAG stack that answers over private documents with citations and zero per-token cost.
See it for yourself