Beta
railtracks.retrieval is in beta. Please expect API changes between minor releases.
Migrated from older modules
railtracks.rag and railtracks.vector_stores are removed. Everything
now lives under railtracks.retrieval.
This module is in beta, and we are actively polishing its pieces.
Quickstart
railtracks.retrieval is the module for everything that turns raw sources
into a searchable index and queries it back: ingestion (loading,
chunking, embedding) and vector search, both handled by the same
runtime.
RetrievalRuntime pipelines the four stages:
flowchart LR
Sources["Sources<br/>(files, URLs, dirs)"]
Loader{{"1. Loader"}}
Chunker{{"2. Chunker"}}
Embedder{{"3. Embedder"}}
Store[("4. Store")]
Query(["Query"])
Results>"RetrievalResult"]
Sources --> Loader
Loader --> |Documents| Chunker
Chunker --> |Chunks| Embedder
Embedder --> |EmbeddedChunks| Store
Query --> Embedder
Store --> Results
classDef source fill:#60A5FA,fill-opacity:0.3
classDef process fill:#FBBF24,fill-opacity:0.3
classDef store fill:#34D399,fill-opacity:0.3
classDef output fill:#FECACA,fill-opacity:0.3
class Sources,Query source;
class Loader,Chunker,Embedder process;
class Store store;
class Results output;
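To make the flow in the diagram concrete, here is a minimal, library-free sketch of the same four stages. It does not use railtracks at all: the function names, the word-window chunking, and the bag-of-words "embedding" are all assumptions chosen so the example runs without a model or an API key.

```python
import math
from collections import Counter


def load(sources):
    """Stage 1: turn raw sources into documents (here, strings already in memory)."""
    return [{"id": name, "text": text} for name, text in sources.items()]


def chunk(documents, size=8):
    """Stage 2: split each document into windows of `size` words."""
    chunks = []
    for doc in documents:
        words = doc["text"].split()
        for start in range(0, len(words), size):
            content = " ".join(words[start:start + size])
            chunks.append({"document_id": doc["id"], "content": content})
    return chunks


def embed(text):
    """Stage 3: toy bag-of-words vector (a real embedder would call a model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(sources, store):
    """Stage 4: write (vector, chunk) pairs into the store (a plain list here)."""
    for piece in chunk(load(sources)):
        store.append((embed(piece["content"]), piece))


def retrieve(query, store, top_k=2):
    """The query goes through the same embedder, then the store ranks by similarity."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, vec), piece) for vec, piece in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


store = []
ingest(
    {
        "a.txt": "Tracing and metrics are configured in the observability settings file",
        "b.txt": "The chunker splits documents into overlapping windows of sentences",
    },
    store,
)
hits = retrieve("how is observability configured", store, top_k=1)
print(hits[0][1]["document_id"], round(hits[0][0], 3))
```

The point of the sketch is the shape, not the scoring: the query reuses the ingestion embedder, which is why the diagram routes `Query` through `Embedder` before reaching the store.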
Minimal pipeline
import asyncio

from railtracks.retrieval import RetrievalRuntime
from railtracks.retrieval.chunking import SentenceChunker
from railtracks.retrieval.embedding import OpenAIEmbedding
from railtracks.retrieval.loaders import TextLoader
from railtracks.retrieval.stores import InMemoryVectorBackend, VectorStore


async def main():
    runtime = RetrievalRuntime(
        chunker=SentenceChunker(chunk_size=7, overlap=2),
        embedder=OpenAIEmbedding(model="text-embedding-3-small"),
        store=VectorStore(InMemoryVectorBackend()),
        batch_size=16,  # smaller batch size for faster embedding in this example
    )

    stats = await runtime.ingest_all(loader=TextLoader("./docs"))
    print(f"ingested {stats.documents_loaded} docs / {stats.chunks_embedded} chunks")

    result = await runtime.retrieve("how do I configure observability?", top_k=5)
    for hit in result.chunks:
        print(f"  [{hit.score:.3f}] {hit.chunk.content}")


asyncio.run(main())
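The example configures its chunker with a `chunk_size` and an `overlap`. As a rough illustration of what overlapping sentence windows look like, the sketch below assumes both values count sentences; that is an assumption, and the actual `SentenceChunker` may measure them differently. `split_sentences` and `sentence_windows` are hypothetical helpers, not railtracks functions.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sentence_windows(text, chunk_size=3, overlap=1):
    """Group sentences into windows of `chunk_size` that share `overlap` sentences.

    Assumption: both parameters count sentences, not tokens or characters.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    sentences = split_sentences(text)
    step = chunk_size - overlap
    windows = []
    for start in range(0, len(sentences), step):
        windows.append(" ".join(sentences[start:start + chunk_size]))
        if start + chunk_size >= len(sentences):
            break  # this window already reaches the end of the text
    return windows


chunks = sentence_windows("One. Two. Three. Four. Five.", chunk_size=3, overlap=1)
for piece in chunks:
    print(piece)
```

Each window starts `chunk_size - overlap` sentences after the previous one, so the last sentence of one chunk reappears at the start of the next and context is not cut at a hard boundary.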
Three options decide the shape of a runtime: chunker, embedder, and
store.
The loader is not fixed on the runtime. It is passed as a parameter to .ingest(...) or .ingest_all(...), so the same runtime can read from file systems that hold different file types.
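The split between construction-time components and a per-call loader can be sketched with a toy class. `ToyRuntime`, `notes_loader`, `csv_loader`, `pair_chunker`, and `toy_embedder` are all hypothetical names for illustration; none of them is part of railtracks, and the real runtime is asynchronous where this sketch is not.

```python
import csv
import io


class ToyRuntime:
    """Hypothetical stand-in: chunker, embedder and store are fixed at construction."""

    def __init__(self, chunker, embedder, store):
        self.chunker = chunker
        self.embedder = embedder
        self.store = store

    def ingest(self, loader):
        """The loader arrives per call, so one runtime can read many source types."""
        count = 0
        for document in loader():
            for piece in self.chunker(document):
                self.store.append((self.embedder(piece), piece))
                count += 1
        return count


def notes_loader():
    """Pretend plain-text source."""
    yield "alpha beta gamma delta"


def csv_loader():
    """Pretend CSV source: one document per row, taken from the 'text' column."""
    reader = csv.DictReader(io.StringIO("id,text\n1,epsilon zeta\n2,eta theta iota\n"))
    for row in reader:
        yield row["text"]


def pair_chunker(document):
    """Toy chunker: two words per chunk."""
    words = document.split()
    return [" ".join(words[i:i + 2]) for i in range(0, len(words), 2)]


def toy_embedder(text):
    """Toy one-dimensional 'embedding' so the example needs no model."""
    return [float(len(text))]


runtime = ToyRuntime(pair_chunker, toy_embedder, store=[])
n_notes = runtime.ingest(notes_loader)
n_csv = runtime.ingest(csv_loader)
print(n_notes, n_csv, len(runtime.store))
```

Both sources end up in the same store, chunked and embedded the same way, which is the practical benefit of keeping the loader out of the runtime's fixed configuration.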
Where to go next
| You want to… | Read |
|---|---|
| Get documents into the store (streaming events, re-ingest, multi-tenant writes, sanitization, token guards) | Ingestion |
| Run vector search (top-k, metadata filters, per-call scope) or attach a runtime to an agent | Retrieval |
| Understand the internals and async model, or customize components | Components → Design |
Key types
The following data models flow through the pipeline. Each links to the page that owns its full description.
| Type | What it is |
|---|---|
| Document | One unit of source content produced by a loader. |
| Chunk | A slice of a Document produced by a chunker, carrying document_id and metadata. |
| EmbeddedChunk | A chunk plus its embedding vector and model name. |
| StoreEntry | The atomic unit a store reads and writes. |
| RetrievalResult | What runtime.retrieve() returns: ranked RetrievedChunks plus the query. |
| StoreScope | A hard-filter namespace: a label dict ({"user_id": "alice"}, {"organization": "acme"}, etc.) enforced as equality filters on every read and write. |
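The idea of a scope enforced as equality filters on every read and write can be illustrated with a small in-memory store. This is a sketch under assumptions, not the railtracks implementation: `ScopedStore` and its `write` and `read` methods are hypothetical, and the choice to let scope labels override caller metadata is one possible design.

```python
class ScopedStore:
    """Toy store: every write is stamped with the scope, every read is filtered by it."""

    def __init__(self):
        self._entries = []

    def write(self, content, scope, metadata=None):
        # Scope labels are merged last, so caller metadata cannot override them.
        labels = {**(metadata or {}), **scope}
        self._entries.append({"content": content, "labels": labels})

    def read(self, scope):
        # Equality filter: an entry is visible only if it matches every scope label.
        return [
            entry["content"]
            for entry in self._entries
            if all(entry["labels"].get(key) == value for key, value in scope.items())
        ]


store = ScopedStore()
store.write("alice's notes", scope={"user_id": "alice"})
# Metadata that tries to claim another user's label is overridden by the scope.
store.write("bob's notes", scope={"user_id": "bob"}, metadata={"user_id": "alice"})

alice_view = store.read({"user_id": "alice"})
bob_view = store.read({"user_id": "bob"})
print(alice_view, bob_view)
```

Because the filter is applied inside the store rather than by the caller, a query issued under one scope cannot return entries written under another, whatever similarity score they would have received.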
Stage choices
Pick the right component for each stage. Each link goes to the page that covers the trade-offs.
| Stage | Built-in options | Covered in |
|---|---|---|
| Load | TextLoader, CSVLoader, PyPDFLoader, PyPDFOCRLoader, HuggingFaceDatasetLoader, JSONLoader, LangChainLoaderAdapter | Ingestion overview |
| Chunk | RecursiveCharacterChunker, MarkdownHeaderChunker, SentenceChunker, FixedTokenChunker | Chunking methods |
| Embed | OpenAIEmbedding, AzureEmbedding, OllamaEmbedding, LiteLLMEmbedding | Embeddings methods |
| Store | VectorStore with InMemoryVectorBackend, ChromaBackend, or PgvectorBackend | Store backends |