Embeddings: Built-in methods

Four embedders ship with Railtracks. The recommendations below are opinionated: they reflect what works in production today rather than an exhaustive survey of options.

Summary

| Class | Provider | `default_batch_size` |
| --- | --- | --- |
| `OpenAIEmbedding` | OpenAI API | `None` (set per call) |
| `AzureEmbedding` | Azure OpenAI | `None` (set per call) |
| `OllamaEmbedding` | Local Ollama | `1` |
| `LiteLLMEmbedding` | Any LiteLLM provider | `None` (set per call) |

All four inherit from `Embedding` (see the overview for the base class API) and are re-exported from `railtracks.retrieval.embedding`.

Defaults you should know:

- For production on OpenAI: `OpenAIEmbedding("text-embedding-3-small")` is the right starting point.
- For local development with no API costs: `OllamaEmbedding()` with its default `nomic-embed-text` model.
- For anything else: `LiteLLMEmbedding(model="provider/model-name")`.

`OpenAIEmbedding`

OpenAIEmbedding(
    model: str = "text-embedding-3-small",
    *,
    api_key: str | None = None,
    dimensions: int | None = None,
)

| Parameter | Description |
| --- | --- |
| `model` | Embedding model name. Defaults to `text-embedding-3-small`. |
| `api_key` | OpenAI API key. Falls back to `OPENAI_API_KEY`. |
| `dimensions` | Truncate vectors to this size. Only supported by `text-embedding-3-*` models. |

from railtracks.retrieval.embedding import OpenAIEmbedding

# Default: small model, full dimensionality
embedder = OpenAIEmbedding()

# Large model with truncated vectors (smaller storage, slight quality cost)
embedder = OpenAIEmbedding(model="text-embedding-3-large", dimensions=256)

When to use: production workloads on OpenAI. Start with `text-embedding-3-small`; switch to `text-embedding-3-large` when retrieval quality plateaus. Truncating the large model's vectors (`dimensions=256` or `dimensions=512`) typically keeps most of the quality at a fraction of the storage cost, so measure retrieval quality at the smaller sizes before you commit to the full 3072 dimensions.
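To make the storage trade-off concrete, here is a back-of-envelope calculation. It is a rough sketch that assumes vectors are stored as float32 (4 bytes per component) with no index or metadata overhead. The corpus size and the helper names `vector_storage_bytes` and `format_gb` are hypothetical.

```python
# Back-of-envelope storage for raw embedding vectors.
# Assumption: float32 components (4 bytes each), no index or metadata overhead.
BYTES_PER_FLOAT32 = 4


def vector_storage_bytes(num_vectors: int, dimensions: int) -> int:
    """Raw bytes needed for `num_vectors` vectors of `dimensions` float32 values."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


def format_gb(num_bytes: int) -> str:
    """Render a byte count in decimal gigabytes (1 GB = 10**9 bytes)."""
    return f"{num_bytes / 1e9:.2f} GB"


NUM_CHUNKS = 1_000_000  # hypothetical corpus size

for dims in (256, 512, 3072):
    size = vector_storage_bytes(NUM_CHUNKS, dims)
    print(f"{dims:>4} dims: {format_gb(size)}")
```

Under these assumptions, one million chunks need roughly 1.02 GB at 256 dimensions, 2.05 GB at 512, and 12.29 GB at 3072. Real vector stores add their own overhead on top of this.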

`AzureEmbedding`

AzureEmbedding(
    deployment: str,
    *,
    api_base: str,
    api_version: str,
    api_key: str | None = None,
)

| Parameter | Description |
| --- | --- |
| `deployment` | Azure deployment name (as configured in your Azure OpenAI resource). |
| `api_base` | Azure OpenAI endpoint (e.g. `https://my-resource.openai.azure.com`). |
| `api_version` | Azure API version (e.g. `"2024-02-01"`). |
| `api_key` | Azure API key. Falls back to `AZURE_API_KEY`. |

from railtracks.retrieval.embedding import AzureEmbedding

embedder = AzureEmbedding(
    deployment="my-embedding-deployment",
    api_base="https://my-resource.openai.azure.com",
    api_version="2024-02-01",
)

When to use: when your organisation routes OpenAI calls through Azure for compliance, networking, or billing reasons. Behaves identically to OpenAIEmbedding at the model level; pick based on infra constraints, not retrieval quality.
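Because `api_base` and `api_version` are required, it can be convenient to collect them from configuration in one place and fail early when something is missing. The helper below is a minimal sketch, not part of Railtracks. The function name and the environment variable names `AZURE_API_BASE` and `AZURE_API_VERSION` are assumptions chosen for illustration. Only `AZURE_API_KEY` is documented above as a fallback.

```python
import os
from typing import Mapping


def azure_embedding_kwargs(
    deployment: str,
    env: Mapping[str, str] = os.environ,
) -> dict[str, str]:
    """Build keyword arguments for AzureEmbedding from environment-style config.

    Hypothetical helper: AZURE_API_BASE and AZURE_API_VERSION are assumed
    variable names, not ones documented by Railtracks.
    """
    required = ("AZURE_API_BASE", "AZURE_API_VERSION")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"missing Azure settings: {', '.join(missing)}")
    return {
        "deployment": deployment,
        # Drop a trailing slash so the endpoint has one canonical form.
        "api_base": env["AZURE_API_BASE"].rstrip("/"),
        "api_version": env["AZURE_API_VERSION"],
    }


# Example with an explicit mapping instead of the real environment.
example_env = {
    "AZURE_API_BASE": "https://my-resource.openai.azure.com/",
    "AZURE_API_VERSION": "2024-02-01",
}
kwargs = azure_embedding_kwargs("my-embedding-deployment", example_env)
print(kwargs)
```

The resulting dictionary can then be passed as `AzureEmbedding(**kwargs)`. The API key is left out on purpose so that the documented `AZURE_API_KEY` fallback applies.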

`OllamaEmbedding`

OllamaEmbedding(
    model: str = "nomic-embed-text",
    *,
    api_base: str = "http://localhost:11434",
)

| Parameter | Description |
| --- | --- |
| `model` | Ollama model name. Defaults to `nomic-embed-text`. |
| `api_base` | Ollama server URL. Defaults to `http://localhost:11434`. |

`OllamaEmbedding.default_batch_size` is `1` because Ollama processes requests sequentially, so `astream_batches` issues one API call per chunk. That's fine for local dev; don't use Ollama for bulk re-indexing unless you're prepared for the wall-clock hit.
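To put the batch size of 1 in perspective, the sketch below counts how many requests a corpus needs at different batch sizes. It is a generic illustration of batching, not the implementation of `astream_batches`. The helper names, the corpus size, and the comparison batch size of 64 are hypothetical.

```python
import math
from typing import Iterator, Sequence


def batches(texts: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `texts`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def request_count(num_chunks: int, batch_size: int) -> int:
    """Number of embedding calls needed if each batch is one request."""
    return math.ceil(num_chunks / batch_size)


chunks = [f"chunk {i}" for i in range(1000)]  # hypothetical corpus

print(request_count(len(chunks), 1))   # batch size 1: one call per chunk
print(request_count(len(chunks), 64))  # a larger, hypothetical batch size
```

With 1000 chunks, a batch size of 1 means 1000 sequential calls, while a batch size of 64 would need 16. Per-call latency therefore dominates the total time when re-indexing through Ollama.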

from railtracks.retrieval.embedding import OllamaEmbedding

# Local server, default model
embedder = OllamaEmbedding()

# Different model or remote Ollama instance
embedder = OllamaEmbedding(model="mxbai-embed-large", api_base="http://gpu-box:11434")

When to use: local development without API costs, air-gapped environments, or one-off experiments. Pull the model first: `ollama pull nomic-embed-text`.

`LiteLLMEmbedding`

The generic base class that the three providers above are built on. Use it directly to reach any provider that LiteLLM supports but that has no dedicated Railtracks class (Cohere, Voyage, Mistral, Vertex AI, Bedrock, …).

LiteLLMEmbedding(
    model: str,
    api_key: str | None = None,
    api_base: str | None = None,
    api_version: str | None = None,
    **litellm_kwargs,
)

| Parameter | Description |
| --- | --- |
| `model` | LiteLLM model string including provider prefix (e.g. `"cohere/embed-english-v3.0"`). |
| `api_key` | Provider API key. Falls back to the provider's env var. |
| `api_base` | Override the default base URL. |
| `api_version` | API version (required for some providers). |
| `**litellm_kwargs` | Any additional keyword arguments forwarded to `litellm.aembedding`. |

from railtracks.retrieval.embedding import LiteLLMEmbedding

embedder = LiteLLMEmbedding(
    model="cohere/embed-english-v3.0",
    api_key="...",
)
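Since the `model` argument is expected to carry a provider prefix, a small check at configuration time can catch a bare model name before any request is made. The function below is a hypothetical helper for illustration, not part of Railtracks or LiteLLM. It splits on the first `/` only, on the assumption that the model part may itself contain slashes.

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split "provider/model-name" into (provider, model_name).

    Hypothetical validation helper: raises ValueError when the provider
    prefix or the model name is missing.
    """
    provider, separator, name = model.partition("/")
    if not separator or not provider or not name:
        raise ValueError(
            f"expected 'provider/model-name', got {model!r}"
        )
    return provider, name


provider, name = split_model_string("cohere/embed-english-v3.0")
print(provider, name)
```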

Custom providers

To add a provider not covered above, subclass `Embedding` and implement `aembed`:

from railtracks.retrieval.embedding import Embedding, EmbeddingMetrics, TextEmbeddings


class MyEmbedding(Embedding):
    default_batch_size = 64

    async def aembed(self, texts: list[str]) -> TextEmbeddings:
        # `my_async_client` is a placeholder for your provider's async SDK client.
        vectors = await my_async_client.encode(texts)
        return TextEmbeddings(
            vectors=vectors,
            metrics=EmbeddingMetrics(vector_count=len(vectors)),
        )
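The subclassing pattern above can be tried out without a real provider. The following is a toy sketch meant for experiments and unit tests. `Embedding`, `TextEmbeddings`, and `EmbeddingMetrics` are redefined here as minimal stand-ins so the example runs without Railtracks installed, and they model only the fields used above. `HashEmbedding` is a hypothetical name, and its vectors carry no semantic meaning.

```python
import asyncio
import hashlib
from dataclasses import dataclass, field
from typing import Optional


# Stand-ins for the Railtracks types (assumption: only the fields used in
# the example above are modelled; the real classes may differ).
@dataclass
class EmbeddingMetrics:
    vector_count: int = 0


@dataclass
class TextEmbeddings:
    vectors: list
    metrics: EmbeddingMetrics = field(default_factory=EmbeddingMetrics)


class Embedding:
    default_batch_size: Optional[int] = None

    async def aembed(self, texts: list) -> TextEmbeddings:
        raise NotImplementedError


class HashEmbedding(Embedding):
    """Toy embedder: deterministic pseudo-vectors from a SHA-256 digest."""

    default_batch_size = 64

    def __init__(self, dimensions: int = 8) -> None:
        # A SHA-256 digest has 32 bytes, which caps the vector length.
        if not 1 <= dimensions <= 32:
            raise ValueError("dimensions must be between 1 and 32")
        self.dimensions = dimensions

    def _vector(self, text: str) -> list:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Scale each byte into [0, 1]; identical texts give identical vectors.
        return [byte / 255 for byte in digest[: self.dimensions]]

    async def aembed(self, texts: list) -> TextEmbeddings:
        vectors = [self._vector(text) for text in texts]
        return TextEmbeddings(
            vectors=vectors,
            metrics=EmbeddingMetrics(vector_count=len(vectors)),
        )


result = asyncio.run(HashEmbedding().aembed(["hello", "world", "hello"]))
print(result.metrics.vector_count, len(result.vectors[0]))
```

A deterministic embedder like this is handy for testing pipeline plumbing, since the same text always maps to the same vector and no network access is needed.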

If your provider only has a blocking API, subclass `SyncEmbedding` instead and implement `_embed_sync`. The mixin runs it in a thread pool so the rest of the pipeline stays non-blocking:

from railtracks.retrieval.embedding import SyncEmbedding, TextEmbeddings


class MyBlockingEmbedding(SyncEmbedding):
    default_batch_size = 32

    def _embed_sync(self, texts: list[str]) -> TextEmbeddings:
        # `my_blocking_client` is a placeholder for your provider's synchronous SDK client.
        vectors = my_blocking_client.encode(texts)
        return TextEmbeddings(vectors=vectors)
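For readers curious how a blocking call can be kept off the event loop, the sketch below shows the general pattern using `asyncio.to_thread` from the standard library. It illustrates the idea only and is not the Railtracks implementation of `SyncEmbedding`. The class names are hypothetical, and plain lists stand in for `TextEmbeddings`.

```python
import asyncio
import time


class SyncEmbeddingSketch:
    """Illustrative stand-in for the pattern, not the Railtracks class."""

    def _embed_sync(self, texts: list) -> list:
        raise NotImplementedError

    async def aembed(self, texts: list) -> list:
        # Run the blocking method in a worker thread so the event loop
        # stays free to schedule other coroutines in the meantime.
        return await asyncio.to_thread(self._embed_sync, texts)


class SlowLengthEmbedding(SyncEmbeddingSketch):
    """Toy blocking embedder: one-dimensional vectors holding text length."""

    def _embed_sync(self, texts: list) -> list:
        time.sleep(0.05)  # simulate a blocking network or model call
        return [[float(len(text))] for text in texts]


async def main() -> list:
    embedder = SlowLengthEmbedding()
    # Both calls are awaited together; each blocking call runs in its own thread.
    return await asyncio.gather(
        embedder.aembed(["a", "bb"]),
        embedder.aembed(["ccc"]),
    )


first, second = asyncio.run(main())
print(first, second)
```

The subclass only ever writes synchronous code, while callers still see an awaitable `aembed`.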
