Custom Providers
Octipus supports two custom-provider flavors for connecting to LLM endpoints that aren’t backed by a first-party provider class. Pick the one that matches the upstream wire format:
| Flavor | provider value | Wire format | Use for |
|---|---|---|---|
| Custom OpenAI-compatible | custom-openai | OpenAI /v1/chat/completions | vLLM, Together, Groq, Fireworks, DeepInfra, internal OpenAI-shaped proxies |
| Custom Gemini-compatible | custom-gemini | Native Google Gemini (candidates[].content.parts[]) | Vertex AI, Google AI Studio (native), Gemini-fronting proxies |
Both are stateless — configuration lives entirely on the model_config row and is loaded per call. Add as many models against the same upstream as needed; each row points at its own endpoint and key.
Configuration
A custom-provider model row uses the existing model_config columns plus a metadata.customProvider block:

```ts
{
  name: 'tpg-flash',                      // user-facing name (unique)
  modelId: 'gemini-3-flash-preview',      // model id sent upstream
  provider: 'custom-gemini',              // routes to the custom provider
  endpoint: 'https://api.example.com',    // base URL (no trailing slash)
  apiKeyRef: 'tpg_api_key',               // vault entry name (or 'env:VAR_NAME')
  metadata: {
    customProvider: {
      auth: { type: 'bearer' },           // 'bearer' | 'header' | 'query'
      requestEnvelope: 'gemini-blocks-config',
      // pathOverride: '/generate',       // optional, defaults per flavor
      // extraHeaders: { 'X-Org': 'foo' }, // optional
    },
  },
}
```

Auth schemes
| auth.type | Required fields | Wire effect |
|---|---|---|
| bearer | (none) | Authorization: Bearer <key> |
| header | headerName (e.g. x-api-key) | <headerName>: <key> |
| query | paramName (e.g. key) | ?<paramName>=<key> |
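
The three schemes in the table can be sketched as one function that decorates an outgoing request. This is a minimal illustration, not the Octipus implementation: the `AuthConfig` type, the `PreparedRequest` shape, and the `applyAuth` name are hypothetical, and only the `type`, `headerName`, and `paramName` fields come from the table.

```typescript
// Hypothetical types mirroring the auth.type values documented above.
type AuthConfig =
  | { type: 'bearer' }
  | { type: 'header'; headerName: string }
  | { type: 'query'; paramName: string };

// Hypothetical request shape used only for this sketch.
interface PreparedRequest {
  url: string;
  headers: Record<string, string>;
}

// Attach the API key to a request according to the configured scheme.
function applyAuth(url: string, auth: AuthConfig, apiKey: string): PreparedRequest {
  const headers: Record<string, string> = {};
  switch (auth.type) {
    case 'bearer':
      headers['Authorization'] = `Bearer ${apiKey}`;
      return { url, headers };
    case 'header':
      headers[auth.headerName] = apiKey;
      return { url, headers };
    case 'query': {
      // Use '&' when the URL already carries a query string.
      const separator = url.includes('?') ? '&' : '?';
      const param = `${encodeURIComponent(auth.paramName)}=${encodeURIComponent(apiKey)}`;
      return { url: `${url}${separator}${param}`, headers };
    }
  }
}
```

With `{ type: 'header', headerName: 'x-api-key' }` the key travels only in the `x-api-key` header and the URL is left untouched. With `query`, the key becomes part of the URL, so it may show up in access logs of any intermediate proxy.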
API key resolution
apiKeyRef is resolved in this order:
- env:VAR_NAME prefix → reads process.env.VAR_NAME directly
- Vault lookup by name (getVault().getByName('system', apiKeyRef))
- Fallback env var: CUSTOM_OPENAI_API_KEY or CUSTOM_GEMINI_API_KEY
Use env: for local development and the vault for shared or production deployments.
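
The resolution order above can be sketched as follows. Several things here are assumptions made for illustration: the `VaultLike` interface (only the `getByName('system', apiKeyRef)` call appears in the docs, and it is modeled as synchronous for brevity), the injected `env` map standing in for `process.env`, and the choice that an unset `env:` variable does not fall through to the vault.

```typescript
// Hypothetical minimal vault surface; the real vault API may be async.
interface VaultLike {
  getByName(scope: string, name: string): string | undefined;
}

type Flavor = 'custom-openai' | 'custom-gemini';
type Env = Record<string, string | undefined>;

function resolveApiKey(
  apiKeyRef: string,
  flavor: Flavor,
  env: Env,
  vault: VaultLike,
): string | undefined {
  // 1. 'env:VAR_NAME' reads the named variable directly.
  //    Assumption: no fall-through when the variable is unset.
  if (apiKeyRef.startsWith('env:')) {
    return env[apiKeyRef.slice('env:'.length)];
  }

  // 2. Vault lookup by name in the 'system' scope.
  const fromVault = vault.getByName('system', apiKeyRef);
  if (fromVault) {
    return fromVault;
  }

  // 3. Flavor-specific fallback environment variable.
  return flavor === 'custom-openai'
    ? env['CUSTOM_OPENAI_API_KEY']
    : env['CUSTOM_GEMINI_API_KEY'];
}
```

Passing `env` and `vault` as arguments keeps the sketch testable without touching real secrets. In application code the `env` argument would typically be `process.env`.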
Request envelopes (Gemini-compat only)
The requestEnvelope field controls how the request body is shaped.
standard (default)
Native Google Gemini wire format. Path defaults to:
- POST {endpoint}/v1beta/models/{modelId}:generateContent
- POST {endpoint}/v1beta/models/{modelId}:streamGenerateContent (streaming)

Body:

```json
{
  "contents": [{ "role": "user", "parts": [{ "text": "..." }] }],
  "systemInstruction": { "parts": [{ "text": "..." }] },
  "generationConfig": { "temperature": 0.7, "maxOutputTokens": 4096 },
  "tools": [{ "functionDeclarations": [] }]
}
```

Use for: the real Google Gemini API and Vertex AI generative endpoints.
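
As a rough sketch of the standard envelope, the function below assembles the default path and body from a list of chat turns. The `ChatTurn` type and the `buildStandardRequest` name are hypothetical, and the generation settings are hard-coded to the values in the example body rather than read from any real Octipus option.

```typescript
// Hypothetical input shape; Gemini's native roles are 'user' and 'model'.
interface ChatTurn {
  role: 'user' | 'model';
  text: string;
}

interface StandardRequest {
  url: string;
  body: {
    contents: { role: string; parts: { text: string }[] }[];
    systemInstruction: { parts: { text: string }[] };
    generationConfig: { temperature: number; maxOutputTokens: number };
  };
}

function buildStandardRequest(
  endpoint: string,
  modelId: string,
  systemPrompt: string,
  turns: ChatTurn[],
  stream: boolean,
): StandardRequest {
  // The method suffix is the only part of the path that changes when streaming.
  const method = stream ? 'streamGenerateContent' : 'generateContent';
  return {
    url: `${endpoint}/v1beta/models/${modelId}:${method}`,
    body: {
      contents: turns.map((turn) => ({ role: turn.role, parts: [{ text: turn.text }] })),
      systemInstruction: { parts: [{ text: systemPrompt }] },
      generationConfig: { temperature: 0.7, maxOutputTokens: 4096 },
    },
  };
}
```

Because the endpoint is documented as having no trailing slash, the sketch concatenates it with the path directly instead of normalizing slashes.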
gemini-blocks-config
A bespoke envelope used by some Gemini-fronting proxies. Single path (default /generate), with Anthropic-style content blocks and a camelCase config:{} wrapper.

```json
{
  "mode": "text",
  "model": "gemini-3-flash-preview",
  "messages": [{ "role": "user", "content": [{ "type": "text", "text": "..." }] }],
  "stream": false,
  "config": {
    "temperature": 0.7,
    "maxTokens": 4096,
    "response_schema": { "type": "OBJECT", "properties": {} },
    "tools": [{ "name": "...", "description": "...", "parameters": {} }]
  }
}
```

The response shape is identical to standard (native Gemini candidates[]).
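
For comparison with the standard envelope, here is a minimal sketch of how the same kind of input might be shaped for gemini-blocks-config. The `SimpleTurn` type and the `buildBlocksConfigRequest` name are hypothetical; the field names follow the example body above, and the optional `response_schema` and `tools` entries are left out for brevity.

```typescript
// Hypothetical input shape for this sketch.
interface SimpleTurn {
  role: string;
  text: string;
}

interface BlocksConfigRequest {
  url: string;
  body: {
    mode: 'text';
    model: string;
    messages: { role: string; content: { type: 'text'; text: string }[] }[];
    stream: boolean;
    config: { temperature: number; maxTokens: number };
  };
}

function buildBlocksConfigRequest(
  endpoint: string,
  modelId: string,
  turns: SimpleTurn[],
  stream: boolean,
  pathOverride: string = '/generate', // documented default path for this envelope
): BlocksConfigRequest {
  return {
    // One path for both modes; streaming is selected by the body flag.
    url: `${endpoint}${pathOverride}`,
    body: {
      mode: 'text',
      model: modelId,
      // Each turn becomes a message holding a single text content block.
      messages: turns.map((turn) => ({
        role: turn.role,
        content: [{ type: 'text', text: turn.text }],
      })),
      stream,
      config: { temperature: 0.7, maxTokens: 4096 },
    },
  };
}
```

Unlike the standard envelope, the model id travels in the body rather than in the URL, which is why a single path can serve every model behind the proxy.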
Streaming
- OpenAI-compat: SSE via the OpenAI SDK. Standard delta.content / delta.tool_calls deltas.
- Gemini-compat: SSE with data: {gemini-chunk}\n\n events. Each event contains the full chunk shape; text deltas and tool-call deltas are yielded as they arrive.
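
To make the Gemini-compat event format concrete, the sketch below splits an SSE payload into events and extracts text and tool-call deltas from each chunk. It is an illustration under simplifying assumptions: the `StreamDelta` type and `parseGeminiSse` name are hypothetical, the input is a complete buffer rather than a live stream, and each event is assumed to hold a single `data:` line with `\n\n` separators.

```typescript
// Subset of the native Gemini chunk shape needed for this sketch.
interface GeminiPart {
  text?: string;
  functionCall?: { name: string; args: Record<string, unknown> };
}

interface GeminiChunk {
  candidates?: { content?: { parts?: GeminiPart[] } }[];
}

// Hypothetical normalized delta emitted to callers.
type StreamDelta =
  | { kind: 'text'; text: string }
  | { kind: 'tool_call'; name: string; args: Record<string, unknown> };

function parseGeminiSse(raw: string): StreamDelta[] {
  const deltas: StreamDelta[] = [];
  for (const event of raw.split('\n\n')) {
    const line = event.trim();
    if (!line.startsWith('data:')) {
      continue; // skip blank segments and non-data lines
    }
    const chunk = JSON.parse(line.slice('data:'.length)) as GeminiChunk;
    for (const candidate of chunk.candidates ?? []) {
      for (const part of candidate.content?.parts ?? []) {
        if (part.text !== undefined) {
          deltas.push({ kind: 'text', text: part.text });
        }
        if (part.functionCall) {
          deltas.push({
            kind: 'tool_call',
            name: part.functionCall.name,
            args: part.functionCall.args,
          });
        }
      }
    }
  }
  return deltas;
}
```

A streaming consumer would additionally need to buffer partial events across network reads before handing complete ones to a parser like this.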
Tool calling
Both flavors support function/tool calling via the standard Octipus tool schema (OpenAI-style). The Gemini-compat provider translates schemas on the way out (functionDeclarations) and parses functionCall parts on the way in.
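
The two directions of that translation might look roughly like the sketch below. It assumes the Octipus tool schema matches the OpenAI `{ type: 'function', function: { ... } }` shape, which the text suggests but does not spell out. The `toGeminiTools` and `fromFunctionCall` names are hypothetical, and the parameter schema is passed through unchanged, whereas a real translation may need to adjust it (for example, type-name casing).

```typescript
// Assumed OpenAI-style tool definition.
interface OpenAiStyleTool {
  type: 'function';
  function: {
    name: string;
    description?: string;
    parameters: Record<string, unknown>;
  };
}

interface GeminiToolGroup {
  functionDeclarations: {
    name: string;
    description?: string;
    parameters: Record<string, unknown>;
  }[];
}

// Outbound: wrap all tools in a single functionDeclarations group.
function toGeminiTools(tools: OpenAiStyleTool[]): GeminiToolGroup[] {
  if (tools.length === 0) {
    return [];
  }
  return [
    {
      functionDeclarations: tools.map((tool) => ({
        name: tool.function.name,
        description: tool.function.description,
        parameters: tool.function.parameters,
      })),
    },
  ];
}

// Inbound: convert a Gemini functionCall into an OpenAI-style call,
// where arguments are carried as a JSON string.
function fromFunctionCall(call: { name: string; args: Record<string, unknown> }): {
  name: string;
  arguments: string;
} {
  return { name: call.name, arguments: JSON.stringify(call.args) };
}
```

Serializing `args` back to a string keeps downstream code that already handles OpenAI-style tool calls unaware of which flavor produced the call.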
What’s not supported (yet)
- Image / multi-modal input: text only for now
- Embeddings: neither custom flavor implements embed()
- Batch / parallel mode (proxy-specific feature)
Related
- Model Management: providers, topic routing, health checks
- LiteLLM Proxy: route through a proxy instead of per-model endpoints