
Custom Providers

Octipus supports two custom-provider flavors for connecting to LLM endpoints that aren’t backed by a first-party provider class. Pick the one that matches the upstream wire format:

| Flavor | provider value | Wire format | Use for |
|---|---|---|---|
| Custom OpenAI-compatible | custom-openai | OpenAI /v1/chat/completions | vLLM, Together, Groq, Fireworks, DeepInfra, internal OpenAI-shaped proxies |
| Custom Gemini-compatible | custom-gemini | Native Google Gemini (candidates[].content.parts[]) | Vertex AI, Google AI Studio (native), Gemini-fronting proxies |

Both are stateless — configuration lives entirely on the model_config row and is loaded per call. Add as many models against the same upstream as needed; each row points at its own endpoint and key.

A custom-provider model row uses the existing model_config columns plus a metadata.customProvider block:

{
  name: 'tpg-flash',                       // user-facing name (unique)
  modelId: 'gemini-3-flash-preview',       // model id sent upstream
  provider: 'custom-gemini',               // routes to the custom provider
  endpoint: 'https://api.example.com',     // base URL (no trailing slash)
  apiKeyRef: 'tpg_api_key',                // vault entry name (or 'env:VAR_NAME')
  metadata: {
    customProvider: {
      auth: { type: 'bearer' },            // 'bearer' | 'header' | 'query'
      requestEnvelope: 'gemini-blocks-config',
      // pathOverride: '/generate',        // optional, defaults per flavor
      // extraHeaders: { 'X-Org': 'foo' }, // optional
    },
  },
}

| auth.type | Required fields | Wire effect |
|---|---|---|
| bearer | (none) | Authorization: Bearer <key> |
| header | headerName (e.g. x-api-key) | <headerName>: <key> |
| query | paramName (e.g. key) | ?<paramName>=<key> |
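
The three auth types can be pictured as a small function that decorates an outgoing request. This is a minimal sketch, not Octipus's implementation: `AuthConfig`, `AuthedRequest`, and `applyAuth` are hypothetical names introduced here for illustration.

```typescript
// Hypothetical types and function; they only mirror the auth table in this section.
type AuthConfig =
  | { type: 'bearer' }
  | { type: 'header'; headerName: string }
  | { type: 'query'; paramName: string };

interface AuthedRequest {
  url: string;
  headers: Record<string, string>;
}

// Returns the URL and headers with the key attached according to auth.type.
function applyAuth(url: string, key: string, auth: AuthConfig): AuthedRequest {
  const headers: Record<string, string> = {};
  switch (auth.type) {
    case 'bearer':
      headers['Authorization'] = `Bearer ${key}`;
      return { url, headers };
    case 'header':
      headers[auth.headerName] = key;
      return { url, headers };
    case 'query': {
      // Append to an existing query string if the URL already has one.
      const sep = url.includes('?') ? '&' : '?';
      const pair = `${encodeURIComponent(auth.paramName)}=${encodeURIComponent(key)}`;
      return { url: `${url}${sep}${pair}`, headers };
    }
  }
}
```

For example, `applyAuth('https://api.example.com/generate', key, { type: 'header', headerName: 'x-api-key' })` leaves the URL untouched and adds a single `x-api-key` header.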

apiKeyRef is resolved in this order:

  1. env:VAR_NAME prefix → reads process.env.VAR_NAME directly
  2. Vault lookup by name (getVault().getByName('system', apiKeyRef))
  3. Fallback env var: CUSTOM_OPENAI_API_KEY or CUSTOM_GEMINI_API_KEY

Use env: for local development, the vault for shared/production.
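
The resolution order above can be sketched as a single function. Everything here is an illustration under stated assumptions: `resolveApiKey` is a hypothetical name, the environment and the vault are passed in as parameters (standing in for `process.env` and `getVault().getByName('system', ...)`), the vault lookup is assumed to be synchronous, and an `env:` reference is assumed not to fall through to the later steps when the variable is unset.

```typescript
// Hypothetical stand-ins for process.env and the vault lookup.
type Env = Record<string, string | undefined>;
type VaultLookup = (name: string) => string | undefined;
type Flavor = 'custom-openai' | 'custom-gemini';

function resolveApiKey(
  apiKeyRef: string,
  flavor: Flavor,
  env: Env,
  vaultLookup: VaultLookup,
): string | undefined {
  // 1. env:VAR_NAME reads the named variable directly.
  //    Assumption: no fall-through if the variable is missing.
  if (apiKeyRef.startsWith('env:')) {
    return env[apiKeyRef.slice('env:'.length)];
  }

  // 2. Vault lookup by entry name.
  const fromVault = vaultLookup(apiKeyRef);
  if (fromVault !== undefined) return fromVault;

  // 3. Flavor-specific fallback environment variable.
  const fallbackVar =
    flavor === 'custom-openai' ? 'CUSTOM_OPENAI_API_KEY' : 'CUSTOM_GEMINI_API_KEY';
  return env[fallbackVar];
}
```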

The requestEnvelope field controls how the request body is shaped.

standard: native Google Gemini wire format. Path defaults to:

  • POST {endpoint}/v1beta/models/{modelId}:generateContent
  • POST {endpoint}/v1beta/models/{modelId}:streamGenerateContent (streaming)

Body:

{
"contents": [{ "role": "user", "parts": [{ "text": "..." }] }],
"systemInstruction": { "parts": [{ "text": "..." }] },
"generationConfig": { "temperature": 0.7, "maxOutputTokens": 4096 },
"tools": [{ "functionDeclarations": [] }]
}

Use for: the real Google Gemini API and Vertex AI generative endpoints.
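
To make the path and body rules concrete, the sketch below assembles a native-format request from a few inputs. `StandardInput` and `buildStandardRequest` are hypothetical names for illustration only; the URL pattern and field names are the ones listed above, and tool declarations are left out for brevity.

```typescript
// Hypothetical input shape; not an Octipus type.
interface StandardInput {
  endpoint: string; // base URL, no trailing slash
  modelId: string;
  userText: string;
  systemText?: string;
  temperature?: number;
  maxOutputTokens?: number;
  stream?: boolean;
}

function buildStandardRequest(
  input: StandardInput,
): { url: string; body: Record<string, unknown> } {
  // Streaming and non-streaming calls differ only in the action suffix.
  const action = input.stream ? 'streamGenerateContent' : 'generateContent';
  const url = `${input.endpoint}/v1beta/models/${input.modelId}:${action}`;

  const body: Record<string, unknown> = {
    contents: [{ role: 'user', parts: [{ text: input.userText }] }],
  };
  if (input.systemText !== undefined) {
    body['systemInstruction'] = { parts: [{ text: input.systemText }] };
  }

  // Only include generationConfig when at least one setting is present.
  const generationConfig: Record<string, number> = {};
  if (input.temperature !== undefined) generationConfig['temperature'] = input.temperature;
  if (input.maxOutputTokens !== undefined) {
    generationConfig['maxOutputTokens'] = input.maxOutputTokens;
  }
  if (Object.keys(generationConfig).length > 0) body['generationConfig'] = generationConfig;

  return { url, body };
}
```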

gemini-blocks-config: a bespoke envelope used by some Gemini-fronting proxies. It uses a single path (default /generate), Anthropic-style content blocks, and a camelCase config: {} wrapper.

{
"mode": "text",
"model": "gemini-3-flash-preview",
"messages": [{ "role": "user", "content": [{ "type": "text", "text": "..." }] }],
"stream": false,
"config": {
"temperature": 0.7,
"maxTokens": 4096,
"response_schema": { "type": "OBJECT", "properties": {} },
"tools": [{ "name": "...", "description": "...", "parameters": {} }]
}
}

The response shape is identical to that of the standard envelope (native Gemini candidates[]).
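
For comparison with the native format, here is a rough sketch of how the same inputs might be wrapped in this envelope. `BlocksConfigInput` and `buildBlocksConfigRequest` are hypothetical names; `mode` is fixed to 'text' because that is the only value shown in this section, and tools and `response_schema` are omitted.

```typescript
// Hypothetical input shape; not an Octipus type.
interface BlocksConfigInput {
  endpoint: string; // base URL, no trailing slash
  modelId: string;
  userText: string;
  pathOverride?: string;
  temperature?: number;
  maxTokens?: number;
  stream?: boolean;
}

function buildBlocksConfigRequest(input: BlocksConfigInput) {
  // Single path for all calls; /generate unless the row overrides it.
  const path = input.pathOverride ?? '/generate';

  // Generation settings live under a config wrapper.
  const config: Record<string, unknown> = {};
  if (input.temperature !== undefined) config['temperature'] = input.temperature;
  if (input.maxTokens !== undefined) config['maxTokens'] = input.maxTokens;

  return {
    url: `${input.endpoint}${path}`,
    body: {
      mode: 'text', // only value shown in this section
      model: input.modelId,
      // Content is a list of typed blocks rather than Gemini parts.
      messages: [{ role: 'user', content: [{ type: 'text', text: input.userText }] }],
      stream: input.stream ?? false,
      config,
    },
  };
}
```

Note that the model id travels in the body here, whereas the native format carries it in the URL path.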

Streaming behavior by flavor:

  • OpenAI-compat: SSE via the OpenAI SDK. Standard delta.content / delta.tool_calls deltas.
  • Gemini-compat: SSE with data: {gemini-chunk}\n\n events. Each event contains the full chunk shape; text deltas and tool-call deltas are yielded as they arrive.
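
The Gemini-compat event framing can be illustrated with a small buffer parser. This is a sketch under assumptions, not the provider's actual code: `parseSseBuffer` and `chunkText` are hypothetical helpers, the chunk type covers only the fields needed here, and reading bytes from the network is left out.

```typescript
// Minimal view of a Gemini chunk; real chunks carry more fields.
interface GeminiPart {
  text?: string;
}
interface GeminiChunk {
  candidates?: { content?: { parts?: GeminiPart[] } }[];
}

// Splits buffered SSE text into complete events and returns any trailing
// partial event so the caller can prepend it to the next network read.
function parseSseBuffer(buffer: string): { chunks: GeminiChunk[]; rest: string } {
  const events = buffer.split('\n\n');
  const rest = events.pop() ?? ''; // last piece is incomplete (or empty)
  const chunks: GeminiChunk[] = [];
  for (const event of events) {
    for (const line of event.split('\n')) {
      if (!line.startsWith('data:')) continue;
      const payload = line.slice('data:'.length).trim();
      if (payload === '') continue;
      chunks.push(JSON.parse(payload) as GeminiChunk);
    }
  }
  return { chunks, rest };
}

// Concatenates the text parts of the first candidate in a chunk.
function chunkText(chunk: GeminiChunk): string {
  const parts = chunk.candidates?.[0]?.content?.parts ?? [];
  return parts.map((p) => p.text ?? '').join('');
}
```

A caller would append each network read to `rest`, call `parseSseBuffer` again, and yield `chunkText(chunk)` for every completed chunk.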

Both flavors support function/tool calling via the standard Octipus tool schema (OpenAI-style). The Gemini-compat provider translates schemas on the way out (functionDeclarations) and parses functionCall parts on the way in.
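
The two directions of that translation might look roughly like the sketch below. It assumes the Octipus tool schema has the same shape as OpenAI's `{ type: 'function', function: { ... } }` entries, which this section implies but does not spell out; `toGeminiTools`, `extractToolCalls`, and the interfaces are hypothetical names. The `parameters` schema is passed through unchanged, so a real translation may need to adjust it for the upstream.

```typescript
// Assumed OpenAI-style tool entry (hypothetical mirror of the Octipus schema).
interface OpenAiTool {
  type: 'function';
  function: { name: string; description?: string; parameters?: Record<string, unknown> };
}
interface GeminiFunctionDeclaration {
  name: string;
  description?: string;
  parameters?: Record<string, unknown>;
}
interface GeminiTool {
  functionDeclarations: GeminiFunctionDeclaration[];
}
interface GeminiResponsePart {
  text?: string;
  functionCall?: { name: string; args?: Record<string, unknown> };
}
// OpenAI-style call: arguments are carried as a JSON string.
interface ToolCall {
  name: string;
  arguments: string;
}

// Outbound: group all tools under a single functionDeclarations entry.
function toGeminiTools(tools: OpenAiTool[]): GeminiTool[] {
  if (tools.length === 0) return [];
  return [{ functionDeclarations: tools.map((t) => ({ ...t.function })) }];
}

// Inbound: collect functionCall parts and ignore plain text parts.
function extractToolCalls(parts: GeminiResponsePart[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const part of parts) {
    if (part.functionCall) {
      calls.push({
        name: part.functionCall.name,
        arguments: JSON.stringify(part.functionCall.args ?? {}),
      });
    }
  }
  return calls;
}
```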

Not currently supported by either custom flavor:

  • Image / multi-modal input (text only for now)
  • Embeddings (neither custom flavor implements embed())
  • Batch / parallel mode (proxy-specific feature)