Models

Foxl supports models from multiple providers. Use Foxl relay credits, your own API keys (BYOK), or your ChatGPT Plus/Pro subscription (OAuth).

Foxl Relay Models

Included with your Foxl plan. No setup needed - just sign in and chat.

Model	Provider	Context	Best For
Claude Fable 5	Bedrock	1M tokens	State-of-the-art on most benchmarks; ambitious long-running work, advanced vision (Pro/Ultra only)
Claude Opus 4.8	Bedrock	1M tokens	New default; SOTA on SWE-bench Pro / Verified, longer autonomous runs, lower output variance for enterprise workflows
Claude Opus 4.7	Bedrock	1M tokens	Previous flagship; same API surface as 4.8
Claude Opus 4.6	Bedrock	1M tokens	Older Opus; still 1M context
Claude Sonnet 4.6	Bedrock	1M tokens	Best balance of speed and quality
Claude Haiku 4.5	Bedrock	200K tokens	Fastest model, near-frontier intelligence
GLM 5	Bedrock (Z.AI)	200K tokens	Agentic coding, long-horizon tasks
Kimi K2.5	Bedrock (Moonshot)	256K tokens	Agentic coding and reasoning, vision
GPT-5.5	Bedrock (OpenAI)	1M tokens	OpenAI frontier reasoning, on your Foxl credits
GPT-5.4	Bedrock (OpenAI)	1M tokens	General-purpose OpenAI reasoning, on your Foxl credits

GPT-5.5 and GPT-5.4 run on Foxl relay credits through Amazon Bedrock's native OpenAI Responses API - no ChatGPT subscription or OpenAI key required. They are also available via your own ChatGPT subscription (OAuth) below if you prefer to use that instead.

Claude Fable 5 ships with built-in safeguards. Prompts in sensitive domains (cybersecurity, biology, chemistry, health) are automatically answered by Opus 4.8 instead and billed at Opus rates. Fable 5 is a Pro/Ultra model and is not available on the free tier.

Subscription (OAuth) Models

Use your existing subscriptions. Desktop only - see Providers for setup. Foxl calls the vendor API directly with your OAuth token; tool use, adaptive thinking, and streaming all flow through Foxl's normal agent loop.

Claude Code (Anthropic Pro/Max)

Model	Context	Best For
Fable 5 (Claude Code mode)	1M tokens	State-of-the-art; most capable for ambitious, long-running work
Opus 4.8 (Claude Code mode)	1M tokens	New flagship; coding, agentic workflows, long-horizon autonomy
Opus 4.7 (Claude Code mode)	1M tokens	Stable Opus tier; same API surface as 4.8
Sonnet 4.6 (Claude Code mode)	200K tokens	Balanced speed and quality (200K cap on subscription pool)

Claude Code (OAuth) runs in compatibility mode so requests route through your Claude Pro/Max subscription instead of pay-as-you-go "Extra usage". In this mode, Foxl-specific tools (memory, subagents, schedules, channel send, browser extension, view image) are disabled - only Bash, Read, Grep, and WebFetch are available. Haiku 4.5 is not exposed in this mode. For the full Foxl tool surface, use an Anthropic API key (BYOK) or the foxl.ai relay.
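The tool restriction described above can be pictured as a simple allow-list filter. This is only an illustrative sketch: the function name, the tool-name strings for Foxl-specific tools, and the idea that filtering happens on a flat list of names are assumptions, not Foxl's actual implementation.

```python
# Tools the text says remain available in Claude Code (OAuth) compatibility mode.
CLAUDE_CODE_MODE_TOOLS = {"Bash", "Read", "Grep", "WebFetch"}


def available_tools(all_tools, claude_code_mode):
    """Return the tools usable for a session (hypothetical helper).

    In compatibility mode only the allow-listed tools survive; otherwise
    the full tool surface is returned unchanged.
    """
    if not claude_code_mode:
        return list(all_tools)
    return [tool for tool in all_tools if tool in CLAUDE_CODE_MODE_TOOLS]


# "memory" and "subagents" are placeholder names for Foxl-specific tools.
tools = ["Bash", "Read", "Grep", "WebFetch", "memory", "subagents"]
restricted = available_tools(tools, claude_code_mode=True)
```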

OpenAI (ChatGPT Plus/Pro)

Model	Context	Best For
GPT-5.5	1M tokens	Latest frontier, reasoning + streaming
GPT-5.4	1M tokens	General-purpose reasoning
GPT-5.4 Mini	400K tokens	Fast, cost-aware tasks
GPT-5.3 Codex	400K tokens	Code-heavy tasks

Also exposed as a built-in tool: gpt-image-2 via the generate_image tool. No API key, no per-image billing - powered by your ChatGPT Plus/Pro OAuth session. Accepts up to 10 input images for edit / compose mode.

Gemini CLI (Google)

Model	Context	Best For
Gemini 2.5 Pro	1M tokens	Long documents, deep analysis
Gemini 2.5 Flash	1M tokens	Fast, cost-effective

BYOK Models

Bring your own API key to access models from any provider. See AI Providers for setup.

Model	Provider	Context	API Key From
Claude Fable 5	Anthropic	1M tokens	console.anthropic.com
Claude Opus 4.8	Anthropic	1M tokens	console.anthropic.com
Claude Opus 4.7	Anthropic	1M tokens	console.anthropic.com
Claude Opus 4.6	Anthropic	1M tokens	console.anthropic.com
Claude Sonnet 4.6	Anthropic	1M tokens	console.anthropic.com
Claude Haiku 4.5	Anthropic	200K tokens	console.anthropic.com
GPT-4.1	OpenAI	1M tokens	platform.openai.com
GPT-4.1 Mini	OpenAI	1M tokens	platform.openai.com
o3	OpenAI	200K tokens	platform.openai.com
Gemini 2.5 Pro	Google	1M tokens	aistudio.google.com
Gemini 2.5 Flash	Google	1M tokens	aistudio.google.com
Llama 3, Mistral, etc.	Ollama	Varies	Free - ollama.com

Adaptive Thinking

Claude Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6 support adaptive thinking - Claude dynamically decides when and how much to think based on the complexity of your request. No manual budget setting needed.

Simple questions: Claude responds directly without thinking overhead
Complex problems: Claude automatically engages deep reasoning
Agentic workflows: Claude can think between tool calls (interleaved thinking)

Adaptive thinking (type: "adaptive") is the recommended mode for Fable 5, Opus 4.8 / 4.7 / 4.6, and Sonnet 4.6 (Fable 5 accepts only adaptive). Haiku 4.5 uses type: "enabled" with a budget_tokens parameter instead.
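As a rough illustration of the split above, the sketch below picks a thinking block per model. The model labels, the function name, and the default budget are hypothetical; only the `type: "adaptive"` and `type: "enabled"` plus `budget_tokens` shapes come from the text.

```python
# Hypothetical short labels, not confirmed API model IDs.
ADAPTIVE_MODELS = {"fable-5", "opus-4.8", "opus-4.7", "opus-4.6", "sonnet-4.6"}
BUDGET_MODELS = {"haiku-4.5"}


def thinking_config(model, budget_tokens=4096):
    """Return the thinking block to send for a model, or None if unknown.

    Adaptive models take no budget; Haiku 4.5 needs an explicit one.
    The 4096 default is an arbitrary placeholder.
    """
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    if model in BUDGET_MODELS:
        return {"type": "enabled", "budget_tokens": budget_tokens}
    return None


opus_thinking = thinking_config("opus-4.8")
haiku_thinking = thinking_config("haiku-4.5", budget_tokens=2048)
```

The sketch does not model the on/off toggle in the model selector; it only shows which shape applies once thinking is on.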

Opus 4.7 and 4.8 also reject non-default temperature / top_p / top_k values with HTTP 400. Foxl strips those parameters automatically for these two models, so the API falls back to its calibrated defaults.
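The stripping step might look something like the following. It is a minimal sketch under assumptions: the model labels and function name are made up for illustration, and Foxl's real request pipeline may do this differently.

```python
SAMPLING_KEYS = ("temperature", "top_p", "top_k")

# Hypothetical labels for the models that reject custom sampling values.
FIXED_SAMPLING_MODELS = {"opus-4.7", "opus-4.8"}


def strip_sampling(model, params):
    """Return a copy of params without sampling keys for fixed-sampling models."""
    if model not in FIXED_SAMPLING_MODELS:
        return dict(params)
    return {key: value for key, value in params.items() if key not in SAMPLING_KEYS}


request = {"max_tokens": 1024, "temperature": 0.2, "top_p": 0.9}
cleaned = strip_sampling("opus-4.8", request)
untouched = strip_sampling("sonnet-4.6", request)
```

Returning a copy rather than mutating the input keeps the caller's original settings intact if the user later switches to a model that accepts them.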

You can toggle thinking on/off in the model selector. For adaptive-thinking models, Settings also exposes a reasoning effort level (the options come from the selected model's capabilities - up to low / medium / high / xhigh / max); models without adjustable reasoning show no effort selector.

Task Budgets (Beta, Fable 5 and Opus 4.7+)

Settings includes a Task Budget selector (off / 25K / 50K / 100K / 250K / 500K tokens). When set, the model receives an output_config.task_budget and the task-budgets-2026-03-13 beta header on every request, and uses the remaining budget to plan and pace its work across one agentic turn (tool calls + thinking + response).

Advisory, not enforced. The model treats it as guidance, not a hard ceiling.
Minimum 20K when enabled. Anthropic rejects positive values below that.
Relay and Anthropic API only. The budget is sent on the Foxl relay path and the direct Anthropic API path. Bedrock Converse rejects output_config.task_budget ("Extra inputs are not permitted"), so Foxl omits it on the direct-Bedrock route.
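The rules above can be sketched as one request-building step. Several details here are assumptions rather than documented behavior: the function and transport names are hypothetical, the budget is assumed to be a plain integer under `output_config.task_budget`, and the beta flag is assumed to travel in an `anthropic-beta` header.

```python
BETA_FLAG = "task-budgets-2026-03-13"
MIN_TASK_BUDGET = 20_000
# Hypothetical transport labels; Bedrock is deliberately absent.
BUDGET_TRANSPORTS = {"relay", "anthropic"}


def apply_task_budget(body, headers, budget, transport):
    """Return (body, headers) copies with the task budget applied when allowed."""
    body, headers = dict(body), dict(headers)
    # Off (0 or None) or an unsupported transport: send the request unchanged.
    if not budget or transport not in BUDGET_TRANSPORTS:
        return body, headers
    if budget < MIN_TASK_BUDGET:
        raise ValueError("task budget must be at least 20K tokens when enabled")
    output_config = dict(body.get("output_config", {}))
    output_config["task_budget"] = budget  # assumed shape: plain integer
    body["output_config"] = output_config
    headers["anthropic-beta"] = BETA_FLAG  # assumed header name
    return body, headers


relay_body, relay_headers = apply_task_budget({}, {}, 50_000, "relay")
bedrock_body, bedrock_headers = apply_task_budget({}, {}, 50_000, "bedrock")
```

Because the budget is advisory, nothing in this sketch truncates or stops a response; it only decides whether the hint is attached to the request.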

Thinking and Cost

Thinking consumes output tokens. When thinking is enabled, the model may use 2-10x more output tokens depending on task complexity. This directly increases credit cost:

Simple question without thinking: ~0.01 credits (Sonnet)
Same question with thinking: ~0.05-0.15 credits (Sonnet)
Complex reasoning with thinking: ~0.30-1.0 credits (Opus)

For cost-sensitive usage, disable thinking for simple tasks. For complex coding, analysis, or multi-step reasoning, thinking significantly improves quality and is worth the extra cost. See Credits for detailed per-token pricing.

Bring Your Own Key (Desktop Only)

Go to Settings
Select a provider (Anthropic, OpenAI, Google, etc.)
Enter your API key
Select a model from that provider

When using your own keys, no Foxl credits are consumed.

Subscription OAuth

You can use your ChatGPT Plus/Pro subscription directly - no API key needed.

OpenAI OAuth

Run npx @openai/codex login to create ~/.codex/auth.json
Select OpenAI (OAuth) in Settings > Providers - Foxl calls OpenAI's Codex Responses API directly with your subscription token

Foxl auto-detects your credentials. See AI Providers for details on all supported providers.
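To check by hand whether the login step produced the file Foxl looks for, something like the following can be used. This is a sketch only: it tests for the presence of ~/.codex/auth.json and says nothing about how Foxl itself detects or validates credentials; the function names are hypothetical.

```python
from pathlib import Path


def codex_auth_path(home=None):
    """Return the expected location of the Codex auth file."""
    base = Path(home) if home is not None else Path.home()
    return base / ".codex" / "auth.json"


def has_codex_credentials(home=None):
    """True if the auth file exists; does not inspect or validate its contents."""
    return codex_auth_path(home).is_file()


if __name__ == "__main__":
    status = "found" if has_codex_credentials() else "missing"
    print(f"{codex_auth_path()}: {status}")
```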
