Models & Providers Hub

OpenClaw Models & Providers

25+ providers. Local to cloud. Your choice.

OpenClaw is model-agnostic. Plug in any LLM provider, switch models per agent, and set up automatic failovers. This page answers: which models are supported, what they cost, and which one you should pick.


The short answer

Which model should you use?

This is the question we get asked most. The honest answer depends on what you're building, what you're willing to spend, and how much you care about privacy. Here's the quick version.

Best all-around for agent tasks

Anthropic Claude (Sonnet 4.6 or Opus 4.6)

Strongest reasoning and tool-calling. Sonnet for daily use, Opus for complex multi-step work. This is what we use for every client project.

Tightest budget, still cloud-hosted

OpenAI Codex OAuth or Qwen free tier

Codex OAuth lets you reuse your ChatGPT subscription with OpenClaw. Qwen gives you 2,000 free requests per day. Both get you running without any API billing.

Zero API cost, full privacy

Ollama with Llama 3.3 or Qwen 2.5 Coder

Run on your own hardware. Requires a decent GPU for good speed, or CPU inference if you're patient. Data never leaves your machine.

Want to try many models quickly

OpenRouter (200+ models, one API key)

Swap models without managing multiple accounts. Pay provider rates plus a small markup. We use this whenever we're evaluating a new model.

Team with spend controls needed

LiteLLM proxy with virtual keys

Self-hosted proxy. Set max_budget and budget_duration per key. Route to any backend. This is our go-to for team deployments.

Cloud providers

Premium Hosted

These are the heavy hitters. If you want the best reasoning, the most reliable tool calling, and you're okay paying per token, this is where you start. We run Anthropic Claude for every client build at The Operator Vault, and it's what we recommend to anyone doing serious agent work.

Anthropic (Claude)

~$3-75/M tokens (varies by model)

Models: Claude Opus 4.6, Sonnet 4.6, Haiku 4.5

Auth: API key

Claude is our default recommendation for serious agent work. The reasoning quality and tool calling are a step above everything else we've tested. If you're building automations that need to think through multi-step problems, Claude is where you should be.

OpenAI (GPT)

~$0.15-60/M tokens (varies by model)

Models: GPT-5.1 Codex, GPT-4o, GPT-4o-mini

Auth: API key or ChatGPT Codex OAuth

Solid all-around choice, especially if you already have an OpenAI account. Tool calling is strong, and Codex OAuth lets you reuse your existing ChatGPT subscription instead of paying per token. Great as a fallback behind Claude.

Amazon Bedrock

Same as direct API + AWS margin

Models: Claude, Llama, Mistral, and more via AWS

Auth: AWS credentials (IAM role or access key)

If your team already lives in AWS, Bedrock saves you from managing another API key. OpenClaw auto-discovers which models are available in your region. The slight AWS markup is worth it for the unified billing.

Run on your hardware

Local / Self-Hosted

If you want to experiment without spending a dollar, this is where to start. Download a model, point OpenClaw at localhost, and you're running. Quality won't match Claude or GPT for complex tasks, but for simple automations and learning the ropes, local models work surprisingly well. Most of our community members start here.

Ollama

$0/month

Models: Llama 3.3, Qwen 2.5 Coder, Mistral, DeepSeek R1

Auth: None needed (local)

The easiest way to run local models. OpenClaw auto-discovers whatever you have installed, so you can start running agents immediately. Just pull a model and go.
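
Here's what that flow looks like in practice, as a sketch assuming Ollama's standard CLI (the desktop app usually starts the local server for you on localhost:11434):

# Pull a model from the Ollama library.
ollama pull llama3.3
# Start the local server if it isn't already running.
ollama serve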

vLLM

$0/month

Models: Any HuggingFace model you load

Auth: None needed (local)

Better throughput than Ollama for production workloads. If you're running multiple agents or need faster inference, vLLM's OpenAI-compatible API is the move. Takes a bit more setup, but worth it for heavier use.
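
A minimal sketch of serving a model through that OpenAI-compatible endpoint, using one example HuggingFace model name (swap in whatever you want to run; vLLM listens on localhost:8000 by default):

# Install vLLM, then serve a HuggingFace model over an
# OpenAI-compatible API.
pip install vllm
vllm serve Qwen/Qwen2.5-Coder-7B-Instruct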

LM Studio

$0/month

Models: Any GGUF model you download

Auth: None needed (local)

The most visual option. Download models through a clean GUI, then OpenClaw connects via the openai-responses API on localhost:1234. Good for people who prefer clicking over terminal commands.
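
Before pointing OpenClaw at it, you can sanity-check that LM Studio's local server is up. This just lists whatever models it's currently serving on the default port mentioned above:

curl http://localhost:1234/v1/models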

Model brokers

Aggregators / Gateways

Think of these as model brokers. Instead of managing API keys for every provider, you get one key that routes to whatever model you need. We lean on OpenRouter for testing and LiteLLM for team deployments where we need spend controls. If you're not sure which model to commit to, start here.

OpenRouter

Provider pricing + small markup

Models: 200+ models across all major providers

Auth: OpenRouter API key

We use OpenRouter when testing new models before committing to a provider. One API key, instant access to 200+ models. The markup is small and worth it for the flexibility. Swap between Claude, GPT, Llama, and dozens of others without touching your config.

LiteLLM

Self-hosted proxy, you control billing

Models: Any model behind your LiteLLM proxy

Auth: LiteLLM API key

This is what we recommend for teams. Set up virtual keys with spend limits so each team member or project has its own budget. You control the proxy, you see every request, and you can route to any backend provider. Takes 10 minutes to set up, saves headaches later.
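
Virtual keys are created through the proxy itself. A minimal sketch, assuming your proxy runs on localhost:4000 (LiteLLM's default) and you've set a master key; the budget numbers are just examples:

# Create a virtual key capped at $25, resetting every 30 days.
curl http://localhost:4000/key/generate \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"max_budget": 25, "budget_duration": "30d"}'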

Hugging Face Inference

Free tier available, then pay-per-use

Models: Open-source models hosted by HF

Auth: HF token (fine-grained, with inference permission)

A solid free option for testing open-source models without running them locally. Use the :fastest or :cheapest suffix to let HF pick the best backend. Good for experimenting before you decide whether to run the model yourself.
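
A sketch of that setup; the model ref below is a hypothetical example, since any HF-hosted model follows the same huggingface/(model-name) pattern:

# Fine-grained token with inference permission, per the Auth note above.
export HF_TOKEN=hf_xxxxxxxxxxxx
# Hypothetical model ref showing the :fastest suffix:
#   huggingface/meta-llama/Llama-3.3-70B-Instruct:fastest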

Data stays yours

Privacy-First

Some projects require that data never touches a third-party server. Whether it's client confidentiality, regulatory requirements, or personal preference, these options keep everything under your control. We've used Venice AI for client work where data can't leave a controlled environment, and Ollama for fully air-gapped setups.

Venice AI

Credit-based (private models cheaper)

Models: Llama, Qwen, DeepSeek, Mistral (private mode) + Claude, GPT (anonymized)

Auth: Venice API key

Venice is interesting for privacy-conscious operators. We've tested it for client work where data can't leave a controlled environment. Their private mode is zero-logging, and their anonymized mode strips metadata before proxying to major providers. A good middle ground between privacy and model quality.

Ollama (local)

$0

Models: Any model you download

Auth: None

The ultimate privacy option. Nothing leaves your machine, period. No API calls, no logging, no third parties. If you're handling sensitive data and need a complete air gap, run Ollama on a machine that never touches the internet after you download the model.

Use what you already pay for

Subscription Reuse

Already paying for a ChatGPT subscription or have access to Qwen's free tier? You can use those with OpenClaw instead of paying for a separate API key. This is the most budget-friendly way to run cloud-hosted models.

OpenAI Codex OAuth

Your ChatGPT subscription

Models: GPT-5.1 Codex, GPT-4o

Auth: OAuth login via openclaw onboard --auth-choice openai-codex

If you're already paying for ChatGPT Plus or Pro, this lets you use that subscription with OpenClaw instead of paying for API tokens on top. No API key needed. Just authenticate through OAuth and you're running. One of the easiest ways to get started.

Qwen OAuth

Free tier: 2,000 requests/day

Models: Qwen Coder, Qwen Vision

Auth: Device-code OAuth (plugin required)

Genuinely free and surprisingly capable. Run openclaw plugins enable qwen-portal-auth to activate it, authenticate via device code, and you get 2,000 requests per day at zero cost. A great starting point if you want to test OpenClaw without any financial commitment.
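
The sequence, as a sketch (the enable command is from the docs above; choosing Qwen during onboarding is our assumption about where the device-code prompt appears):

# Enable the Qwen portal auth plugin, then re-run onboarding and
# pick Qwen when prompted to kick off the device-code flow.
openclaw plugins enable qwen-portal-auth
openclaw onboard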

Reliability

Automatic model failover

Provider outages happen. Rate limits hit at the worst times. Instead of manually switching models when something breaks, you can set up a chain. OpenClaw tries your first choice, and if that fails, automatically moves to the next. We set this up on every production deployment; the last thing you want is an agent going dark at 2 AM because one API had a hiccup.

openclaw.json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4-6",
        "fallbacks": [
          "openai/gpt-4o",
          "ollama/llama3.3"
        ]
      }
    }
  }
}

1. Try the primary model: anthropic/claude-sonnet-4-6
2. If Anthropic is down or rate-limited, try the first fallback: openai/gpt-4o
3. If OpenAI also fails, fall back to local: ollama/llama3.3 (no API, no downtime)

The real tradeoff

Local vs Hosted models

This is the first decision you need to make. Both paths work. The right choice depends on your budget, your hardware, and how much you care about model quality versus privacy. Here's an honest breakdown from what we've seen building automations for clients.

Hosted (API)

Claude, GPT, and Gemini run on massive GPU clusters managed by their providers. You pay per token, and the quality is consistently high. For complex agent tasks (multi-step reasoning, tool calling chains, long context analysis), hosted models are still significantly ahead of local alternatives. This is what we use for every client build, and what we recommend if your budget allows it.

+ Best reasoning and tool-calling quality
+ No GPU or high-RAM hardware needed
+ Always up-to-date with latest model releases
– Pay-per-use: $5-20+/month for typical agent usage
– Data leaves your machine (check provider privacy policies)

Local (Ollama / vLLM)

Open-source models run on your own hardware. Zero API costs, full privacy, and no rate limits. The tradeoff is compute: you need a capable GPU for fast inference, or accept slower CPU-only speeds. Tool calling support varies by model. If you're just learning OpenClaw or running simple automations, local models are a great way to start without spending anything.

+ Zero ongoing API costs
+ Complete data privacy, nothing leaves your network
+ No rate limits or provider downtime
– Requires GPU ($500+) or high RAM (32 GB+) for good speed
– Tool calling quality lags behind Claude and GPT for now

For most of our clients, we start with Claude Sonnet as the primary model and set up Ollama as a local fallback. This gives you the best reasoning when it matters, with zero-cost coverage when the API is down. If you're just starting out and want to keep costs at zero, go Ollama-only and upgrade when you need more capability.
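
In config terms, that recommended starting point is just a two-entry version of the failover example above:

openclaw.json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4-6",
        "fallbacks": ["ollama/llama3.3"]
      }
    }
  }
}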

Setup reference

How to configure each provider

Setting up a provider takes about 30 seconds. Set the environment variable, pick the model reference, and you're running. The onboarding wizard handles most of this automatically, but here's the cheat sheet if you want to configure things manually.

Provider | Env Variable | Model Ref Example
Anthropic | ANTHROPIC_API_KEY | anthropic/claude-opus-4-6
OpenAI | OPENAI_API_KEY | openai/gpt-5.1-codex
Ollama | OLLAMA_API_KEY="ollama-local" | ollama/llama3.3
OpenRouter | OPENROUTER_API_KEY | openrouter/anthropic/claude-sonnet-4-5
Venice AI | VENICE_API_KEY | venice/llama-3.3-70b
LiteLLM | LITELLM_API_KEY | (your proxy model ref)
Amazon Bedrock | AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY | amazon-bedrock/us.anthropic.claude-opus-4-6-v1:0
Hugging Face | HF_TOKEN | huggingface/(model-name)
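
For example, the Anthropic row comes down to a single export (the key value is a placeholder, and the restart command is hypothetical; restart the Gateway however you normally run it):

# Set the key, then restart the Gateway so it picks up the change.
export ANTHROPIC_API_KEY=sk-ant-xxxxxxxx
# Hypothetical restart command; use whatever matches your setup:
# openclaw gateway restart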

Most people overthink provider setup. Pick one, set the env var, restart the Gateway. You can always swap later. If you are paralyzed by choice, start with Anthropic (best quality) or Ollama (free). You can add the others in 30 seconds whenever you want.

Kevin Jeppesen, Founder of The Operator Vault

Written by

Kevin Jeppesen

Founder, The Operator Vault

Kevin is an early OpenClaw adopter who has saved an estimated 400 to 500 hours through AI automation. He stress-tests new workflows daily, sharing what actually works through step-by-step guides and a security-conscious approach to operating AI with real tools.


Configure your first AI model in the Workshop.

Our free course walks you through OpenClaw setup, including model provider configuration. One hour, 100% free, lifetime access.
