What AI models does OpenClaw support?

OpenClaw supports 25+ model providers including Anthropic (Claude), OpenAI (GPT), Google Gemini, Ollama (local), vLLM, LM Studio, Amazon Bedrock, OpenRouter, Venice AI, LiteLLM, Hugging Face, MiniMax, Moonshot, Together AI, NVIDIA, and more. Any OpenAI-compatible or Anthropic-compatible API endpoint can be added as a custom provider.

Can I run OpenClaw with free AI models?

Yes. Ollama and vLLM let you run open-source models like Llama 3.3 and Qwen on your own hardware at zero cost. Qwen offers a free tier with 2,000 requests per day via OAuth. Hugging Face has a free inference tier. You can also use OpenAI Codex OAuth to reuse your existing ChatGPT subscription.

How do I switch between AI models in OpenClaw?

Set the model in your agent config with the format provider/model-name (for example, anthropic/claude-opus-4-6 or ollama/llama3.3). You can configure different models per agent and set up automatic fallbacks so if one provider is down, OpenClaw switches to another.

What is the cheapest way to run OpenClaw?

The cheapest hosted option is to reuse an existing ChatGPT subscription via Codex OAuth, or use Qwen's free tier with 2,000 requests per day. The cheapest overall option is running local models through Ollama on hardware you already own, which costs $0 per month in API fees.

Does OpenClaw work with local models like Ollama?

Yes. OpenClaw auto-discovers tool-capable models from your local Ollama installation. Point it at http://127.0.0.1:11434 and it pulls available models automatically. Models like Llama 3.3, Qwen 2.5 Coder, and Mistral work well for agent tasks.

Can I use multiple AI providers at the same time?

Yes. OpenClaw supports model failover with a fallbacks array in your agent config. You can set a primary model (like Claude Opus) with fallbacks to cheaper or local models. Different agents on the same Gateway can use different providers entirely.

Models & Providers Hub

OpenClaw Models & Providers

25+ providers. Local to cloud. Your choice.

OpenClaw is model-agnostic. Plug in any LLM provider, switch models per agent, and set up automatic failovers. This page answers: which models are supported, what they cost, and which one you should pick.

Start the Free Course Deployment Guide

The short answer

Which model should you use?

This is the question we get asked most. The honest answer depends on what you're building, what you're willing to spend, and how much you care about privacy. Here's the quick version.

“Best all-around for agent tasks”

Anthropic Claude (Sonnet 4.6 or Opus 4.6)

Strongest reasoning and tool-calling. Sonnet for daily use, Opus for complex multi-step work. This is what we use for every client project.

“Tightest budget, still cloud-hosted”

OpenAI Codex OAuth or Qwen free tier

Codex OAuth lets you reuse your ChatGPT subscription with OpenClaw. Qwen gives you 2,000 free requests per day. Both get you running without any API billing.

“Zero API cost, full privacy”

Ollama with Llama 3.3 or Qwen 2.5 Coder

Run on your own hardware. Requires a decent GPU for good speed, or CPU inference if you're patient. Data never leaves your machine.

“Want to try many models quickly”

OpenRouter (200+ models, one API key)

Swap models without managing multiple accounts. Pay provider rates plus a small markup. We use this whenever we're evaluating a new model.

“Team with spend controls needed”

LiteLLM proxy with virtual keys

Self-hosted proxy. Set max_budget and budget_duration per key. Route to any backend. This is our go-to for team deployments.

Cloud providers

Premium Hosted

These are the heavy hitters. If you want the best reasoning, the most reliable tool calling, and you're okay paying per token, this is where you start. We run Anthropic Claude for every client build at The Operator Vault, and it's what we recommend to anyone doing serious agent work.

Anthropic (Claude)

~$3-75/M tokens (varies by model)

Models: Claude Opus 4.6, Sonnet 4.6, Haiku 4.5

Auth: API key

Claude is our default recommendation for serious agent work. The reasoning quality and tool calling are a step above everything else we've tested. If you're building automations that need to think through multi-step problems, Claude is where you should be.

OpenAI (GPT)

~$0.15-60/M tokens (varies by model)

Models: GPT-5.1 Codex, GPT-4o, GPT-4o-mini

Auth: API key or ChatGPT Codex OAuth

Solid all-around choice, especially if you already have an OpenAI account. Tool calling is strong, and Codex OAuth lets you reuse your existing ChatGPT subscription instead of paying per token. Great as a fallback behind Claude.

Amazon Bedrock

Same as direct API + AWS margin

Models: Claude, Llama, Mistral, and more via AWS

Auth: AWS credentials (IAM role or access key)

If your team already lives in AWS, Bedrock saves you from managing another API key. OpenClaw auto-discovers which models are available in your region. The slight AWS markup is worth it for the unified billing.

Run on your hardware

Local / Self-Hosted

If you want to experiment without spending a dollar, this is where to start. Download a model, point OpenClaw at localhost, and you're running. Quality won't match Claude or GPT for complex tasks, but for simple automations and learning the ropes, local models work surprisingly well. Most of our community members start here.

Ollama

$0/month

Models: Llama 3.3, Qwen 2.5 Coder, Mistral, DeepSeek R1

Auth: None needed (local)

The easiest way to run local models. OpenClaw auto-discovers whatever you have installed and you can start running agents immediately. Most of our community members start here when they want to try local inference. Just pull a model and go.

vLLM

$0/month

Models: Any HuggingFace model you load

Auth: None needed (local)

Better throughput than Ollama for production workloads. If you're running multiple agents or need faster inference, vLLM's OpenAI-compatible API is the move. Takes a bit more setup, but worth it for heavier use.

LM Studio

$0/month

Models: Any GGUF model you download

Auth: None needed (local)

The most visual option. Download models through a clean GUI, then OpenClaw connects via the openai-responses API on localhost:1234. Good for people who prefer clicking over terminal commands.

Model brokers

Aggregators / Gateways

Think of these as model brokers. Instead of managing API keys for every provider, you get one key that routes to whatever model you need. We lean on OpenRouter for testing and LiteLLM for team deployments where we need spend controls. If you're not sure which model to commit to, start here.

OpenRouter

Provider pricing + small markup

Models: 200+ models across all major providers

Auth: OpenRouter API key

We use OpenRouter when testing new models before committing to a provider. One API key, instant access to 200+ models. The markup is small and worth it for the flexibility. Swap between Claude, GPT, Llama, and dozens of others without touching your config.

LiteLLM

Self-hosted proxy, you control billing

Models: Any model behind your LiteLLM proxy

Auth: LiteLLM API key

This is what we recommend for teams. Set up virtual keys with spend limits so each team member or project has its own budget. You control the proxy, you see every request, and you can route to any backend provider. Takes 10 minutes to set up, saves headaches later.

Hugging Face Inference

Free tier available, then pay-per-use

Models: Open-source models hosted by HF

Auth: HF token (fine-grained, with inference permission)

A solid free option for testing open-source models without running them locally. Use the :fastest or :cheapest suffix to let HF pick the best backend. Good for experimenting before you decide whether to run the model yourself.

Data stays yours

Privacy-First

Some projects require that data never touches a third-party server. Whether it's client confidentiality, regulatory requirements, or personal preference, these options keep everything under your control. We've used Venice AI for client work where data can't leave a controlled environment, and Ollama for fully air-gapped setups.

Venice AI

Credit-based (private models cheaper)

Models: Llama, Qwen, DeepSeek, Mistral (private mode) + Claude, GPT (anonymized)

Auth: Venice API key

Venice is interesting for privacy-conscious operators. We've tested it for client work where data can't leave a controlled environment. Their private mode is zero-logging, and their anonymized mode strips metadata before proxying to major providers. A good middle ground between privacy and model quality.

Ollama (local)

Models: Any model you download

Auth: None

The ultimate privacy option. Nothing leaves your machine, period. No API calls, no logging, no third parties. If you're handling sensitive data and need a complete air gap, run Ollama on a machine that never touches the internet after you download the model.

Use what you already pay for

Subscription Reuse

Already paying for a ChatGPT subscription or have access to Qwen's free tier? You can use those with OpenClaw instead of paying for a separate API key. This is the most budget-friendly way to run cloud-hosted models.

OpenAI Codex OAuth

Your ChatGPT subscription

Models: GPT-5.1 Codex, GPT-4o

Auth: OAuth login via openclaw onboard --auth-choice openai-codex

If you're already paying for ChatGPT Plus or Pro, this lets you use that subscription with OpenClaw instead of paying for API tokens on top. No API key needed. Just authenticate through OAuth and you're running. One of the easiest ways to get started.

Qwen OAuth

Free tier: 2,000 requests/day

Models: Qwen Coder, Qwen Vision

Auth: Device-code OAuth (plugin required)

Genuinely free and surprisingly capable. Run openclaw plugins enable qwen-portal-auth to activate it, authenticate via device code, and you get 2,000 requests per day at zero cost. A great starting point if you want to test OpenClaw without any financial commitment.

Reliability

Automatic model failover

Provider outages happen. Rate limits hit at the worst times. Instead of manually switching models when something breaks, you can set up a chain. OpenClaw tries your first choice, and if that fails, automatically moves to the next. We set this up on every production deployment because the last thing you want is an agent going dark at 2 AM because one API had a hiccup.

openclaw.json

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4-6",
        "fallbacks": [
          "openai/gpt-4o",
          "ollama/llama3.3"
        ]
      }
    }
  }
}

Try primary model

anthropic/claude-sonnet-4-6

If Anthropic is down or rate-limited, try first fallback

openai/gpt-4o

If OpenAI also fails, fall back to local

ollama/llama3.3 (no API, no downtime)

The real tradeoff

Local vs Hosted models

This is the first decision you need to make. Both paths work. The right choice depends on your budget, your hardware, and how much you care about model quality versus privacy. Here's an honest breakdown from what we've seen building automations for clients.

Hosted (API)

Claude, GPT, and Gemini run on massive GPU clusters managed by their providers. You pay per token, and the quality is consistently high. For complex agent tasks (multi-step reasoning, tool calling chains, long context analysis), hosted models are still significantly ahead of local alternatives. This is what we use for every client build, and what we recommend if your budget allows it.

Best reasoning and tool-calling quality

No GPU or high-RAM hardware needed

Always up-to-date with latest model releases

Pay-per-use: $5-20+/month for typical agent usage

Data leaves your machine (check provider privacy policies)

Local (Ollama / vLLM)

Open-source models run on your own hardware. Zero API costs, full privacy, and no rate limits. The tradeoff is compute: you need a capable GPU for fast inference, or accept slower CPU-only speeds. Tool calling support varies by model. If you're just learning OpenClaw or running simple automations, local models are a great way to start without spending anything.

Zero ongoing API costs

Complete data privacy, nothing leaves your network

No rate limits or provider downtime

Requires GPU ($500+) or high RAM (32 GB+) for good speed

Tool calling quality lags behind Claude and GPT for now

For most of our clients, we start with Claude Sonnet as the primary model and set up Ollama as a local fallback. This gives you the best reasoning when it matters, with zero-cost coverage when the API is down. If you're just starting out and want to keep costs at zero, go Ollama-only and upgrade when you need more capability.

Setup reference

How to configure each provider

Setting up a provider takes about 30 seconds. Set the environment variable, pick the model reference, and you're running. The onboarding wizard handles most of this automatically, but here's the cheat sheet if you want to configure things manually.

Provider	Env Variable	Model Ref Example
Anthropic	ANTHROPIC_API_KEY	anthropic/claude-opus-4-6
OpenAI	OPENAI_API_KEY	openai/gpt-5.1-codex
Ollama	OLLAMA_API_KEY="ollama-local"	ollama/llama3.3
OpenRouter	OPENROUTER_API_KEY	openrouter/anthropic/claude-sonnet-4-5
Venice AI	VENICE_API_KEY	venice/llama-3.3-70b
LiteLLM	LITELLM_API_KEY	(your proxy model ref)
Amazon Bedrock	AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY	amazon-bedrock/us.anthropic.claude-opus-4-6-v1:0
Hugging Face	HF_TOKEN	huggingface/(model-name)

Most people overthink provider setup. Pick one, set the env var, restart the Gateway. You can always swap later. If you are paralyzed by choice, start with Anthropic (best quality) or Ollama (free). You can add the others in 30 seconds whenever you want.

Written by

Kevin Jeppesen

Founder, The Operator Vault

Kevin is an early OpenClaw adopter who has saved an estimated 400 to 500 hours through AI automation. He stress-tests new workflows daily, sharing what actually works through step-by-step guides and a security-conscious approach to operating AI with real tools.

Models FAQ

Common questions about models

Configure your first
AI model in the Workshop.

Our free course walks you through OpenClaw setup, including model provider configuration. One hour, 100% free, lifetime access.

Start the Free Course Deploy OpenClaw

Deploy OpenClaw Start Here Installation Guide Docker Setup Cloud VPS Setup Security Guide Workshop

OpenClaw Models & Providers

Which model should you use?

Premium Hosted

Anthropic (Claude)

OpenAI (GPT)

Amazon Bedrock

Local / Self-Hosted

Ollama

vLLM

LM Studio

Aggregators / Gateways

OpenRouter

LiteLLM

Hugging Face Inference

Privacy-First

Venice AI

Ollama (local)

Subscription Reuse

OpenAI Codex OAuth

Qwen OAuth

Automatic model failover

Local vs Hosted models

Hosted (API)

Local (Ollama / vLLM)

How to configure each provider

Kevin Jeppesen

Common questions about models

Configure your firstAI model in the Workshop.

Configure your first
AI model in the Workshop.