Connecting to Other Providers
1. Installing Claude Code
2. Using the Claude Code TUI
3. CLAUDE.md and Settings
4. Skills, Agents, and MCP Servers
5. Hooks
6. Plugins and Marketplaces
7. Agent Teams
8. Connecting to Other Providers
Introduction
By overriding the base URL and auth token, you can point Claude Code at any provider that exposes an Anthropic-compatible endpoint. Many providers and API aggregators do this, and even local servers like Ollama can translate on the fly. This guide covers all the ways to connect Claude Code to third-party providers.
The key principle: keep your default claude command connected to your Anthropic subscription. Use wrapper scripts for everything else. This way you never accidentally burn through a third-party API budget when you meant to use your Max subscription, or vice versa.
If you haven’t read the earlier guides, start with Installing Claude Code.
The Environment Variables
Claude Code’s provider configuration is driven entirely by environment variables. Here are the ones that matter:
| Variable | Purpose |
|---|---|
| `ANTHROPIC_BASE_URL` | Override the API endpoint. Set this to point Claude Code at a different provider. |
| `ANTHROPIC_AUTH_TOKEN` | Bearer token sent in the `Authorization` header. Use this for providers that expect a bearer token. |
| `ANTHROPIC_API_KEY` | API key sent via the `x-api-key` header. This is what Anthropic’s own API uses. |
| `ANTHROPIC_MODEL` | Override the default model name. |
| `ANTHROPIC_DEFAULT_OPUS_MODEL` | Map Opus-tier requests to a specific model name. |
| `ANTHROPIC_DEFAULT_SONNET_MODEL` | Map Sonnet-tier requests to a specific model name. |
| `ANTHROPIC_DEFAULT_HAIKU_MODEL` | Map Haiku-tier requests to a specific model name. |
| `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC` | Set to `1` to disable telemetry and update checks. Required for local providers. |
| `API_TIMEOUT_MS` | Request timeout in milliseconds. Increase for slow providers. |
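As a minimal illustration of the override pattern (the provider URL and token below are placeholders, not real values):

```shell
# Point one shell session at a hypothetical Anthropic-compatible provider.
export ANTHROPIC_BASE_URL="https://api.example-provider.com/v1"   # placeholder URL
export ANTHROPIC_AUTH_TOKEN="your-token-here"                     # placeholder token

# Any claude launched from this shell now talks to that provider:
# claude
```

Because the variables are exported only in this shell session, every other terminal keeps using your Anthropic subscription.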
Warning: adding `ANTHROPIC_BASE_URL` or `ANTHROPIC_AUTH_TOKEN` to your `.bashrc` or `.zshrc` will override your Anthropic subscription for every `claude` invocation. Keep these variables scoped to wrapper scripts or per-project settings instead.

AWS Bedrock
Claude Code has built-in Bedrock support — no URL hacking required:
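A sketch of what that looks like in practice. The switch is the `CLAUDE_CODE_USE_BEDROCK` environment variable from Anthropic's docs; the model ID, region, and profile name below are example values you'd replace with your own:

```shell
# Enable Claude Code's built-in Bedrock mode
export CLAUDE_CODE_USE_BEDROCK=1
export ANTHROPIC_MODEL="us.anthropic.claude-sonnet-4-20250514-v1:0"  # example Bedrock model ID
export AWS_REGION="us-east-1"
export AWS_PROFILE="my-bedrock-profile"   # hypothetical named AWS CLI profile

# claude   # launches against Bedrock using your AWS credentials
```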
The relevant environment variables:
| Variable | Purpose |
|---|---|
| `ANTHROPIC_MODEL` | The Bedrock model ID (e.g. `us.anthropic.claude-sonnet-4-20250514-v1:0`) |
| `AWS_REGION` | AWS region (e.g. `us-east-1`) |
| `AWS_PROFILE` | Named AWS CLI profile to use for credentials |
Bedrock uses your existing AWS credentials (from ~/.aws/credentials, environment variables, or IAM roles), so there’s no separate API key to manage. Make sure your IAM user or role has bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream permissions.
Google Vertex AI
Also built-in:
| Variable | Purpose |
|---|---|
| `CLOUD_ML_REGION` | GCP region (e.g. `us-east5`) |
| `ANTHROPIC_VERTEX_PROJECT_ID` | Your GCP project ID |
Like Bedrock, Vertex uses your existing GCP credentials — authenticate with gcloud auth application-default login if you haven’t already.
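As with Bedrock, a wrapper can flip this on with a single variable (`CLAUDE_CODE_USE_VERTEX`, per Anthropic's docs); the project ID below is a placeholder:

```shell
# Enable Claude Code's built-in Vertex AI mode
export CLAUDE_CODE_USE_VERTEX=1
export CLOUD_ML_REGION="us-east5"
export ANTHROPIC_VERTEX_PROJECT_ID="my-gcp-project"   # placeholder project ID

# claude   # launches against Vertex with your application-default credentials
```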
Azure AI
Azure doesn’t have built-in provider support, but you can point Claude Code at your Azure AI endpoint using the base URL override:
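A sketch of the override. The exact endpoint path depends on your Azure deployment, so treat the URL shape below as an assumption to verify against Azure's documentation:

```shell
export ANTHROPIC_BASE_URL="https://your-resource.services.ai.azure.com"  # assumed URL shape; verify for your deployment
export ANTHROPIC_AUTH_TOKEN="your-azure-api-key"                         # placeholder key

# claude --model your-deployed-model
```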
Replace your-resource with your Azure AI Services resource name.
OpenRouter
OpenRouter aggregates many model providers behind a single API. Point Claude Code at it like this:
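Something like the following. The endpoint path here is my assumption (check OpenRouter's docs for their current Anthropic-compatible path), and the model name is just an example of their `provider/model` convention:

```shell
export ANTHROPIC_BASE_URL="https://openrouter.ai/api/v1"   # assumed path; confirm in OpenRouter's docs
export ANTHROPIC_AUTH_TOKEN="sk-or-your-key"               # placeholder OpenRouter key

# claude --model anthropic/claude-sonnet-4   # example provider/model name
```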
OpenRouter uses bearer token auth, so use ANTHROPIC_AUTH_TOKEN (not ANTHROPIC_API_KEY). The model names follow OpenRouter’s provider/model naming convention.
z.ai
Same pattern as OpenRouter — different base URL, different token:
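A sketch, assuming z.ai's Anthropic-compatible endpoint (verify the URL in their docs; the token is a placeholder):

```shell
export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"  # assumed endpoint; verify in z.ai's docs
export ANTHROPIC_AUTH_TOKEN="your-zai-key"                  # placeholder key
export API_TIMEOUT_MS="3000000"                             # 50 minutes, for long operations
```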
z.ai serves the GLM family of models. You can use the model mapping variables to map Claude’s model tiers to z.ai’s model names:
example
export ANTHROPIC_DEFAULT_OPUS_MODEL="GLM-5"
export ANTHROPIC_DEFAULT_SONNET_MODEL="GLM-4.7"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="GLM-4.5-Air"

With these mappings set, Claude Code’s model switcher (/model) and the --model flag work transparently — requesting “Opus” actually sends the request for GLM-5, and so on.
Setting `API_TIMEOUT_MS` to a higher value (the example above uses 3,000,000 ms / 50 minutes) prevents Claude Code from timing out on long operations.

Ollama (Local Models)
Ollama lets you run models locally. Claude Code can talk to it directly, since recent Ollama versions expose an Anthropic-compatible API:
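The connection settings mirror the wrapper script shown later in this guide; the model name is an example you'd replace with whatever you've pulled:

```shell
export ANTHROPIC_AUTH_TOKEN="ollama"                 # any non-empty string; Ollama ignores it
export ANTHROPIC_BASE_URL="http://localhost:11434"
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1    # don't phone home when running locally

# claude --model glm-4.7-flash   # must match a model you've pulled with `ollama pull`
```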
A few things to note:
- `ANTHROPIC_AUTH_TOKEN` needs a non-empty value, but Ollama doesn’t actually check it. Any string works.
- `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1` is required. Without it, Claude Code tries to contact Anthropic’s servers for telemetry and update checks, which will fail or leak information you don’t want leaking when running locally.
- `--model` must match the model name you’ve pulled in Ollama (`ollama pull glm-4.7-flash`).
The Wrapper Script Pattern
Here’s the approach I recommend: leave your claude command untouched (connected to your Anthropic subscription), and create small wrapper scripts for each provider.
Here’s a template based on the pattern I use:
~/.local/bin/my-provider
#!/bin/bash
# Launch Claude Code with a third-party provider

# Parse flags
LOCAL_MODE=""
ARGS=()
for arg in "$@"; do
    if [ "$arg" = "--local" ]; then
        LOCAL_MODE="local"
    else
        ARGS+=("$arg")
    fi
done

if [ "$LOCAL_MODE" = "local" ]; then
    # Local Ollama mode
    export ANTHROPIC_AUTH_TOKEN="ollama"
    export ANTHROPIC_BASE_URL="http://localhost:11434"
    export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
    exec claude --model your-local-model "${ARGS[@]}"
else
    # Remote API mode
    # Pull the API key from a secrets manager (1Password shown here)
    export ANTHROPIC_AUTH_TOKEN="$(op item get 'Your Item' --reveal --field 'API Key')"
    if [ -z "$ANTHROPIC_AUTH_TOKEN" ]; then
        echo "Error: Failed to retrieve API key"
        exit 1
    fi
    export ANTHROPIC_BASE_URL="https://api.your-provider.com/v1"
    export API_TIMEOUT_MS="3000000"
    # Optional: map model tiers to provider-specific names
    export ANTHROPIC_DEFAULT_OPUS_MODEL="provider-large"
    export ANTHROPIC_DEFAULT_SONNET_MODEL="provider-medium"
    export ANTHROPIC_DEFAULT_HAIKU_MODEL="provider-small"
    exec claude "${ARGS[@]}"
fi

Make it executable and drop it in your PATH:

chmod +x ~/.local/bin/my-provider
Now you have two commands:
- `claude` — your Anthropic subscription, untouched
- `my-provider` — your third-party provider
- `my-provider --local` — local Ollama
The key details:
- `exec claude` replaces the wrapper process with Claude Code, so you don’t leave a dangling shell process.
- Secrets stay out of the script. The example uses the 1Password CLI (`op`), but any secrets manager works — `pass`, Bitwarden CLI, AWS Secrets Manager, or even `gpg`-encrypted files. The point is: don’t hardcode API keys in the script.
- Flags control the mode. Adding `--local` switches to Ollama. You can add as many modes as you need — `--qwen3`, `--bedrock`, whatever makes sense for your setup.
- All extra arguments pass through. `"${ARGS[@]}"` forwards everything else to `claude`, so flags like `--model`, `--print`, `-p`, and `--allowedTools` all work normally.
Tip: you can define multiple local wrapper modes, e.g. `--local` for one model and `--qwen3-local` for another, each setting a different `--model` value.

Per-Project Provider Overrides
If a specific project always uses a particular provider, you can set the environment variables in that project’s settings.json instead of using a wrapper script:
.claude/settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.your-provider.com/v1",
    "ANTHROPIC_AUTH_TOKEN": "your-token",
    "API_TIMEOUT_MS": "3000000"
  }
}

This only affects Claude Code sessions started in that project directory. Your global claude command remains untouched everywhere else.
Unlike a wrapper script, settings.json can’t call out to a secrets manager. If you use this approach, be careful not to commit the file with real API keys. Add .claude/settings.json to your .gitignore, or use a .claude/settings.local.json file, which is local-only by convention.

Custom Status Lines for Multiple Providers
When you’re switching between providers, it’s useful to see at a glance which one you’re connected to and how much of your quota you’ve used. Claude Code’s statusLine setting supports custom shell commands that can detect the active provider and display relevant usage data.
Enabling the Status Line
Add a statusLine block to your settings.json:
~/.claude/settings.json
{
  "statusLine": {
    "type": "command",
    "command": "bash ~/.claude/statusline-command.sh"
  }
}

Claude Code runs this command periodically and displays the output at the bottom of the TUI. The script receives a JSON blob on stdin containing session data — context window usage, workspace info, and more.
Detecting the Active Provider
The wrapper script pattern makes provider detection straightforward. Since each wrapper sets different environment variables, your status line script can check those to determine the mode:
~/.claude/statusline-command.sh
#!/bin/bash
# ANSI color codes
RED='\033[31m'
GREEN='\033[32m'
YELLOW='\033[33m'
BLUE='\033[34m'
MAGENTA='\033[35m'
CYAN='\033[36m'
RESET='\033[0m'
DIM='\033[2m'
# Detect mode based on environment variables set by wrapper scripts
if [[ "$ANTHROPIC_AUTH_TOKEN" == "ollama"* ]] || [[ "$ANTHROPIC_BASE_URL" == *"localhost"* ]]; then
    LOCAL_MODE=true
    GLM_MODE=false
    MODE_PREFIX="Local"
elif [[ "$ANTHROPIC_BASE_URL" == *"z.ai"* ]]; then
    LOCAL_MODE=false
    GLM_MODE=true
    MODE_PREFIX="GLM"
else
    LOCAL_MODE=false
    GLM_MODE=false
    MODE_PREFIX="Claude"
fi

The script checks `ANTHROPIC_AUTH_TOKEN` and `ANTHROPIC_BASE_URL` — the same variables your wrapper scripts set. When you launch claude normally, neither is set, so it shows “Claude”. When you launch through a wrapper, the variables are present and the status line adapts.
Displaying Context Window Usage
The JSON input from stdin includes context window data. Extract it to show how much of the context window you’ve consumed:
# Read JSON input from stdin
input=$(cat)

# Calculate context window percentage
usage=$(echo "$input" | jq '.context_window.current_usage')
context_pct=0
if [ "$usage" != "null" ]; then
    current=$(echo "$usage" | jq '.input_tokens + .cache_creation_input_tokens + .cache_read_input_tokens')
    size=$(echo "$input" | jq '.context_window.context_window_size')
    if [ "$size" -gt 0 ] 2>/dev/null; then
        context_pct=$((current * 100 / size))
    fi
fi

Fetching Rate Limit Usage
For Anthropic subscriptions, the OAuth API exposes your 5-hour and weekly usage. For other providers, you can hit their quota APIs. The key is to cache the results so you’re not hammering the API on every status line refresh:
# Cache API responses for 60 seconds
CACHE_TTL=60
NOW=$(date +%s)
CACHE_FILE="$HOME/.claude/usage-cache.json"

use_cache=false
if [ -f "$CACHE_FILE" ]; then
    cache_time=$(jq -r '.cached_at // 0' "$CACHE_FILE" 2>/dev/null)
    if [ $((NOW - cache_time)) -lt $CACHE_TTL ]; then
        use_cache=true
    fi
fi

if [ "$use_cache" = true ]; then
    five_hour_used_pct=$(jq -r '.five_hour_used_pct // "?"' "$CACHE_FILE")
    week_used_pct=$(jq -r '.week_used_pct // "?"' "$CACHE_FILE")
else
    # Fetch from Anthropic OAuth API
    CREDENTIALS_FILE="$HOME/.claude/.credentials.json"
    access_token=$(jq -r '.claudeAiOauth.accessToken // empty' "$CREDENTIALS_FILE" 2>/dev/null)
    if [ -n "$access_token" ]; then
        response=$(curl -s --max-time 2 \
            -H "Authorization: Bearer $access_token" \
            -H "Content-Type: application/json" \
            -H "anthropic-beta: oauth-2025-04-20" \
            "https://api.anthropic.com/api/oauth/usage" 2>/dev/null)
        if [ -n "$response" ] && echo "$response" | jq -e '.five_hour' >/dev/null 2>&1; then
            five_hour_used_pct=$(echo "$response" | jq -r '.five_hour.utilization // 0' | awk '{printf "%.0f", $1}')
            week_used_pct=$(echo "$response" | jq -r '.seven_day.utilization // 0' | awk '{printf "%.0f", $1}')
            # Cache the result (heredoc terminator must stay at column 0)
            cat > "$CACHE_FILE" << EOF
{
  "five_hour_used_pct": "$five_hour_used_pct",
  "week_used_pct": "$week_used_pct",
  "cached_at": $NOW
}
EOF
        fi
    fi
fi

For local Ollama mode, skip the API calls entirely — there’s no quota to track.
Color-Coding by Usage Level
Color the percentages based on how close you are to the limit — green when you have plenty of headroom, yellow when you’re getting close, red when you’re nearly out:
color_usage() {
    local pct=$1
    local label=$2
    if [ "$pct" = "?" ]; then
        echo -e "${DIM}${label}:?%${RESET}"
    elif [ "$pct" -lt 50 ]; then
        echo -e "${GREEN}${label}:${pct}%${RESET}"
    elif [ "$pct" -lt 80 ]; then
        echo -e "${YELLOW}${label}:${pct}%${RESET}"
    else
        echo -e "${RED}${label}:${pct}%${RESET}"
    fi
}

Assembling the Output
Build the final status line by joining the parts with a separator. Include the provider mode, usage percentages, context window, and optionally the current branch and repo name:
parts=()

# Mode prefix in cyan
parts+=("${CYAN}${MODE_PREFIX}${RESET}")

# Rate limit usage (skip for local mode)
if [ "$LOCAL_MODE" != true ]; then
    parts+=("$(color_usage "$five_hour_used_pct" "5h")")
fi

# Context window usage with color
if [ "$context_pct" -lt 50 ]; then
    parts+=("${CYAN}${context_pct}% ctx${RESET}")
elif [ "$context_pct" -lt 80 ]; then
    parts+=("${YELLOW}${context_pct}% ctx${RESET}")
else
    parts+=("${RED}${context_pct}% ctx${RESET}")
fi

# Branch in blue, repo in magenta
cwd=$(echo "$input" | jq -r '.workspace.current_dir // "."')  # workspace path from the stdin JSON
branch=$(git -C "$cwd" rev-parse --abbrev-ref HEAD 2>/dev/null)
repo=$(basename "$(git -C "$cwd" rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null)
[ -n "$branch" ] && parts+=("${BLUE}${branch}${RESET}")
[ -n "$repo" ] && parts+=("${MAGENTA}${repo}${RESET}")

# Join with dimmed separator
SEP="${DIM} | ${RESET}"
result=""
for i in "${!parts[@]}"; do
    if [ $i -eq 0 ]; then
        result="${parts[$i]}"
    else
        result="${result}${SEP}${parts[$i]}"
    fi
done
printf "%b" "$result"

The end result looks something like:
Claude | 5h:23% | wk:8% | 12% ctx | main | my-project

Or when using a third-party provider:
GLM | 5h:67% | 31% ctx | feature-branch | my-project

Or local:
Local | 5% ctx | main | my-project

At a glance you can see which provider you’re on, how much quota remains, and how full your context window is. The color coding means you don’t even need to read the numbers — green means go, red means start a new conversation or wait for your quota to reset.
Tips
- Don’t pollute your shell profile. This bears repeating: adding `ANTHROPIC_BASE_URL` to `.bashrc` is the number one mistake. It silently redirects every `claude` invocation to your third-party provider, including when you forget it’s set.
- Test with `claude --model` first. Before building a wrapper script, test the connection manually with `export` statements and `claude --model your-model`. Once it works, wrap it up.
- Increase `API_TIMEOUT_MS` for slow providers. The default timeout is fine for Anthropic’s API, but some providers (especially local Ollama on modest hardware) need more time.
- Use `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1` for local models. Always. It prevents Claude Code from trying to phone home when you’re running offline or against a local server.
- Built-in provider switches exist only for Bedrock and Vertex. For everything else, use the environment variable approach.
Wrapping Up
Claude Code’s provider flexibility means you’re not locked into a single API source. Use Bedrock or Vertex for enterprise deployments, OpenRouter or z.ai for model variety, and Ollama for local experimentation — all through the same familiar CLI.
The wrapper script pattern keeps everything clean: claude is always your primary Anthropic connection, and purpose-built scripts handle everything else. No environment variable conflicts, no accidental billing surprises.
If you missed the earlier guides in this series:
- Installing Claude Code
- Using the Claude Code TUI
- CLAUDE.md and Settings
- Skills, Agents, and MCP Servers
- Hooks
- Plugins and Marketplaces
- Agent Teams
- Connecting to Other Providers (you are here)