Connecting to Other Providers
1. Installing Claude Code
2. Using the Claude Code TUI
3. CLAUDE.md and Settings
4. Skills, Agents, and MCP Servers
5. Hooks
6. Plugins and Marketplaces
7. Agent Teams
8. Connecting to Other Providers
Introduction
By overriding the base URL and auth token, you can point Claude Code at any provider that exposes an Anthropic-compatible endpoint. Many providers and API aggregators do this, and even local servers like Ollama can translate on the fly. This guide covers all the ways to connect Claude Code to third-party providers.
The key principle: keep your default claude command connected to your Anthropic subscription. Use wrapper scripts for everything else. This way you never accidentally burn through a third-party API budget when you meant to use your Max subscription, or vice versa.
If you haven’t read the earlier guides, start with Installing Claude Code.
The Environment Variables
Claude Code’s provider configuration is driven entirely by environment variables. Here are the ones that matter:
| Variable | Purpose |
|---|---|
| `ANTHROPIC_BASE_URL` | Override the API endpoint. Set this to point Claude Code at a different provider. |
| `ANTHROPIC_AUTH_TOKEN` | Bearer token sent in the `Authorization` header. Use this for providers that expect a bearer token. |
| `ANTHROPIC_API_KEY` | API key sent via the `x-api-key` header. This is what Anthropic’s own API uses. |
| `ANTHROPIC_MODEL` | Override the default model name. |
| `ANTHROPIC_DEFAULT_OPUS_MODEL` | Map Opus-tier requests to a specific model name. |
| `ANTHROPIC_DEFAULT_SONNET_MODEL` | Map Sonnet-tier requests to a specific model name. |
| `ANTHROPIC_DEFAULT_HAIKU_MODEL` | Map Haiku-tier requests to a specific model name. |
| `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC` | Set to `1` to disable telemetry and update checks. Required for local providers. |
| `API_TIMEOUT_MS` | Request timeout in milliseconds. Increase for slow providers. |
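As a minimal illustration of the override pattern (the provider URL and token below are placeholders, not real values):

```shell
# Point one shell session at a hypothetical Anthropic-compatible provider.
export ANTHROPIC_BASE_URL="https://api.example-provider.com/v1"   # placeholder URL
export ANTHROPIC_AUTH_TOKEN="your-token-here"                     # placeholder token

# Any claude launched from this shell now talks to that provider:
# claude
```

Because the variables are exported only in this shell session, every other terminal keeps using your Anthropic subscription.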
Warning: adding `ANTHROPIC_BASE_URL` or `ANTHROPIC_AUTH_TOKEN` to your `.bashrc` or `.zshrc` will override your Anthropic subscription for every `claude` invocation. Keep these variables scoped to wrapper scripts or per-project settings instead.

AWS Bedrock
Claude Code has built-in Bedrock support — no URL hacking required:
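A sketch of what that looks like in practice. The switch is the `CLAUDE_CODE_USE_BEDROCK` environment variable from Anthropic's docs; the model ID, region, and profile name below are example values you'd replace with your own:

```shell
# Enable Claude Code's built-in Bedrock mode
export CLAUDE_CODE_USE_BEDROCK=1
export ANTHROPIC_MODEL="us.anthropic.claude-sonnet-4-20250514-v1:0"  # example Bedrock model ID
export AWS_REGION="us-east-1"
export AWS_PROFILE="my-bedrock-profile"   # hypothetical named AWS CLI profile

# claude   # launches against Bedrock using your AWS credentials
```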
The relevant environment variables:
| Variable | Purpose |
|---|---|
| `ANTHROPIC_MODEL` | The Bedrock model ID (e.g. `us.anthropic.claude-sonnet-4-20250514-v1:0`) |
| `AWS_REGION` | AWS region (e.g. `us-east-1`) |
| `AWS_PROFILE` | Named AWS CLI profile to use for credentials |
Bedrock uses your existing AWS credentials (from ~/.aws/credentials, environment variables, or IAM roles), so there’s no separate API key to manage. Make sure your IAM user or role has bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream permissions.
Google Vertex AI
Also built-in:
| Variable | Purpose |
|---|---|
| `CLOUD_ML_REGION` | GCP region (e.g. `us-east5`) |
| `ANTHROPIC_VERTEX_PROJECT_ID` | Your GCP project ID |
Like Bedrock, Vertex uses your existing GCP credentials — authenticate with gcloud auth application-default login if you haven’t already.
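As with Bedrock, a wrapper can flip this on with a single variable (`CLAUDE_CODE_USE_VERTEX`, per Anthropic's docs); the project ID below is a placeholder:

```shell
# Enable Claude Code's built-in Vertex AI mode
export CLAUDE_CODE_USE_VERTEX=1
export CLOUD_ML_REGION="us-east5"
export ANTHROPIC_VERTEX_PROJECT_ID="my-gcp-project"   # placeholder project ID

# claude   # launches against Vertex with your application-default credentials
```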
Azure AI
Azure doesn’t have built-in provider support, but you can point Claude Code at your Azure AI endpoint using the base URL override:
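A sketch of the override. The exact endpoint path depends on your Azure deployment, so treat the URL shape below as an assumption to verify against Azure's documentation:

```shell
export ANTHROPIC_BASE_URL="https://your-resource.services.ai.azure.com"  # assumed URL shape; verify for your deployment
export ANTHROPIC_AUTH_TOKEN="your-azure-api-key"                         # placeholder key

# claude --model your-deployed-model
```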
Replace your-resource with your Azure AI Services resource name.
OpenRouter
OpenRouter aggregates many model providers behind a single API. Point Claude Code at it like this:
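Something like the following. The endpoint path here is my assumption (check OpenRouter's docs for their current Anthropic-compatible path), and the model name is just an example of their `provider/model` convention:

```shell
export ANTHROPIC_BASE_URL="https://openrouter.ai/api/v1"   # assumed path; confirm in OpenRouter's docs
export ANTHROPIC_AUTH_TOKEN="sk-or-your-key"               # placeholder OpenRouter key

# claude --model anthropic/claude-sonnet-4   # example provider/model name
```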
OpenRouter uses bearer token auth, so use ANTHROPIC_AUTH_TOKEN (not ANTHROPIC_API_KEY). The model names follow OpenRouter’s provider/model naming convention.
z.ai
Same pattern as OpenRouter — different base URL, different token:
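A sketch, assuming z.ai's Anthropic-compatible endpoint (verify the URL in their docs; the token is a placeholder):

```shell
export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"  # assumed endpoint; verify in z.ai's docs
export ANTHROPIC_AUTH_TOKEN="your-zai-key"                  # placeholder key
export API_TIMEOUT_MS="3000000"                             # 50 minutes, for long operations
```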
z.ai serves the GLM family of models. You can use the model mapping variables to map Claude’s model tiers to z.ai’s model names:
example
export ANTHROPIC_DEFAULT_OPUS_MODEL="GLM-5"
export ANTHROPIC_DEFAULT_SONNET_MODEL="GLM-4.7"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="GLM-4.5-Air"

With these mappings set, Claude Code’s model switcher (/model) and the --model flag work transparently — requesting “Opus” actually sends the request for GLM-5, and so on.
Setting `API_TIMEOUT_MS` to a higher value (the example above uses 3,000,000 ms / 50 minutes) prevents Claude Code from timing out on long operations.

Ollama (Local Models)
Ollama lets you run models locally. Claude Code can talk to it directly, since recent Ollama versions expose an Anthropic-compatible API:
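The connection settings mirror the wrapper script shown later in this guide; the model name is an example you'd replace with whatever you've pulled:

```shell
export ANTHROPIC_AUTH_TOKEN="ollama"                 # any non-empty string; Ollama ignores it
export ANTHROPIC_BASE_URL="http://localhost:11434"
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1    # don't phone home when running locally

# claude --model glm-4.7-flash   # must match a model you've pulled with `ollama pull`
```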
A few things to note:
- `ANTHROPIC_AUTH_TOKEN` needs a non-empty value, but Ollama doesn’t actually check it. Any string works.
- `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1` is required. Without it, Claude Code tries to contact Anthropic’s servers for telemetry and update checks, which will fail or leak information you don’t want leaking when running locally.
- `--model` must match the model name you’ve pulled in Ollama (`ollama pull glm-4.7-flash`).
The Wrapper Script Pattern
Here’s the approach I recommend: leave your claude command untouched (connected to your Anthropic subscription), and create small wrapper scripts for each provider.
Here’s a template based on the pattern I use:
~/.local/bin/my-provider
#!/bin/bash
# Launch Claude Code with a third-party provider

# Parse flags
LOCAL_MODE=""
ARGS=()
for arg in "$@"; do
    if [ "$arg" = "--local" ]; then
        LOCAL_MODE="local"
    else
        ARGS+=("$arg")
    fi
done

if [ "$LOCAL_MODE" = "local" ]; then
    # Local Ollama mode
    export ANTHROPIC_AUTH_TOKEN="ollama"
    export ANTHROPIC_BASE_URL="http://localhost:11434"
    export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
    exec claude --model your-local-model "${ARGS[@]}"
else
    # Remote API mode
    # Pull the API key from a secrets manager (1Password shown here)
    export ANTHROPIC_AUTH_TOKEN="$(op item get 'Your Item' --reveal --field 'API Key')"
    if [ -z "$ANTHROPIC_AUTH_TOKEN" ]; then
        echo "Error: Failed to retrieve API key"
        exit 1
    fi
    export ANTHROPIC_BASE_URL="https://api.your-provider.com/v1"
    export API_TIMEOUT_MS="3000000"
    # Optional: map model tiers to provider-specific names
    export ANTHROPIC_DEFAULT_OPUS_MODEL="provider-large"
    export ANTHROPIC_DEFAULT_SONNET_MODEL="provider-medium"
    export ANTHROPIC_DEFAULT_HAIKU_MODEL="provider-small"
    exec claude "${ARGS[@]}"
fi

Make it executable and drop it in your PATH:

chmod +x ~/.local/bin/my-provider
Now you have two commands:
- `claude` — your Anthropic subscription, untouched
- `my-provider` — your third-party provider
- `my-provider --local` — local Ollama
The key details:
- `exec claude` replaces the wrapper process with Claude Code, so you don’t leave a dangling shell process.
- Secrets stay out of the script. The example uses the 1Password CLI (`op`), but any secrets manager works — `pass`, Bitwarden CLI, AWS Secrets Manager, or even `gpg`-encrypted files. The point is: don’t hardcode API keys in the script.
- Flags control the mode. Adding `--local` switches to Ollama. You can add as many modes as you need — `--qwen3`, `--bedrock`, whatever makes sense for your setup.
- All extra arguments pass through. `"${ARGS[@]}"` forwards everything else to `claude`, so flags like `--model`, `--print`, `-p`, and `--allowedTools` all work normally.
Tip: you can define multiple local wrapper modes, e.g. `--local` for one model and `--qwen3-local` for another, each setting a different `--model` value.

Per-Project Provider Overrides
If a specific project always uses a particular provider, you can set the environment variables in that project’s settings.json instead of using a wrapper script:
.claude/settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.your-provider.com/v1",
    "ANTHROPIC_AUTH_TOKEN": "your-token",
    "API_TIMEOUT_MS": "3000000"
  }
}

This only affects Claude Code sessions started in that project directory. Your global claude command remains untouched everywhere else.
Unlike a wrapper script, settings.json can’t call out to a secrets manager. If you use this approach, be careful not to commit the file with real API keys. Add .claude/settings.json to your .gitignore, or use a .claude/settings.local.json file, which is local-only by convention.

Custom Status Lines for Multiple Providers
When you’re switching between providers, it’s useful to see at a glance which one you’re connected to and how much of your quota you’ve used. Claude Code’s statusLine setting supports custom shell commands that can detect the active provider and display relevant usage data.
Enabling the Status Line
Add a statusLine block to your settings.json:
~/.claude/settings.json
{
  "statusLine": {
    "type": "command",
    "command": "bash ~/.claude/statusline-command.sh"
  }
}

Claude Code runs this command periodically and displays the output at the bottom of the TUI. The script receives a JSON blob on stdin containing session data — context window usage, workspace info, and more.
Detecting the Active Provider
The wrapper script pattern makes provider detection straightforward. Since each wrapper sets different environment variables, your status line script can check those to determine the mode:
~/.claude/statusline-command.sh
#!/bin/bash
# ANSI color codes
RED='\033[31m'
GREEN='\033[32m'
YELLOW='\033[33m'
BLUE='\033[34m'
MAGENTA='\033[35m'
CYAN='\033[36m'
RESET='\033[0m'
DIM='\033[2m'
# Detect mode based on environment variables set by wrapper scripts
if [[ "$ANTHROPIC_AUTH_TOKEN" == "ollama"* ]] || [[ "$ANTHROPIC_BASE_URL" == *"localhost"* ]]; then
    LOCAL_MODE=true
    GLM_MODE=false
    MODE_PREFIX="Local"
elif [[ "$ANTHROPIC_BASE_URL" == *"z.ai"* ]]; then
    LOCAL_MODE=false
    GLM_MODE=true
    MODE_PREFIX="GLM"
else
    LOCAL_MODE=false
    GLM_MODE=false
    MODE_PREFIX="Claude"
fi

The script checks `ANTHROPIC_AUTH_TOKEN` and `ANTHROPIC_BASE_URL` — the same variables your wrapper scripts set. When you launch claude normally, neither is set, so it shows “Claude”. When you launch through a wrapper, the variables are present and the status line adapts.
Displaying Context Window Usage
The JSON input from stdin includes context window data. Extract it to show how much of the context window you’ve consumed:
# Read JSON input from stdin
input=$(cat)

# Calculate context window percentage
usage=$(echo "$input" | jq '.context_window.current_usage')
context_pct=0
if [ "$usage" != "null" ]; then
    current=$(echo "$usage" | jq '.input_tokens + .cache_creation_input_tokens + .cache_read_input_tokens')
    size=$(echo "$input" | jq '.context_window.context_window_size')
    if [ "$size" -gt 0 ] 2>/dev/null; then
        context_pct=$((current * 100 / size))
    fi
fi

Fetching Rate Limit Usage
For Anthropic subscriptions, the OAuth API exposes your 5-hour and weekly usage. For other providers, you can hit their quota APIs. The key is to cache the results so you’re not hammering the API on every status line refresh:
# Cache API responses for 60 seconds
CACHE_TTL=60
NOW=$(date +%s)
CACHE_FILE="$HOME/.claude/usage-cache.json"

use_cache=false
if [ -f "$CACHE_FILE" ]; then
    cache_time=$(jq -r '.cached_at // 0' "$CACHE_FILE" 2>/dev/null)
    if [ $((NOW - cache_time)) -lt $CACHE_TTL ]; then
        use_cache=true
    fi
fi

if [ "$use_cache" = true ]; then
    five_hour_used_pct=$(jq -r '.five_hour_used_pct // "?"' "$CACHE_FILE")
    week_used_pct=$(jq -r '.week_used_pct // "?"' "$CACHE_FILE")
else
    # Fetch from Anthropic OAuth API
    CREDENTIALS_FILE="$HOME/.claude/.credentials.json"
    access_token=$(jq -r '.claudeAiOauth.accessToken // empty' "$CREDENTIALS_FILE" 2>/dev/null)
    if [ -n "$access_token" ]; then
        response=$(curl -s --max-time 2 \
            -H "Authorization: Bearer $access_token" \
            -H "Content-Type: application/json" \
            -H "anthropic-beta: oauth-2025-04-20" \
            "https://api.anthropic.com/api/oauth/usage" 2>/dev/null)
        if [ -n "$response" ] && echo "$response" | jq -e '.five_hour' >/dev/null 2>&1; then
            five_hour_used_pct=$(echo "$response" | jq -r '.five_hour.utilization // 0' | awk '{printf "%.0f", $1}')
            week_used_pct=$(echo "$response" | jq -r '.seven_day.utilization // 0' | awk '{printf "%.0f", $1}')
            # Cache the result (heredoc terminator must stay at column 0)
            cat > "$CACHE_FILE" << EOF
{
  "five_hour_used_pct": "$five_hour_used_pct",
  "week_used_pct": "$week_used_pct",
  "cached_at": $NOW
}
EOF
        fi
    fi
fi

For local Ollama mode, skip the API calls entirely — there’s no quota to track.
Color-Coding by Usage Level
Color the percentages based on how close you are to the limit — green when you have plenty of headroom, yellow when you’re getting close, red when you’re nearly out:
color_usage() {
    local pct=$1
    local label=$2
    if [ "$pct" = "?" ]; then
        echo -e "${DIM}${label}:?%${RESET}"
    elif [ "$pct" -lt 50 ]; then
        echo -e "${GREEN}${label}:${pct}%${RESET}"
    elif [ "$pct" -lt 80 ]; then
        echo -e "${YELLOW}${label}:${pct}%${RESET}"
    else
        echo -e "${RED}${label}:${pct}%${RESET}"
    fi
}

Assembling the Output
Build the final status line by joining the parts with a separator. Include the provider mode, usage percentages, context window, and optionally the current branch and repo name:
parts=()

# Mode prefix in cyan
parts+=("${CYAN}${MODE_PREFIX}${RESET}")

# Rate limit usage (skip for local mode)
if [ "$LOCAL_MODE" != true ]; then
    parts+=("$(color_usage "$five_hour_used_pct" "5h")")
fi

# Context window usage with color
if [ "$context_pct" -lt 50 ]; then
    parts+=("${CYAN}${context_pct}% ctx${RESET}")
elif [ "$context_pct" -lt 80 ]; then
    parts+=("${YELLOW}${context_pct}% ctx${RESET}")
else
    parts+=("${RED}${context_pct}% ctx${RESET}")
fi

# Branch in blue, repo in magenta
cwd=$(echo "$input" | jq -r '.workspace.current_dir // "."')  # workspace path from the stdin JSON
branch=$(git -C "$cwd" rev-parse --abbrev-ref HEAD 2>/dev/null)
repo=$(basename "$(git -C "$cwd" rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null)
[ -n "$branch" ] && parts+=("${BLUE}${branch}${RESET}")
[ -n "$repo" ] && parts+=("${MAGENTA}${repo}${RESET}")

# Join with dimmed separator
SEP="${DIM} | ${RESET}"
result=""
for i in "${!parts[@]}"; do
    if [ $i -eq 0 ]; then
        result="${parts[$i]}"
    else
        result="${result}${SEP}${parts[$i]}"
    fi
done
printf "%b" "$result"

The end result looks something like:
Claude | 5h:23% | wk:8% | 12% ctx | main | my-project

Or when using a third-party provider:
GLM | 5h:67% | 31% ctx | feature-branch | my-project

Or local:
Local | 5% ctx | main | my-project

At a glance you can see which provider you’re on, how much quota remains, and how full your context window is. The color coding means you don’t even need to read the numbers — green means go, red means start a new conversation or wait for your quota to reset.
Tips
- Don’t pollute your shell profile. This bears repeating: adding `ANTHROPIC_BASE_URL` to `.bashrc` is the number one mistake. It silently redirects every `claude` invocation to your third-party provider, including when you forget it’s set.
- Test with `claude --model` first. Before building a wrapper script, test the connection manually with `export` statements and `claude --model your-model`. Once it works, wrap it up.
- Increase `API_TIMEOUT_MS` for slow providers. The default timeout is fine for Anthropic’s API, but some providers (especially local Ollama on modest hardware) need more time.
- Use `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1` for local models. Always. It prevents Claude Code from trying to phone home when you’re running offline or against a local server.
- Built-in provider switches exist only for Bedrock and Vertex. For everything else, use the environment variable approach.
Wrapping Up
Claude Code’s provider flexibility means you’re not locked into a single API source. Use Bedrock or Vertex for enterprise deployments, OpenRouter or z.ai for model variety, and Ollama for local experimentation — all through the same familiar CLI.
The wrapper script pattern keeps everything clean: claude is always your primary Anthropic connection, and purpose-built scripts handle everything else. No environment variable conflicts, no accidental billing surprises.
If you missed the earlier guides in this series:
- Installing Claude Code
- Using the Claude Code TUI
- CLAUDE.md and Settings
- Skills, Agents, and MCP Servers
- Hooks
- Plugins and Marketplaces
- Agent Teams
- Connecting to Other Providers (you are here)