Available

@cf/meta/llama-3.3-70b-instruct-fp8-fast

@cf/meta/llama-3.3-70b-instruct-fp8-fast — free model from Cloudflare Workers AI.

@cf/meta/llama-3.3-70b-instruct-fp8-fast — Free API Specifications

Context 131K
Max Output 131K
Modality text
Rate Limit 10K neurons/day (shared)
Card Required No
OpenAI Compatible Yes

How to Configure @cf/meta/llama-3.3-70b-instruct-fp8-fast for Free

Base URL https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run
How to get an API key Get API Key →

One-Click Config for Claude Code, Cursor & More

Claude Code

# Claude Code works via OpenRouter's Anthropic-compatible API.
# Note: Only paid Anthropic Claude models are supported (e.g. claude-sonnet-4.6, claude-opus-4).
# Browse available Claude models at: https://openrouter.ai/models?q=anthropic

# Add to ~/.zshrc or ~/.bashrc
export OPENROUTER_API_KEY="<your-openrouter-api-key>"  # Get at https://openrouter.ai/settings/keys
export ANTHROPIC_BASE_URL="https://openrouter.ai/api"
export ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY"
export ANTHROPIC_API_KEY=""  # Must be explicitly empty to avoid conflicts

# Optional: pin specific models for each role
# export ANTHROPIC_DEFAULT_SONNET_MODEL="anthropic/claude-sonnet-4.6"
# export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4.5"

# Then simply run: claude

Cursor

# Cursor → Settings (⚙️) → Models → Add Model
# Enter the model name exactly as shown, then fill in:
#   Override OpenAI Base URL: https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run
#   OpenAI API Key: <your-api-key>   # Get at https://dash.cloudflare.com/profile/api-tokens
# Click "Verify" to confirm the connection, then enable the model.
#
# Model name to add: @cf/meta/llama-3.3-70b-instruct-fp8-fast

Codex

# Add to ~/.zshrc or ~/.bashrc
export OPENAI_BASE_URL="https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run"
export OPENAI_API_KEY="<your-api-key>"  # Get at https://dash.cloudflare.com/profile/api-tokens

# Then run:
codex --model "@cf/meta/llama-3.3-70b-instruct-fp8-fast"

Gemini CLI

# ~/.gemini/settings.json
{
  "apiKey": "<your-api-key>",
  "model": "@cf/meta/llama-3.3-70b-instruct-fp8-fast"
}
# Get API key at https://dash.cloudflare.com/profile/api-tokens

OpenCode

// ~/.config/opencode/opencode.json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "free-llm": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Free LLM",
      "options": {
        "baseURL": "https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run",
        "apiKey": "<your-api-key>"
      },
      "models": {
        "@cf/meta/llama-3.3-70b-instruct-fp8-fast": { "name": "@cf/meta/llama-3.3-70b-instruct-fp8-fast" }
      }
    }
  }
}
// Get API key at https://dash.cloudflare.com/profile/api-tokens

Hermes

# Step 1 — Edit config.yaml
# Windows: C:\Users\<you>\AppData\Local\hermes\config.yaml
# macOS/Linux: ~/.config/hermes/config.yaml

model:
  default: @cf/meta/llama-3.3-70b-instruct-fp8-fast
  provider: custom
  base_url: ${CUSTOM_BASE_URL}
  api_key: ${CUSTOM_API_KEY}
  model_aliases:
    @cf/meta/llama-3.3-70b-instruct-fp8-fast:
      model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast"
      provider: "custom"

# Step 2 — Edit .env (same directory as config.yaml)
# Windows: C:\Users\<you>\AppData\Local\hermes\.env
# macOS/Linux: ~/.config/hermes/.env

# ========================
# Custom API (OpenAI-compatible)
# ========================
CUSTOM_API_KEY=<your-api-key>        # Get at https://dash.cloudflare.com/profile/api-tokens
CUSTOM_BASE_URL=https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run

OpenClaw

// ~/.openclaw/openclaw.json  (JSON5 format)
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
      },
    },
  },
  "models": {
    "providers": {
      // Option A — Built-in provider (OpenAI, Anthropic, Google…)
      // Just add apiKey; OpenClaw handles the baseUrl automatically
      // "openai": { "apiKey": "<your-api-key>" },

      // Option B — Custom OpenAI-compatible base URL (e.g. OpenRouter, NVIDIA)
      "free-llm": {
        "baseUrl": "https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run",
        "apiKey": "<your-api-key>",  // Get at https://dash.cloudflare.com/profile/api-tokens
        "api": "openai-completions", // openai-completions | anthropic-messages | …
        "models": [
          { "id": "@cf/meta/llama-3.3-70b-instruct-fp8-fast", "name": "@cf/meta/llama-3.3-70b-instruct-fp8-fast" },
        ],
      },
    },
  },
}
// Apply: openclaw gateway restart
// Verify: openclaw doctor --fix

Frequently Asked Questions about @cf/meta/llama-3.3-70b-instruct-fp8-fast

Is @cf/meta/llama-3.3-70b-instruct-fp8-fast free to use?

Yes. @cf/meta/llama-3.3-70b-instruct-fp8-fast is available on a permanently free tier via Cloudflare Workers AI. No credit card is required — simply sign up and get your API key. The free tier includes a rate limit of 10K neurons/day (shared).

What is @cf/meta/llama-3.3-70b-instruct-fp8-fast best for?

@cf/meta/llama-3.3-70b-instruct-fp8-fast is optimized for chat tasks. It supports text modalities, with a context window of 131K tokens and a maximum output of 131K tokens. @cf/meta/llama-3.3-70b-instruct-fp8-fast — free model from Cloudflare Workers AI.

Is @cf/meta/llama-3.3-70b-instruct-fp8-fast OpenAI-compatible?

Yes. @cf/meta/llama-3.3-70b-instruct-fp8-fast uses an OpenAI-compatible API endpoint at https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run. You can use it with the OpenAI Python/JS SDK, or any tool that accepts a custom baseURL — including Claude Code (cc), Cursor, Codex, and OpenCode.

How do I get an API key for @cf/meta/llama-3.3-70b-instruct-fp8-fast?

Visit Cloudflare Workers AI's API key page to register and generate a free API key. Once you have the key, use the configuration snippets above to set up Claude Code, Cursor, or your preferred AI coding tool.