Llama 3.3 70B Instruct
Llama 3.3 70B Instruct has 11 API providers. The lowest input price is $0.1000/1M tokens from OpenRouter. Prices from 10 providers auto-refresh every 6 hours (updated 12m ago); 1 is manually maintained.
The official baseline is $0.5900/1M input tokens, while the lowest relay quote is $0.1000 (OpenRouter), 83% below official. Relay discounts come with different latency and SLA terms: latency-sensitive production traffic is safer on the official API, while batch and offline workloads benefit most from cheap relays.
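The 83% figure follows directly from the two quoted input prices. A minimal sketch of that arithmetic, where the function name `discount_vs_official` is a hypothetical helper and not part of any provider API:

```python
def discount_vs_official(official: float, relay: float) -> float:
    """Return the relay discount as a fraction of the official price."""
    if official <= 0:
        raise ValueError("official price must be positive")
    return (official - relay) / official


# Input prices in USD per 1M tokens, taken from the figures quoted above.
discount = discount_vs_official(official=0.59, relay=0.10)
print(f"{discount:.0%} below official")  # prints "83% below official"
```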
Llama 3.3 70B Instruct supports a 128K context window. At today's lowest rate, processing 1M input + 1M output tokens costs about $0.4200. The table below lists current per-provider quotes.
Frequently Asked Questions
How is Llama 3.3 70B Instruct API pricing calculated?
LLM APIs are billed per 1M input and output tokens separately. Official providers set the base price; relay providers typically offer 20–80% discounts.
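As a rough illustration of separate input and output billing, the sketch below estimates the cost of a workload from its token counts. The function `request_cost` and its parameter names are hypothetical. The $0.32 output price is an assumption inferred from this page's $0.4200 total for 1M input + 1M output tokens minus the $0.1000 input price; check the provider table for the actual output rate.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Estimate cost in USD when input and output tokens are billed separately."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Assumed rates: $0.10 input (quoted on this page), $0.32 output (inferred).
cost = request_cost(1_000_000, 1_000_000, 0.10, 0.32)
print(f"${cost:.4f}")  # prints "$0.4200"
```

The same function works for smaller jobs, for example `request_cost(250_000, 50_000, 0.10, 0.32)` for a batch with 250K input and 50K output tokens.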
What is the cheapest way to access Llama 3.3 70B Instruct API?
The lowest current input price is $0.1000 per 1M tokens from OpenRouter. Most prices refresh about every 6 hours, so bookmark this page for the latest rates.
Can I use Llama 3.3 70B Instruct API from China?
Providers marked 🇨🇳 in the table support China-mainland access. Check each provider's documentation for details.
How often is Llama 3.3 70B Instruct API pricing updated?
10 sources on this page are live-scraped every ~6 hours (Updated 12m ago); 1 is manually maintained and updated when official prices change.
How reliable is the Llama 3.3 70B Instruct API pricing data?
Live-scraped prices come directly from provider APIs and are generally accurate. Manually maintained prices are sourced from official pricing pages and periodically verified. For critical decisions, confirm the latest price via the provider's official link.
How can I tell if a price is live-scraped or manually maintained?
Each provider row shows a small label next to the name: a green "Live" badge means prices are fetched automatically on a schedule; a gray "Manual" badge means the price is curated by hand. A "Live" badge turns amber when its data has not been refreshed for 24 hours or more.
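The badge rules above can be sketched as a small function. This is only an illustration of the described behavior, not the site's own code; the names `badge_color` and `STALE_AFTER` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed threshold, taken from the "24+ hours" rule described above.
STALE_AFTER = timedelta(hours=24)


def badge_color(is_live: bool, last_refreshed: Optional[datetime], now: datetime) -> str:
    """Return the badge color for a provider row."""
    if not is_live:
        return "gray"  # manually maintained price
    if last_refreshed is None or now - last_refreshed >= STALE_AFTER:
        return "amber"  # live source, but the data is stale
    return "green"  # live source, refreshed recently


now = datetime.now(timezone.utc)
print(badge_color(True, now - timedelta(minutes=12), now))  # prints "green"
```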