Meta · Open-weight · 131K context · 70B params · Llama 3 Community License

Llama 3.3 70B Instruct (meta/llama-3-3-70b-instruct)

Meta's Llama 3.3 70B Instruct matches or exceeds Llama 3.1 405B on several benchmarks at roughly one-sixth the parameter count (70B vs. 405B). It is one of the most widely hosted open-weight models, offered by major inference platforms at competitive per-token prices.

Cheapest blended: $0.88 / 1M tokens on Together.ai · 1 provider listed

Pricing across providers

Provider	Model ID	Input /1M	Output /1M	Blended /1M	Latency p50	Format	Freshness
Together.ai	meta-llama/Llama-3.3-70B-Instruct-Turbo	$0.88	$0.88	$0.88	190ms	OpenAI-compatible	Verified 3d ago
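
To turn the per-1M-token prices in the table into a per-request figure, a minimal sketch follows. The function name `estimate_cost_usd` and the example token counts are illustrative assumptions; the default prices are the Together.ai figures listed above.

```python
def estimate_cost_usd(input_tokens, output_tokens,
                      input_price_per_m=0.88, output_price_per_m=0.88):
    """Estimate one request's cost in USD from per-1M-token prices."""
    total = (input_tokens * input_price_per_m
             + output_tokens * output_price_per_m)
    return total / 1_000_000

# Hypothetical workload: 12,000 prompt tokens and 800 completion tokens.
cost = estimate_cost_usd(12_000, 800)
print(f"${cost:.6f}")
```

Because input and output are priced the same here, the blended price equals both; for providers with different input and output prices, the result depends on the input/output mix of your workload.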

Affiliate disclosure: We may earn a commission from qualified signups. Pricing independence is enforced at the data layer — see our Editorial Independence Policy.

Works with

Any client that speaks one of this model's endpoint protocols (OPENAI_COMPATIBLE) can be pointed at a provider's base URL.
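
For clients without an SDK, the sketch below shows roughly what an OpenAI-compatible chat request looks like on the wire, using only the Python standard library. The helper name `build_chat_request` is hypothetical; the base URL and model name are the Together.ai values used elsewhere on this page, and the request is built but not sent.

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, model, messages):
    """Build (but do not send) an OpenAI-style chat completions request."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "https://api.together.xyz/v1",
    "YOUR_API_KEY",  # placeholder, replace with a real key
    "meta-llama/Llama-3.3-70B-Instruct-Turbo",
    [{"role": "user", "content": "Hello!"}],
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), then parse the JSON response body.
```

Switching providers should only require changing the base URL, API key, and model name, provided the provider implements the same protocol.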

Capabilities

chat
coding
reasoning
multilingual
long_context

Languages: en, de, fr, it, pt, hi, es, th

Benchmarks

HumanEval (0-shot, pass@1; official): 88.4%
MMLU (5-shot; official): 86.8%
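
For readers unfamiliar with pass@1: pass@k is the probability that at least one of k sampled completions passes a problem's unit tests. The sketch below uses the unbiased estimator commonly applied to HumanEval; whether the official figure above was computed this way is not stated on this page, and the sample counts are hypothetical.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate given n samples per problem, c of them correct."""
    if n - c < k:
        # Fewer than k incorrect samples: every draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical problem: 10 samples generated, 3 pass the tests.
score = pass_at_k(n=10, c=3, k=1)
print(round(score, 3))
```

For k = 1 the estimate reduces to c / n, the fraction of samples that pass; the benchmark score is the average over all problems.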

Code samples

Example using Together.ai, the cheapest listed provider for this model as of the last verification. To use a different provider from the pricing table above, swap base_url and model.

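from openai import OpenAI

# Point the standard OpenAI SDK client at Together.ai's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",  # replace with your provider API key
    base_url="https://api.together.xyz/v1",
)

# Send a single-turn chat request using the provider's model name.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Print the text of the first returned choice.
print(response.choices[0].message.content)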

Technical specs

Context: 131K
Max output: 4K
Parameters: 70B
Release: 2024-12-06
Training cutoff: 2023-12-01
License: Llama 3 Community License

Similar models

Qwen3 72B Instruct · Open-weight
Alibaba · 72B params · 131K ctx

Compare with

Llama 3.3 70B Instruct vs Qwen3 72B Instruct
Comparison planned — not yet published

Frequently asked

How much does Llama 3.3 70B Instruct cost?

The cheapest public hosting is $0.88 per 1M blended tokens on Together.ai. One provider is listed above, with input, output, and blended prices per 1M tokens.

Is Llama 3.3 70B Instruct open-source? Can I fine-tune it?

Yes. Llama 3.3 70B Instruct is open-weight under the Llama 3 Community License. Weights are available on Hugging Face for local inference, fine-tuning, and commercial use, subject to the specific terms of the license.

Is Llama 3.3 70B Instruct OpenAI-compatible?

The listed provider exposes an OpenAI-compatible API, so you can point an existing openai SDK client at the provider's base_url and use the provider's model name. See the code samples above for a copy-pasteable example.

What's the maximum context window for Llama 3.3 70B Instruct?

The model supports up to 131,072 tokens of context (input + output). Some hosted versions may impose a smaller limit, so check each provider's documentation before relying on the full window.
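
Because the window covers input and output together, a long prompt reduces the room left for the completion. The sketch below computes a safe completion budget; the helper name is hypothetical, and it assumes the "4K" max output in the technical specs means 4,096 tokens.

```python
CONTEXT_WINDOW = 131_072  # tokens, input + output combined
MAX_OUTPUT = 4_096        # assumption: "4K" max output means 4,096 tokens

def max_completion_tokens(prompt_tokens,
                          context_window=CONTEXT_WINDOW,
                          max_output=MAX_OUTPUT):
    """Return the largest completion budget that still fits the window."""
    remaining = context_window - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt alone fills or exceeds the context window")
    return min(remaining, max_output)

print(max_completion_tokens(1_000))    # short prompt: capped by max output
print(max_completion_tokens(130_000))  # long prompt: capped by the window
```

If a provider enforces a smaller window, pass its limit as `context_window` instead of the model's native value.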