Llama 3.3 70B by Cerebras

Among the fastest free inference options by tokens per second, served on Cerebras wafer-scale hardware through an OpenAI-compatible API.

Access: Free API tier
Free limits: ~14k tokens/min
Modality: text
Credit card: Not required
Commercial use: Allowed
Model ID: llama-3.3-70b
Base URL: https://api.cerebras.ai/v1
Last verified: June 2026

How to use Llama 3.3 70B

Quickstart

curl https://api.cerebras.ai/v1/chat/completions \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.3-70b","messages":[{"role":"user","content":"hi"}]}'
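The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_request` and `chat` are hypothetical, and the response parsing assumes the OpenAI-style shape (`choices[0].message.content`) that the "OpenAI-compatible" label suggests.

```python
import json
import urllib.request

BASE_URL = "https://api.cerebras.ai/v1"  # base URL from the listing above
MODEL_ID = "llama-3.3-70b"               # model ID from the listing above


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions POST request."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(api_key: str, prompt: str, timeout: float = 30.0) -> str:
    """Send the request and return the text of the first reply.

    Assumption: the response follows the OpenAI chat-completions shape,
    i.e. the reply text lives at choices[0].message.content.
    """
    request = build_request(api_key, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `chat(os.environ["KEY"], "hi")` (after `import os`) would send the same message as the curl command. Building the request separately from sending it keeps the headers and payload easy to inspect without touching the network.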

Frequently asked

Is Llama 3.3 70B free?

Yes. Cerebras offers it on a free API tier, limited to roughly 14k tokens per minute. No credit card is required.
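To stay under a per-minute token limit, a client can track its own usage before sending each request. The sketch below is one possible approach, assuming the limit behaves like a 60-second sliding window; how Cerebras actually enforces the limit is not stated here. The `TokenBudget` class and its methods are hypothetical names, and the caller has to supply its own token estimates.

```python
import time
from collections import deque


class TokenBudget:
    """Client-side sliding-window token budget (assumed 60 s window)."""

    WINDOW_SECONDS = 60.0

    def __init__(self, tokens_per_minute: int = 14_000, clock=time.monotonic):
        self.limit = tokens_per_minute
        self.clock = clock        # injectable so the logic can be tested
        self.events = deque()     # (timestamp, tokens) pairs, oldest first

    def _used(self, now: float) -> int:
        # Drop events that have aged out of the window, then sum the rest.
        while self.events and now - self.events[0][0] >= self.WINDOW_SECONDS:
            self.events.popleft()
        return sum(tokens for _, tokens in self.events)

    def wait_time(self, tokens: int) -> float:
        """Seconds to wait until `tokens` more would fit in the window."""
        if tokens > self.limit:
            raise ValueError("request exceeds the whole per-minute budget")
        now = self.clock()
        used = self._used(now)
        if used + tokens <= self.limit:
            return 0.0
        # Find the oldest events whose expiry frees enough budget.
        freed = 0
        for timestamp, spent in self.events:
            freed += spent
            if used - freed + tokens <= self.limit:
                return max(0.0, timestamp + self.WINDOW_SECONDS - now)
        return self.WINDOW_SECONDS  # not reached when tokens <= limit

    def record(self, tokens: int) -> None:
        """Register tokens that were just spent on a request."""
        self.events.append((self.clock(), tokens))
```

A caller would sleep for `budget.wait_time(estimate)` seconds before each request and call `budget.record(actual)` afterwards, using the token count it observed for that request.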

Can I use Llama 3.3 70B commercially?

Yes, commercial use is allowed. Verify the current license or terms before shipping.

How do I start using Llama 3.3 70B?

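Get a Cerebras API key, then send a POST request to https://api.cerebras.ai/v1/chat/completions with the model ID llama-3.3-70b, as in the Quickstart. Because the API is OpenAI-compatible, existing OpenAI client libraries can typically be pointed at the base URL instead.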
