Groq Llama 3.3 70B

Cheap, Popular

Llama 3.3 70B running on Groq LPUs, with very fast inference (>300 tok/s) and low pricing.

Context Window: 128K

Max Output: (not listed)

Speed: Fast

Quality: 3/5

Pricing

Type	Original price ($)	Futrix API (₫)
Input / 1M tokens	$0.59	20,095 ₫
Output / 1M tokens	$0.79	26,907 ₫

Included: automatic smart routing; cache that saves 35-60%; 1 API key for 99+ models; payment in VND, USD, or crypto.

* Futrix API pricing is about 30% higher than the original price, but it includes smart routing, a semantic cache (real-world savings of 35-60%), automatic fallback, and payment in VND.
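To make the per-token rates concrete, here is a minimal sketch that estimates the charge for a single request from the listed Futrix API prices. The function name `estimate_cost_vnd` and the `cache_savings` parameter are illustrative and not part of any Futrix SDK. Applying the 35-60% savings figure as a flat discount on the whole request is an assumption made for the example; the page does not say how the savings are calculated.

```python
from decimal import Decimal

# Listed Futrix API rates in VND per 1M tokens, taken from the pricing table.
INPUT_VND_PER_M = Decimal("20095")
OUTPUT_VND_PER_M = Decimal("26907")


def estimate_cost_vnd(
    input_tokens: int,
    output_tokens: int,
    cache_savings: Decimal = Decimal("0"),
) -> Decimal:
    """Estimate the charge in VND for one request.

    cache_savings is a fraction between 0 and 1. Treating it as a flat
    discount on the total is an assumption for illustration only.
    """
    if not Decimal("0") <= cache_savings <= Decimal("1"):
        raise ValueError("cache_savings must be between 0 and 1")
    raw = (
        INPUT_VND_PER_M * input_tokens + OUTPUT_VND_PER_M * output_tokens
    ) / Decimal(1_000_000)
    # Round to two decimal places for display.
    return (raw * (Decimal("1") - cache_savings)).quantize(Decimal("0.01"))


if __name__ == "__main__":
    # 10,000 prompt tokens and 2,000 completion tokens, no cache savings.
    print(estimate_cost_vnd(10_000, 2_000))  # 254.76
```

Using `Decimal` rather than floats keeps the arithmetic exact, which matters once many small per-request amounts are summed.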

Supported features

Chat, Function Calling, JSON Mode, Streaming
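As a rough illustration of JSON Mode and Function Calling, the sketch below builds request arguments in the shape the OpenAI Python SDK accepts (`response_format` and `tools`). It assumes the Futrix endpoint passes these parameters through unchanged, which this page does not confirm. The `get_weather` tool and all helper names are hypothetical.

```python
import json

MODEL = "groq-llama-3.3-70b"

# Hypothetical tool definition, used only to show the schema shape.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def json_mode_request(prompt: str) -> dict:
    """Build keyword arguments for a JSON Mode chat completion."""
    return {
        "model": MODEL,
        "messages": [
            # OpenAI-style JSON Mode expects the prompt to mention JSON.
            {"role": "system", "content": "Reply with a single JSON object."},
            {"role": "user", "content": prompt},
        ],
        "response_format": {"type": "json_object"},
    }


def tool_call_request(prompt: str) -> dict:
    """Build keyword arguments that offer the model one callable tool."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [WEATHER_TOOL],
    }


def parse_tool_arguments(raw_arguments: str) -> dict:
    """Tool call arguments arrive as a JSON string; decode them to a dict."""
    return json.loads(raw_arguments)
```

With an OpenAI SDK client pointed at the Futrix base URL, either dictionary can be passed as `client.chat.completions.create(**json_mode_request("List three fruits"))`. If the model decides to call the tool, its arguments are found in `response.choices[0].message.tool_calls[0].function.arguments` and can be decoded with `parse_tool_arguments`.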

Quick Start

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.futrixapi.com/v1",
    api_key="sk-ftx-your-key"
)

response = client.chat.completions.create(
    model="groq-llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
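Streaming is listed as a supported feature, so the same request can presumably be sent with `stream=True`. The sketch below is one way to consume the stream with the OpenAI Python SDK. It assumes the Futrix endpoint emits OpenAI-style chunks; `iter_text` and `stream_reply` are illustrative helper names.

```python
from typing import Iterable, Iterator


def iter_text(chunks: Iterable) -> Iterator[str]:
    """Yield the text fragments from a stream of chat completion chunks."""
    for chunk in chunks:
        # Some chunks carry no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content


def stream_reply(prompt: str, api_key: str = "sk-ftx-your-key") -> str:
    """Print the reply as it arrives and return the full text."""
    # Imported here so iter_text can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(base_url="https://api.futrixapi.com/v1", api_key=api_key)
    stream = client.chat.completions.create(
        model="groq-llama-3.3-70b",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    parts = []
    for text in iter_text(stream):
        print(text, end="", flush=True)
        parts.append(text)
    print()
    return "".join(parts)
```

Calling `stream_reply("Hello!")` prints tokens as they arrive, which is where the high tokens-per-second rate of this model is most noticeable.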

cURL

curl -X POST https://api.futrixapi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-ftx-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "groq-llama-3.3-70b", "messages": [{"role": "user", "content": "Hello!"}]}'
