Simple pay-as-you-go pricing
Pay only for what you use. No subscriptions, no commitments. Choose from our curated collection of production-ready models.
| Model | Description | Category | Input price (per 1M tokens) | Output price (per 1M tokens) |
|---|---|---|---|---|
| LLaMA 3.2 90B Vision | Meta's cutting-edge multimodal model combining vision and language understanding with exceptional reasoning capabilities. | multimodal | $2.00 | $4.00 |
| LLaMA 3.1 405B | Meta's largest and most capable open-source model with exceptional reasoning capabilities and extended context length. | language | $1.50 | $3.00 |
| LLaMA 3.1 70B | High-performance model with superior reasoning and 128K context window, perfect for enterprise deployments. | language | $1.00 | $2.00 |
| LLaMA 3.1 8B | Meta's latest language model with improved reasoning capabilities and multilingual support. Excellent for general-purpose applications. | language | $0.50 | $1.00 |
| DeepSeek-V2.5 Coder | DeepSeek's most advanced coding model with mixture of experts architecture, trained on massive code datasets. | code | $1.50 | $3.00 |
| Qwen2.5 72B Instruct | Alibaba's flagship model with exceptional multilingual capabilities and advanced reasoning across 29+ languages. | language | $1.00 | $2.00 |
| Mixtral 8x22B Instruct | Mistral's largest Mixture of Experts model with exceptional performance across diverse tasks and function calling. | language | $1.50 | $3.00 |
| Phi-3.5 Mini Instruct | Microsoft's latest compact model with impressive reasoning capabilities, perfect for edge deployment and mobile applications. | language | $0.50 | $1.00 |
| DeepSeek Coder V2 | Advanced coding model with exceptional performance on programming tasks. | code | $1.00 | $2.00 |
| LLaMA 3.1 | Meta's open-source LLaMA 3.1 with improved reasoning and instruction following. | llama | $0.50 | $1.00 |
| Mistral 7B | Mistral 7B: strong accuracy for instruction tasks and low-latency inference. | mistral | $0.50 | $1.00 |
How it works
Choose your model, send your requests, and pay only for the tokens you use.
Input tokens are charged at a lower rate than output tokens: for every model listed above, the output rate is twice the input rate.
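To see how per-token billing adds up, here is a minimal sketch of a cost estimate in Python. It assumes the bill is simply input tokens times the input rate plus output tokens times the output rate, with no rounding rules, minimums, or other fees. The `RATES` dictionary and the `estimate_cost` function are hypothetical names used for illustration only and are not part of any API; the two rates shown are copied from the table above.

```python
from decimal import Decimal

# Prices in the table are quoted per 1M tokens.
TOKENS_PER_UNIT = 1_000_000

# (input rate, output rate) in USD per 1M tokens, copied from the pricing table.
# The keys are illustrative labels, not official model identifiers.
RATES = {
    "LLaMA 3.1 70B": (Decimal("1.00"), Decimal("2.00")),
    "LLaMA 3.1 8B": (Decimal("0.50"), Decimal("1.00")),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of a request, assuming simple linear billing."""
    input_rate, output_rate = RATES[model]
    total = input_rate * input_tokens + output_rate * output_tokens
    return total / TOKENS_PER_UNIT


# Example: 250,000 input tokens and 50,000 output tokens on LLaMA 3.1 70B.
cost = estimate_cost("LLaMA 3.1 70B", 250_000, 50_000)
print(f"${cost:.2f}")  # prints $0.35
```

In this example the input side costs $0.25 and the output side costs $0.10. `Decimal` is used instead of floats so that small currency amounts add up exactly.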