Lunar Platform
Lunar offers a complete set of tools for developers who want to distill, route, and deploy AI models. Our platform provides:

Lunar SDK
Python & TypeScript SDK for LLM inference with intelligent routing, fallbacks, cost tracking, and built-in evaluations.
GPU Instances
Deploy and run open-source models on dedicated NVIDIA GPUs, from L4 to H200.
Platform Features
Lunar SDK: OpenAI-Compatible LLM Access
Access 12+ LLM providers through a single API. Features smart routing with AI task classification, automatic fallbacks, per-request cost tracking, and a comprehensive evaluation framework with 15+ built-in scorers.
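The automatic-fallback behavior can be sketched in a few lines: try providers in priority order and return the first successful response. This is a minimal illustration of the pattern, not the actual Lunar SDK API; the function names and provider callables below are hypothetical stand-ins.

```python
def complete_with_fallback(prompt, providers):
    """Try each (name, callable) provider in order; return the first success.

    A production router would also classify the task, filter for
    retryable errors, and record per-request cost.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            errors.append((name, exc))
    raise RuntimeError(f"all providers failed: {errors}")


# Stand-in providers for illustration: the first one always fails,
# so the router falls back to the second.
def flaky_provider(prompt):
    raise TimeoutError("upstream timeout")


def backup_provider(prompt):
    return f"echo: {prompt}"


name, reply = complete_with_fallback(
    "hello", [("primary", flaky_provider), ("backup", backup_provider)]
)
# name is "backup", reply is "echo: hello"
```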
GPU Instances: Deploy Any Model
Run open-source models like LLaMA, Qwen, DeepSeek, and more on dedicated GPU instances. Choose from 6 tiers ranging from NVIDIA L4 (24GB) to H200 clusters (1128GB).
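A quick back-of-the-envelope calculation shows why tier sizing tracks model precision: weight memory is roughly parameter count times bits per weight. The sketch below is illustrative arithmetic (it ignores KV cache and activation overhead, which is why real deployments want the headroom the tiers provide).

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Approximate GPU memory for model weights alone, in GB.

    Excludes KV cache and activation memory, which add real overhead
    on top of this figure.
    """
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9


fp16 = weight_memory_gb(70, 16)  # 140 GB: needs a 192 GB tier for 70B FP16
int4 = weight_memory_gb(70, 4)   # 35 GB: a 96 GB tier fits 70B INT4
```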
Get Started
- Python
- TypeScript
- REST API
- GPU Instances
- CLI
Pricing
| Tier | GPU | VRAM | Price | Best For |
|---|---|---|---|---|
| XS | 1x L4 | 24GB | ~$0.20/h | 7B-13B models |
| S | 1x L40S | 48GB | ~$0.60/h | 13B-34B models |
| M | 4x A10G | 96GB | ~$1.80/h | 70B INT4 |
| L | 4x L40S | 192GB | ~$3.50/h | 70B FP16 |
| XL | 8x A100 | 320-640GB | ~$12/h | 180B models |
| XXL | 8x H100/H200 | 640-1128GB | ~$20-30/h | 405B models |
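For budgeting, the hourly rates above translate directly into a monthly estimate. A minimal sketch, using the approximate prices from the table (the dictionary and function here are illustrative, not part of the SDK):

```python
# Approximate hourly rates from the pricing table above (USD).
TIER_HOURLY = {"XS": 0.20, "S": 0.60, "M": 1.80, "L": 3.50, "XL": 12.00}


def monthly_cost(tier, hours_per_day=24, days=30):
    """Estimated monthly cost in USD for an instance of the given tier."""
    return TIER_HOURLY[tier] * hours_per_day * days


# An always-on L tier instance: 3.50 * 24 * 30 = 2520 USD/month.
always_on_l = monthly_cost("L")
```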
Full Pricing Details
View complete pricing information