# Quickstart

This guide walks you through making your first API call with the Lunar SDK.
## 1. Install the SDK
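The package names below (`lunar` on both PyPI and npm) are inferred from the import statements used throughout this guide and may differ for your distribution:

```shell
# Python
pip install lunar

# TypeScript / JavaScript
npm install lunar
```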
## 2. Set Your API Key

Set the `LUNAR_API_KEY` environment variable:
```shell
export LUNAR_API_KEY="your-api-key"
```
Or pass it directly to the client:
**Python**

```python
from lunar import Lunar

client = Lunar(api_key="your-api-key")
```
**TypeScript**

```typescript
import { Lunar } from "lunar";

const client = new Lunar({ apiKey: "your-api-key" });
```
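A common pattern is to prefer an explicitly passed key and fall back to the environment variable. The `resolve_api_key` helper below is a sketch, not part of the SDK:

```python
import os

def resolve_api_key(explicit=None):
    # Prefer an explicitly passed key; fall back to the environment variable.
    key = explicit or os.environ.get("LUNAR_API_KEY")
    if not key:
        raise RuntimeError("Set LUNAR_API_KEY or pass api_key explicitly.")
    return key

os.environ["LUNAR_API_KEY"] = "env-key"
print(resolve_api_key())          # env-key
print(resolve_api_key("direct"))  # direct
```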
## 3. Make Your First Request

### Chat Completion
**Python**

```python
from lunar import Lunar

client = Lunar()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(response.choices[0].message.content)
# Output: The capital of France is Paris.
```
**TypeScript**

```typescript
import { Lunar } from "lunar";

const client = new Lunar();

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" },
  ],
});
console.log(response.choices[0].message.content);
// Output: The capital of France is Paris.
```
### Text Completion
**Python**

```python
response = client.completions.create(
    model="gpt-4o-mini",
    prompt="The capital of France is",
    max_tokens=10
)
print(response.choices[0].text)
# Output: Paris, which is also...
```
**TypeScript**

```typescript
const response = await client.completions.create({
  model: "gpt-4o-mini",
  prompt: "The capital of France is",
  max_tokens: 10,
});
console.log(response.choices[0].text);
// Output: Paris, which is also...
```
## 4. Check Cost and Usage
Every response includes detailed usage information:
**Python**

```python
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Token counts
print(f"Prompt tokens: {response.usage.prompt_tokens}")
print(f"Completion tokens: {response.usage.completion_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")

# Cost breakdown
print(f"Input cost: ${response.usage.input_cost_usd}")
print(f"Output cost: ${response.usage.output_cost_usd}")
print(f"Total cost: ${response.usage.total_cost_usd}")

# Performance
print(f"Latency: {response.usage.latency_ms}ms")
```
**TypeScript**

```typescript
const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});

// Token counts
console.log(`Prompt tokens: ${response.usage?.prompt_tokens}`);
console.log(`Completion tokens: ${response.usage?.completion_tokens}`);
console.log(`Total tokens: ${response.usage?.total_tokens}`);

// Cost breakdown
console.log(`Input cost: $${response.usage?.input_cost_usd}`);
console.log(`Output cost: $${response.usage?.output_cost_usd}`);
console.log(`Total cost: $${response.usage?.total_cost_usd}`);

// Performance
console.log(`Latency: ${response.usage?.latency_ms}ms`);
```
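The per-response usage fields make it easy to keep a running total across many calls. A minimal sketch — the `CostTracker` helper and the stub `Usage` object are illustrative, not part of the SDK; in real code `usage` would be `response.usage` from each call:

```python
from dataclasses import dataclass

@dataclass
class Usage:
    total_tokens: int
    total_cost_usd: float

class CostTracker:
    """Accumulates token and cost totals across responses."""
    def __init__(self):
        self.total_tokens = 0
        self.total_cost_usd = 0.0

    def record(self, usage):
        self.total_tokens += usage.total_tokens
        self.total_cost_usd += usage.total_cost_usd

tracker = CostTracker()
tracker.record(Usage(total_tokens=25, total_cost_usd=0.00001))
tracker.record(Usage(total_tokens=40, total_cost_usd=0.00002))
print(tracker.total_tokens)  # 65
print(tracker.total_cost_usd)
```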
## 5. Use Fallbacks
Add fallback models to automatically retry if the primary model fails:
**Python**

```python
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    fallbacks=["claude-3-haiku", "llama-3.1-8b"]
)
```
**TypeScript**

```typescript
const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
  fallbacks: ["claude-3-haiku", "llama-3.1-8b"],
});
```
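Conceptually, `fallbacks` behaves like a client-side loop that tries each model in order until one succeeds. The sketch below illustrates that behavior only; `complete_with_fallbacks` and `fake_call` are stand-ins, not SDK functions:

```python
def complete_with_fallbacks(call_model, models, messages):
    """Try each model in order; return the first successful response."""
    last_error = None
    for model in models:
        try:
            return call_model(model, messages)
        except Exception as exc:
            last_error = exc  # remember the failure and try the next model
    raise last_error

# Demo with a stub in which the primary model is unavailable:
def fake_call(model, messages):
    if model == "gpt-4o-mini":
        raise RuntimeError("primary model unavailable")
    return f"answer from {model}"

result = complete_with_fallbacks(
    fake_call,
    ["gpt-4o-mini", "claude-3-haiku", "llama-3.1-8b"],
    [{"role": "user", "content": "Hello!"}],
)
print(result)  # answer from claude-3-haiku
```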
## 6. Streaming
Stream responses token by token:
**Python**

```python
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
**TypeScript**

```typescript
const stream = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
});

for await (const chunk of stream) {
  if (chunk.choices[0].delta.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}
```
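If you need the complete text after streaming finishes, accumulate the deltas as they arrive. A sketch using stand-in chunk objects shaped like the chunks above (`make_chunk` is illustrative, not part of the SDK):

```python
from types import SimpleNamespace

def make_chunk(text):
    # Mimics the chunk shape used above: chunk.choices[0].delta.content
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

# A final chunk often carries no content (delta.content is None).
stream = [make_chunk("Hel"), make_chunk("lo"), make_chunk("!"), make_chunk(None)]

parts = []
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        parts.append(content)

full_text = "".join(parts)
print(full_text)  # Hello!
```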
## 7. Async Usage (Python)

For async Python applications, use `AsyncLunar`:
```python
import asyncio

from lunar import AsyncLunar

async def main():
    async with AsyncLunar() as client:
        response = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Hello!"}]
        )
        print(response.choices[0].message.content)

asyncio.run(main())
```
The TypeScript SDK is async by default — all methods return Promises. No separate async client is needed.
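The main benefit of the async client is issuing requests concurrently with `asyncio.gather`. In this sketch, the `fake_request` coroutine stands in for an awaited `client.chat.completions.create(...)` call:

```python
import asyncio

async def fake_request(prompt):
    # Stand-in for an awaited client.chat.completions.create(...) call.
    await asyncio.sleep(0.01)
    return f"reply to {prompt!r}"

async def main():
    prompts = ["Hello!", "What is 2 + 2?", "Name a French city."]
    # All three requests run concurrently; results keep the input order.
    return await asyncio.gather(*(fake_request(p) for p in prompts))

replies = asyncio.run(main())
print(replies[0])  # reply to 'Hello!'
```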
## Next Steps