Now in private beta · 2026

The drop-in LLM router for your stack

Use Dari's OpenAI-compatible router with your existing SDK. Swap the base URL and route every request by cost, latency, quality, or policy — no rewrites, no lock-in.

Request access · See how it works

1 line to integrate
100+ models routed
~12 ms routing overhead

app.ts

import OpenAI from "openai";

const client = new OpenAI({
  // The only change you make:
  baseURL: "https://api.usedari.co/v1",
  apiKey: process.env.DARI_API_KEY,
});

const res = await client.chat.completions.create({
  model: "auto",          // let Dari pick
  messages: [{ role: "user", content: "Hi" }],
  // route by your own rules:
  metadata: { route: "cheapest", max_latency_ms: 800 },
});

One key. Routed across every major provider.

OpenAI · Anthropic · Google Vertex · AWS Bedrock · Groq · Fireworks · Mistral · Cohere

// capabilities

One endpoint. Every routing strategy you need.

Dari sits between your app and the model providers, making smart decisions on every request based on the rules you care about.

Truly drop-in

OpenAI-compatible API. Change one base URL and your existing SDK calls just work — Python, TS, or curl.

Route by cost

Automatically send each request to the cheapest model that meets your quality bar. Cut spend without touching code.
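
To make "cheapest model that meets your quality bar" concrete, here is a minimal sketch of that selection rule. It is an illustration only, not Dari's implementation: the `ModelOption` shape, the placeholder model names, the per-1K-token cost unit, and the 0 to 1 quality scale are all assumptions.

```typescript
// Hypothetical catalog entry; field names are illustrative, not Dari's schema.
interface ModelOption {
  model: string;
  costPer1k: number; // USD per 1K tokens (assumed unit)
  quality: number;   // 0..1 score from your own evals (assumed scale)
}

// Return the cheapest option whose quality meets the floor,
// or undefined when no option qualifies.
function cheapestAboveFloor(
  options: ModelOption[],
  qualityFloor: number,
): ModelOption | undefined {
  return options
    .filter((o) => o.quality >= qualityFloor)
    .reduce<ModelOption | undefined>(
      (best, o) => (best === undefined || o.costPer1k < best.costPer1k ? o : best),
      undefined,
    );
}

// Placeholder data for the example; not real prices or eval scores.
const catalog: ModelOption[] = [
  { model: "small-model", costPer1k: 0.0006, quality: 0.72 },
  { model: "medium-model", costPer1k: 0.003, quality: 0.85 },
  { model: "large-model", costPer1k: 0.015, quality: 0.95 },
];

const pick = cheapestAboveFloor(catalog, 0.8);
console.log(pick?.model); // "medium-model"
```

Raising the floor trades spend for quality. A floor that no model meets returns `undefined`, so the caller decides whether to relax the rule or reject the request.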

Route by latency

Set a latency budget per request. Dari picks the fastest healthy provider and fails over in milliseconds.
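
One way to picture a latency budget with failover is as an ordered plan: keep only healthy providers within the budget, then sort fastest first. The sketch below assumes a hypothetical `ProviderStatus` record and placeholder provider names. How Dari actually measures health and latency is not described here.

```typescript
// Hypothetical health snapshot; field names are assumptions.
interface ProviderStatus {
  provider: string;
  healthy: boolean;
  p50LatencyMs: number;
}

// Healthy providers within the budget, fastest first. The first entry is the
// primary choice; the remaining entries form the failover order.
function latencyPlan(statuses: ProviderStatus[], maxLatencyMs: number): string[] {
  return statuses
    .filter((s) => s.healthy && s.p50LatencyMs <= maxLatencyMs)
    .sort((a, b) => a.p50LatencyMs - b.p50LatencyMs)
    .map((s) => s.provider);
}

// Placeholder measurements for the example.
const snapshot: ProviderStatus[] = [
  { provider: "provider-a", healthy: true, p50LatencyMs: 210 },
  { provider: "provider-b", healthy: false, p50LatencyMs: 150 },
  { provider: "provider-c", healthy: true, p50LatencyMs: 640 },
  { provider: "provider-d", healthy: true, p50LatencyMs: 950 },
];

console.log(latencyPlan(snapshot, 800)); // [ 'provider-a', 'provider-c' ]
```

Here `provider-b` is skipped because it is unhealthy even though it is the fastest, and `provider-d` is skipped because it exceeds the 800 ms budget.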

Route by quality

Define eval-backed quality tiers and let Dari upgrade hard prompts to stronger models automatically.

Route by policy

Enforce data residency, PII rules, and approved-provider lists with policies that travel with every call.
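
A policy can be thought of as a filter applied before any cost or latency decision. The sketch below covers only approved-provider lists and data residency; PII handling is out of scope. The `Policy` and `Endpoint` shapes, provider names, and region names are hypothetical, not Dari's policy format.

```typescript
// Hypothetical policy and endpoint shapes, for illustration only.
interface Policy {
  approvedProviders: string[];
  allowedRegions: string[];
}

interface Endpoint {
  provider: string;
  region: string;
}

// Keep only endpoints that satisfy both the provider list and the residency rule.
function allowedEndpoints(endpoints: Endpoint[], policy: Policy): Endpoint[] {
  return endpoints.filter(
    (e) =>
      policy.approvedProviders.includes(e.provider) &&
      policy.allowedRegions.includes(e.region),
  );
}

// Placeholder endpoints and an EU-only policy for the example.
const endpoints: Endpoint[] = [
  { provider: "provider-a", region: "eu-west" },
  { provider: "provider-a", region: "us-east" },
  { provider: "provider-b", region: "eu-west" },
  { provider: "provider-c", region: "eu-west" },
];

const euOnly: Policy = {
  approvedProviders: ["provider-a", "provider-b"],
  allowedRegions: ["eu-west"],
};

console.log(allowedEndpoints(endpoints, euOnly).map((e) => `${e.provider}/${e.region}`));
// [ 'provider-a/eu-west', 'provider-b/eu-west' ]
```

Filtering first means later routing steps can only ever choose among endpoints the policy already permits.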

Observability built in

Per-request traces, spend, and latency across every model in one dashboard. Export to your stack.
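
If you export traces to your own stack, a per-model rollup of spend and median latency takes only a few lines. This sketch assumes a hypothetical exported `Trace` record with `model`, `costUsd`, and `latencyMs` fields; the actual export format is not specified here.

```typescript
// Hypothetical exported trace record; field names are assumptions.
interface Trace {
  model: string;
  costUsd: number;
  latencyMs: number;
}

interface ModelSummary {
  requests: number;
  spendUsd: number;
  p50LatencyMs: number;
}

// Median of a non-empty list of numbers.
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Group traces by model, then total the spend and take the median latency.
function summarize(traces: Trace[]): Record<string, ModelSummary> {
  const grouped: Record<string, Trace[]> = {};
  for (const t of traces) {
    (grouped[t.model] ??= []).push(t);
  }
  const summary: Record<string, ModelSummary> = {};
  for (const model of Object.keys(grouped)) {
    const group = grouped[model];
    summary[model] = {
      requests: group.length,
      spendUsd: group.reduce((sum, t) => sum + t.costUsd, 0),
      p50LatencyMs: median(group.map((t) => t.latencyMs)),
    };
  }
  return summary;
}

// Placeholder traces for the example.
const traces: Trace[] = [
  { model: "model-a", costUsd: 0.001, latencyMs: 200 },
  { model: "model-a", costUsd: 0.002, latencyMs: 300 },
  { model: "model-a", costUsd: 0.003, latencyMs: 250 },
  { model: "model-b", costUsd: 0.01, latencyMs: 900 },
  { model: "model-b", costUsd: 0.02, latencyMs: 700 },
];

console.log(summarize(traces));
```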

// how it works

From import to intelligent routing in minutes

01
Swap the base URL
Point your OpenAI client at api.usedari.co/v1 and drop in your Dari key. Nothing else changes.
02
Declare your rules
Set routing intent per request or per project — cheapest, fastest, highest quality, or policy-constrained.
03
Dari routes & fails over
Each call is scored against live model health, price, and latency, then sent to the best provider with automatic fallback.
04
Watch it in the dashboard
Trace every request, compare models, and see exactly where your spend and latency go in real time.
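
Step 03 above can be sketched as two small functions: rank healthy candidates by a weighted score, then try them in order until one succeeds. Everything here is an assumption made for illustration: the `Candidate` shape, the normalized cost and latency score, the weights, and the placeholder model names. Dari's actual scoring is not documented on this page.

```typescript
// Hypothetical candidate record; field names are assumptions.
interface Candidate {
  model: string;
  healthy: boolean;
  costPer1k: number;    // USD per 1K tokens (assumed unit)
  p50LatencyMs: number;
}

// Rank healthy candidates by a weighted sum of normalized cost and latency.
// Lower score is better. The weights are illustrative.
function rank(
  candidates: Candidate[],
  weights: { cost: number; latency: number },
): Candidate[] {
  const healthy = candidates.filter((c) => c.healthy);
  const maxCost = Math.max(...healthy.map((c) => c.costPer1k)) || 1;
  const maxLatency = Math.max(...healthy.map((c) => c.p50LatencyMs)) || 1;
  const score = (c: Candidate): number =>
    weights.cost * (c.costPer1k / maxCost) +
    weights.latency * (c.p50LatencyMs / maxLatency);
  return healthy.sort((a, b) => score(a) - score(b));
}

// Try each ranked candidate in turn and return the first successful result.
async function callWithFallback<T>(
  ranked: Candidate[],
  call: (model: string) => Promise<T>,
): Promise<{ model: string; result: T }> {
  let lastError: unknown = new Error("no healthy candidates");
  for (const c of ranked) {
    try {
      return { model: c.model, result: await call(c.model) };
    } catch (err) {
      lastError = err; // remember the failure and move on to the next candidate
    }
  }
  throw lastError;
}

// Placeholder candidates for the example.
const candidates: Candidate[] = [
  { model: "model-a", healthy: true, costPer1k: 0.0006, p50LatencyMs: 210 },
  { model: "model-b", healthy: true, costPer1k: 0.003, p50LatencyMs: 150 },
  { model: "model-c", healthy: false, costPer1k: 0.0001, p50LatencyMs: 50 },
  { model: "model-d", healthy: true, costPer1k: 0.015, p50LatencyMs: 900 },
];

const ranked = rank(candidates, { cost: 0.5, latency: 0.5 });
console.log(ranked.map((c) => c.model)); // [ 'model-a', 'model-b', 'model-d' ]
```

Passing the provider call in as a function keeps the fallback loop independent of any particular SDK, so it can be exercised with a stub that fails on demand.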

// live routing

Pick an intent. See where Dari sends it.

The same prompt, routed four different ways. Change the strategy and watch the selected model update.

POST /v1/chat/completions · routed

Selected model: llama-3.3-70b
Provider: Groq
Est. cost: $0.0006 / 1K
p50 latency: 210 ms

Cheapest model meeting your quality floor.

// request access

Join the Dari private beta

Access is whitelist-only during the beta. Drop your work email and we'll approve teams in batches. Approved accounts unlock login and checkout.

No spam. We only email about your access status.